Bloomberg · Finance
BloombergGPT
A 50-billion parameter language model purpose-built for finance, trained on Bloomberg's proprietary financial data alongside general-purpose datasets.
Overview
BloombergGPT is a large language model built specifically for the financial domain. Its training corpus combines FinPile, a 363-billion-token archive of Bloomberg's proprietary financial data, with 345 billion tokens of general-purpose text. The model outperforms existing open models of similar size on financial NLP benchmarks while remaining competitive on general language tasks, demonstrating the value of domain-specific training at scale.
Parameters
50B
Training Data
~708B tokens (363B FinPile + 345B general corpus)
Architecture
Decoder-only transformer
FinPile Size
363B tokens of financial data
Training Duration
~53 days on 512 A100 40GB GPUs
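Taken together, the published figures imply a near-even split between financial and general-purpose text. A quick back-of-the-envelope check in Python, using only the numbers from the stats above:

```python
# Back-of-the-envelope check of the training-corpus composition,
# using the figures reported for BloombergGPT.
finpile_tokens = 363e9   # FinPile: Bloomberg's proprietary financial data
general_tokens = 345e9   # general-purpose datasets

total = finpile_tokens + general_tokens
print(f"Total corpus: {total / 1e9:.0f}B tokens")        # ~708B
print(f"Financial share: {finpile_tokens / total:.1%}")  # ~51.3%
print(f"General share: {general_tokens / total:.1%}")    # ~48.7%
```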
Capabilities
Financial sentiment analysis
Financial named entity recognition
News headline classification
Financial question answering
Financial document summarization
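BloombergGPT itself is not publicly accessible, so these capabilities cannot be exercised directly. As a hypothetical illustration only, the sketch below shows the few-shot prompting pattern used to evaluate decoder-only models on tasks like financial sentiment classification, with an open BLOOM checkpoint standing in, since BloombergGPT's architecture is derived from BLOOM. The stand-in model name, prompt wording, and labels are illustrative assumptions.

```python
# Hypothetical sketch: few-shot financial sentiment classification with a
# decoder-only causal LM. BloombergGPT is not publicly available, so an
# open BLOOM checkpoint stands in; the prompting pattern is the point.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"  # stand-in, not BloombergGPT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Few-shot prompt: a handful of labeled examples, then the query headline.
prompt = (
    "Classify the sentiment of each financial headline as positive, "
    "negative, or neutral.\n\n"
    "Headline: Acme Corp beats earnings estimates, raises guidance.\n"
    "Sentiment: positive\n\n"
    "Headline: Regulator opens probe into Acme Corp accounting.\n"
    "Sentiment: negative\n\n"
    "Headline: Acme Corp to report Q3 results on Thursday.\n"
    "Sentiment: neutral\n\n"
    "Headline: Acme Corp shares plunge after CFO resignation.\n"
    "Sentiment:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False)
# Decode only the newly generated tokens, i.e. the predicted label.
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(completion.strip())  # expected: "negative"
```

A small stand-in model will not match BloombergGPT's reported accuracy; the point is the few-shot prompting pattern, not the result.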
Use Cases
Analyzing market sentiment from news articles and social media
Extracting financial entities from earnings reports and filings
Classifying financial news for real-time trading signals
Automating financial research report summarization
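As a sketch of the trading-signal use case, and assuming per-headline labels come from a classifier like the one above, a naive aggregation into a signal score might look like this. The scoring scheme and window are invented for illustration and are not Bloomberg's method:

```python
# Hypothetical sketch: aggregating per-headline sentiment labels into a
# naive signal score in [-1, 1]. Scores and window size are assumptions.
SCORES = {"positive": 1, "neutral": 0, "negative": -1}

def sentiment_signal(labels: list[str]) -> float:
    """Mean sentiment score over a window of classified headlines."""
    if not labels:
        return 0.0
    return sum(SCORES.get(label, 0) for label in labels) / len(labels)

window = ["positive", "negative", "negative", "neutral", "negative"]
print(f"Signal: {sentiment_signal(window):+.2f}")  # -0.40, net-negative flow
```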
Pros
- Largest financial-domain LLM at the time of its release
- Trained on Bloomberg's unmatched proprietary financial data archive
- Outperforms comparably sized open models on financial NLP benchmarks
- Maintains competitive general-purpose language capabilities
Cons
- Not publicly accessible or open-source
- Cannot be deployed or fine-tuned by external organizations
- Extremely expensive to train and replicate
- Model weights and training data are proprietary
Pricing
Not publicly available. Access is limited to Bloomberg internal use and select research partnerships. Bloomberg Terminal subscribers may see integrated features.