Microsoft Research · Healthcare
BioGPT
A generative pre-trained transformer model trained on large-scale biomedical literature for biomedical text generation and mining.
Overview
BioGPT is a domain-specific generative language model pre-trained from scratch on 15 million PubMed abstracts. It achieves state-of-the-art results on several biomedical NLP benchmarks, including relation extraction, question answering, and document classification. The model generates fluent biomedical text and is particularly effective at extracting structured knowledge from scientific literature.
Parameters
1.5B
Architecture
GPT-2 style autoregressive transformer
Training Data
15M PubMed abstracts
Context Window
1024 tokens
License
MIT
Capabilities
Biomedical text generation
Relation extraction from scientific papers
Biomedical question answering
Document classification
Drug-disease relation identification
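The text-generation capability above can be exercised through the Hugging Face `transformers` library, which ships dedicated `BioGptTokenizer` and `BioGptForCausalLM` classes (model id `microsoft/biogpt`; the 1.5B variant is `microsoft/BioGPT-Large`). A minimal sketch, where the prompt template in `build_prompt` is an illustrative assumption rather than a format prescribed by BioGPT:

```python
def build_prompt(entity: str) -> str:
    # Illustrative free-text continuation prompt; BioGPT does not
    # mandate a particular prompt format for open-ended generation.
    return f"{entity} is"

def generate(prompt: str, model_name: str = "microsoft/biogpt",
             max_new_tokens: int = 40) -> str:
    # Imported lazily so the helper above stays usable without
    # downloading model weights.
    from transformers import BioGptForCausalLM, BioGptTokenizer

    tokenizer = BioGptTokenizer.from_pretrained(model_name)
    model = BioGptForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate(build_prompt("Metformin")))
```

Swapping in `microsoft/BioGPT-Large` trades higher memory use for the 1.5B model's stronger generation quality.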
Use Cases
Mining drug interactions from published literature
Generating summaries of biomedical research papers
Answering clinical questions based on medical evidence
Identifying potential drug repurposing candidates from literature
Pros
- Strong biomedical text generation capabilities
- State-of-the-art results on multiple biomedical NLP benchmarks
- Open-source under a permissive MIT license
- Effective knowledge extraction from scientific literature
Cons
- Limited to biomedical domain knowledge
- Smaller context window than modern LLMs
- May generate plausible but factually incorrect biomedical claims
- Requires significant compute for inference at 1.5B parameters
Pricing
Free and open-source under the MIT license. Deployment is self-hosted; cloud GPU costs run approximately $0.50-$2.00/hour depending on provider and instance type.
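The self-hosting cost can be estimated with back-of-the-envelope arithmetic. A sketch, assuming fp16 weights (2 bytes per parameter) and the hourly rates quoted above; the 8 hours/day usage pattern is a hypothetical workload, not a recommendation:

```python
PARAMS = 1.5e9        # BioGPT-Large parameter count
BYTES_PER_PARAM = 2   # fp16 weights (assumption; fp32 would double this)

# Raw weight memory, before activations and KV cache overhead.
weight_gib = PARAMS * BYTES_PER_PARAM / 1024**3

def monthly_cost(hours_per_day: float, rate_per_hour: float,
                 days: int = 30) -> float:
    """Cloud GPU cost for an intermittent (on-demand) workload."""
    return hours_per_day * rate_per_hour * days

low = monthly_cost(8, 0.50)   # 8 h/day at the low end of the quoted range
high = monthly_cost(8, 2.00)  # 8 h/day at the high end

print(f"fp16 weights: ~{weight_gib:.1f} GiB")
print(f"monthly GPU cost (8 h/day): ${low:.0f}-${high:.0f}")
```

At roughly 3 GiB of fp16 weights, the model fits comfortably on a single consumer-class GPU, which keeps the hourly rate at the low end of the quoted range.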