Microsoft Research · Healthcare

BioGPT

A generative pre-trained transformer model trained on large-scale biomedical literature for biomedical text generation and mining.

Overview

BioGPT is a domain-specific generative language model pre-trained on 15 million PubMed abstracts. It achieves state-of-the-art results on several biomedical NLP benchmarks including relation extraction, question answering, and document classification. The model can generate fluent biomedical text and is particularly effective at extracting structured knowledge from scientific literature.
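A minimal usage sketch via the Hugging Face `transformers` port of BioGPT (the `microsoft/biogpt` checkpoint id refers to the base model; the helper function, prompt, and generation settings here are illustrative, not part of this card):

```python
from transformers import BioGptTokenizer, BioGptForCausalLM


def generate_biomedical_text(prompt, model_name="microsoft/biogpt", max_new_tokens=50):
    """Generate a biomedical continuation of `prompt` with beam search.

    Downloads the checkpoint on first use; beam search (num_beams=5)
    follows the decoding setup commonly used with BioGPT.
    """
    tokenizer = BioGptTokenizer.from_pretrained(model_name)
    model = BioGptForCausalLM.from_pretrained(model_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=5)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_biomedical_text("COVID-19 is"))
```

Swapping `model_name` for `microsoft/BioGPT-Large` selects the 1.5B-parameter variant at the cost of a larger download and higher memory use.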

Parameters

1.5B (BioGPT-Large; the base BioGPT model has 347M parameters)

Architecture

GPT-2 style autoregressive transformer

Training Data

15M PubMed abstracts

Context Window

1024 tokens

License

MIT

Capabilities

Biomedical text generation

Relation extraction from scientific papers

Biomedical question answering

Document classification

Drug-disease relation identification

Use Cases

Mining drug interactions from published literature

Generating summaries of biomedical research papers

Answering clinical questions based on medical evidence

Identifying potential drug repurposing candidates from literature

Pros

  • Strong biomedical text generation capabilities
  • State-of-the-art on multiple biomedical NLP benchmarks
  • Open-source with permissive MIT license
  • Effective knowledge extraction from scientific literature

Cons

  • Limited to biomedical domain knowledge
  • Smaller context window compared to modern LLMs
  • May generate plausible but factually incorrect biomedical claims
  • Requires significant compute for inference at 1.5B parameters

Pricing

Free and open-source under the MIT license; deployment is self-hosted. Cloud GPU costs run approximately $0.50–$2.00/hour depending on the provider.
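For sizing a self-hosted deployment, inference memory can be estimated with back-of-the-envelope arithmetic. This sketch assumes fp16 weights (2 bytes per parameter) and a rough 20% overhead for activations and KV cache; actual usage varies with batch size and sequence length:

```python
def inference_vram_gb(n_params, bytes_per_param=2, overhead=1.2):
    """Estimate GPU memory for inference: fp16 weights (2 bytes/param)
    plus an assumed ~20% allowance for activations and KV cache."""
    return n_params * bytes_per_param * overhead / 1024**3


# BioGPT-Large at 1.5B parameters in fp16:
print(f"{inference_vram_gb(1.5e9):.1f} GB")  # ≈ 3.4 GB
```

At roughly 3–4 GB, the 1.5B model fits comfortably on common 16 GB cloud GPUs, consistent with the hourly rates quoted above.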

Related Models