AI21 Labs · General LLM
Jamba
A novel hybrid architecture model combining transformer and Mamba state-space model layers for efficient long-context processing.
Overview
Jamba by AI21 Labs introduces a hybrid architecture that interleaves transformer attention layers with Mamba structured state-space model (SSM) layers. Because only the attention layers maintain a key/value cache, Jamba reaches a 256K-token effective context window with a far smaller memory footprint than a pure transformer of comparable size. Jamba demonstrates that hybrid architectures can offer the best of both worlds: the strong in-context learning of transformers and the linear-time sequence processing of SSMs, enabling efficient handling of very long documents.
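The interleaving pattern can be illustrated with a small sketch. Per the Jamba technical report, layers are grouped into blocks of eight with a 1:7 attention-to-Mamba ratio, and a mixture-of-experts MLP replaces the dense MLP on every other layer; the exact position of the attention layer within a block and all names below are illustrative, not AI21's code.

```python
# Sketch of a Jamba-style layer schedule (illustrative only).
def jamba_layer_schedule(num_blocks=4, layers_per_block=8,
                         attn_every=8, moe_every=2):
    """Return (mixer, mlp) labels for each layer in the stack."""
    schedule = []
    for i in range(num_blocks * layers_per_block):
        # One attention layer per 8-layer block; the rest are Mamba layers.
        # (Placement within the block is illustrative.)
        mixer = "attention" if i % attn_every == attn_every // 2 else "mamba"
        # Mixture-of-experts MLP on every other layer, dense MLP otherwise.
        mlp = "moe" if i % moe_every == 1 else "dense"
        schedule.append((mixer, mlp))
    return schedule

for idx, (mixer, mlp) in enumerate(jamba_layer_schedule(num_blocks=1)):
    print(f"layer {idx}: {mixer:9s} + {mlp}")
```

With four blocks (32 layers), this yields 4 attention layers, 28 Mamba layers, and 16 MoE layers.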
Parameters
52B total, 12B active (MoE)
Context Window
256K tokens
Architecture
Hybrid Transformer-Mamba SSM + MoE
Memory Efficiency
~8x smaller KV cache than a comparable pure transformer at 256K context
License
Apache 2.0 (Jamba base)
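The memory saving follows directly from the layer mix: only attention layers keep a key/value cache that grows with sequence length, while Mamba layers carry a fixed-size recurrent state. A back-of-the-envelope sketch, assuming illustrative values (32 layers, 8 KV heads via grouped-query attention, head dimension 128, 16-bit weights) rather than AI21's exact configuration:

```python
def kv_cache_bytes(seq_len, attn_layers, kv_heads=8, head_dim=128,
                   dtype_bytes=2):
    """Bytes of K+V cache held across the given attention layers."""
    # Factor of 2 covers the separate K and V tensors per layer.
    return 2 * attn_layers * kv_heads * head_dim * seq_len * dtype_bytes

seq = 256_000  # 256K-token context
full = kv_cache_bytes(seq, attn_layers=32)   # pure-transformer baseline
hybrid = kv_cache_bytes(seq, attn_layers=4)  # 1-in-8 layers use attention
print(f"pure transformer:     {full / 2**30:.1f} GiB")
print(f"hybrid (Jamba-style): {hybrid / 2**30:.1f} GiB")
print(f"reduction: {full / hybrid:.0f}x")
```

Replacing 7 of every 8 attention layers with Mamba layers shrinks the cache by the same 8x factor, which is what makes 256K contexts fit on a single node.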
Capabilities
Ultra-long context processing up to 256K tokens
Efficient memory usage through hybrid SSM-transformer architecture
General-purpose text generation and reasoning
Document analysis across extremely long inputs
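The long-context efficiency rests on the SSM layers processing tokens in a single linear-time pass over a fixed-size state, rather than attending over all previous tokens. A toy scalar recurrence shows the shape of that computation; real Mamba layers use vector states and input-dependent (selective) parameters, so the constants below are purely illustrative.

```python
def ssm_scan(inputs, a=0.9, b=1.0, c=1.0):
    """Linear SSM recurrence: h[t] = a*h[t-1] + b*x[t], y[t] = c*h[t]."""
    h, outputs = 0.0, []
    for x in inputs:       # one step per token: O(n) total work
        h = a * h + b * x  # fixed-size state, regardless of history length
        outputs.append(c * h)
    return outputs

ys = ssm_scan([1.0, 0.0, 0.0, 0.0])
# Impulse response decays geometrically: 1.0, 0.9, 0.81, ...
```

Because the state size is constant, doubling the sequence length only doubles the work, whereas full attention quadruples it.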
Use Cases
Processing and analyzing extremely long documents and codebases
Building AI applications requiring efficient long-context handling
Research into hybrid transformer-SSM architectures
Enterprise document analysis with extensive context requirements
Pros
- Novel hybrid architecture achieves efficient long-context processing
- 256K context window with manageable memory requirements
- Open-source base model enables research and customization
- MoE design keeps active parameters efficient for inference
Cons
- Newer architecture with less community tooling and optimization
- Hybrid approach adds complexity for deployment and fine-tuning
- Benchmark performance may trail pure transformer models on some tasks
- Limited ecosystem support compared to standard transformer models
Pricing
The Jamba base model is free under the Apache 2.0 license. The AI21 API offers Jamba Instruct with usage-based pricing, and enterprise plans are available for high-volume use.