AI21 Labs · General LLM

Jamba

A hybrid-architecture model that combines transformer and Mamba state-space model (SSM) layers for efficient long-context processing.

Overview

Jamba by AI21 Labs interleaves transformer attention layers with Mamba structured state-space model (SSM) layers and uses mixture-of-experts (MoE) feed-forward layers. Because only a fraction of the layers use attention, the key-value (KV) cache is much smaller than in a pure transformer of similar depth, which makes a 256K-token context window practical. The design aims to combine the strong in-context learning of transformers with the linear-time sequence processing of SSMs, so that very long documents can be handled efficiently.
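
To make the interleaving concrete, the sketch below builds a per-layer layout for a hybrid stack. The function name and every default (32 layers, one attention layer in every 8, an MoE feed-forward in every second layer) are assumptions chosen for illustration, not values confirmed by this page.

```python
from collections import Counter

def hybrid_layout(num_layers=32, attn_period=8, attn_offset=4,
                  moe_period=2, moe_offset=1):
    """Return a (mixer, mlp) pair for each layer index.

    All defaults are illustrative assumptions: one attention layer per
    `attn_period` layers (the rest are Mamba), and an MoE feed-forward
    block once per `moe_period` layers (the rest are dense).
    """
    layout = []
    for i in range(num_layers):
        mixer = "attention" if i % attn_period == attn_offset else "mamba"
        mlp = "moe" if i % moe_period == moe_offset else "dense"
        layout.append((mixer, mlp))
    return layout

layout = hybrid_layout()
mixer_counts = Counter(mixer for mixer, _ in layout)
mlp_counts = Counter(mlp for _, mlp in layout)
print(mixer_counts)  # how many layers use attention vs. Mamba
print(mlp_counts)    # how many layers use MoE vs. dense feed-forward
```

Changing `attn_period` shows the trade-off directly: fewer attention layers means less KV cache, at the cost of fewer layers that can attend over the full context.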

Parameters

52B total, 12B active (MoE)

Context Window

256K tokens

Architecture

Hybrid Transformer-Mamba SSM + MoE

Memory Efficiency

~8x smaller KV cache than a comparable pure transformer (as reported by AI21 for a 1:7 attention-to-Mamba layer ratio)

License

Apache 2.0 (Jamba base)
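
The Memory Efficiency entry above comes down to simple arithmetic: only attention layers store keys and values for every token, while Mamba layers keep a fixed-size state that does not grow with sequence length. The following is a back-of-the-envelope sketch; the head count, head dimension, layer counts, and 16-bit precision are all assumed values, so the printed ratio depends on them and on the baseline chosen.

```python
def kv_cache_bytes(attn_layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate KV-cache size: one key and one value vector per token,
    per KV head, per attention layer."""
    return 2 * attn_layers * kv_heads * head_dim * seq_len * bytes_per_value

SEQ_LEN = 256 * 1024   # 256K tokens
KV_HEADS = 8           # assumed (grouped-query attention)
HEAD_DIM = 128         # assumed
TOTAL_LAYERS = 32      # assumed depth of the stack
ATTN_LAYERS = 4        # assumed: one attention layer in every eight

GIB = 1024 ** 3
hybrid = kv_cache_bytes(ATTN_LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN)
pure = kv_cache_bytes(TOTAL_LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN)

print(f"hybrid stack:     {hybrid / GIB:.1f} GiB")
print(f"pure transformer: {pure / GIB:.1f} GiB")
print(f"ratio:            {pure / hybrid:.0f}x")
```

Under these assumptions the saving equals the total layer count divided by the number of attention layers, which is why the attention-to-Mamba ratio is the main lever on memory.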

Capabilities

Ultra-long context processing up to 256K tokens

Efficient memory usage through hybrid SSM-transformer architecture

General-purpose text generation and reasoning

Document analysis across extremely long inputs

Use Cases

Processing and analyzing extremely long documents and codebases

Building AI applications requiring efficient long-context handling

Research into hybrid transformer-SSM architectures

Enterprise document analysis with extensive context requirements

Pros

  • Novel hybrid architecture achieves efficient long-context processing
  • 256K context window with manageable memory requirements
  • Open-source base model enables research and customization
  • MoE design keeps active parameters efficient for inference
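
The gap between total and active parameters comes from expert routing: each token is sent to only a few experts, so only those experts' weights take part in the computation. Below is a generic top-k gating sketch in plain Python. It is not AI21's implementation, and the expert count, router scores, and `k=2` are made-up illustration values.

```python
import math

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    # Softmax over all experts (subtract the max for numerical stability).
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the k most probable experts for this token.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalize so the selected experts' weights sum to 1.
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

# Made-up router scores for one token over 8 hypothetical experts.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]
weights = top_k_route(logits, k=2)
print(weights)  # expert index -> mixing weight for the selected experts
```

Only the selected experts run for a given token, which is how a model can hold many parameters in total while using a smaller share per forward pass.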

Cons

  • Newer architecture with less community tooling and optimization
  • Hybrid approach adds complexity for deployment and fine-tuning
  • Benchmark performance may trail pure transformer models on some tasks
  • Limited ecosystem support compared to standard transformer models

Pricing

The Jamba base model is free to use under the Apache 2.0 license. The AI21 API offers Jamba Instruct with usage-based pricing. Enterprise plans are available for high-volume use.

Related Models