51 models. 12 benchmark categories. Updated nightly.
Browse by task, provider, or price
Structured Output
JSON schema compliance and format adherence
Strategic Analysis
Nuanced tradeoff reasoning with real numbers
Constrained Rewriting
Compression within hard character limits
Creative Problem Solving
Non-obvious, specific, feasible ideas
Tool Calling
Function selection, argument accuracy, sequencing
Faithfulness
Sticks to source material without hallucinating
Classification
Accurate categorization and routing
Long Context
Retrieval accuracy at 30K+ tokens
Cheapest Strong Models
Models with a Strong benchmark grade at the lowest output cost
Best Value Per Dollar
Highest benchmark score relative to output cost
Largest Context Windows
Models with the biggest context windows for processing long documents
Best AI Models in 2026
Top-performing models across all benchmark categories
Newest Models
Most recently added models with benchmark data