xAI
Grok 3
Grok 3 is xAI’s general-purpose flagship, positioned for enterprise workloads: data extraction, long-document analysis, structured pipelines, and multilingual applications. At $3/$15 per million tokens (input/output), it sits at the premium end of the market alongside Claude Sonnet 4.6, GPT-5.2, and GPT-5.4. Within xAI’s own lineup, Grok 3 is the mid-tier offering: above the $0.50-output Grok 3 Mini and Grok 4.1 Fast, and matching Grok 4 on price. Notably, in our benchmarks Grok 3 outscores its newer sibling Grok 4 (4.25 average vs. 4.08), making it the stronger choice among xAI models at this price point. Its context window is 131,072 tokens, enough for long documents or large codebases.
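For long-document work, it helps to sanity-check that a document fits the 131,072-token window before sending it. The sketch below uses the common ~4-characters-per-token heuristic, which is an approximation; exact counts require the model's own tokenizer.

```python
# Rough token-budget check for Grok 3's 131,072-token context window.
# The 4-chars-per-token ratio is a heuristic, not Grok's actual tokenizer.
CONTEXT_WINDOW = 131_072

def fits_in_context(text: str, reserved_for_output: int = 4_096) -> bool:
    """Estimate whether `text` plus an output budget fits in the window."""
    estimated_tokens = len(text) // 4  # heuristic: ~4 characters per token
    return estimated_tokens + reserved_for_output <= CONTEXT_WINDOW

# A 200,000-character document (~50K estimated tokens) fits comfortably:
print(fits_in_context("x" * 200_000))  # True
```

For production pipelines, swap the heuristic for a real tokenizer count before relying on this check.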
Performance
In our 12-test benchmark suite (scored 1–5), Grok 3 averages 4.25, ranking 15th of 52 tested models. Its top categories, all scoring 5/5, are strategic analysis, faithfulness, long context, structured output, multilingual, persona consistency, and agentic planning. These scores place it tied for first with multiple other top models on each dimension (e.g., tied for 1st with 14 others on agentic planning, with 32 others on faithfulness, and with 36 others on long context). On tool calling and classification it scores 4/5, in the upper half of tested models. Its two relative weaknesses are constrained rewriting (3/5, rank 31 of 53) and creative problem solving (3/5, rank 30 of 54), both in the lower half of the distribution. Safety calibration is 2/5 (rank 12 of 55, a score shared by 20 models), which is still above the median in this notably weak category across the field. The 131K context window supports the long-context score. No external benchmark data (SWE-bench Verified, MATH Level 5, AIME 2025) is available for Grok 3 in our dataset.
Pricing
Grok 3 costs $3.00 per million input tokens and $15.00 per million output tokens. For reference, a 1,000-word response is roughly 750 output tokens, so 1,000 such responses cost about $11.25 in output alone. At moderate API volume (10M output tokens/month) you’re spending $150; at 100M output tokens/month, $1,500. That places Grok 3 in the same output-cost tier as Claude Sonnet 4.6 ($15 output) and GPT-5.4 ($15 output), both of which score higher on average (4.67 and 4.58 respectively in our testing). Within xAI’s lineup, Grok 4 costs the same but scores lower (4.08 avg). If cost efficiency matters, Grok 4.20 delivers a 4.33 avg at $6/M output, a meaningful price cut for a modest score reduction. Grok 4.1 Fast cuts further to $0.50/M output at a 4.25 avg, matching Grok 3’s benchmark score at a fraction of the price.
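The arithmetic above is easy to reproduce for your own traffic mix. A minimal cost estimator using Grok 3's published rates ($3/M input, $15/M output):

```python
# Back-of-the-envelope API cost calculator for Grok 3.
# Rates are USD per million tokens, as published at the time of writing.
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for a given number of input/output tokens."""
    return (input_tokens / 1e6) * INPUT_RATE + (output_tokens / 1e6) * OUTPUT_RATE

# 1,000 responses of ~750 output tokens each (input cost ignored here):
print(round(cost_usd(0, 1_000 * 750), 2))  # 11.25

# 10M output tokens per month:
print(cost_usd(0, 10_000_000))  # 150.0
```

Plug in your actual input:output ratio; input tokens often dominate in long-document workloads even at the lower $3/M rate.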
[Chart: Pricing vs Performance, plotting output cost per million tokens (log scale) against average score across our 12 internal benchmarks]
Try It
from openai import OpenAI

# Grok 3 via OpenRouter's OpenAI-compatible endpoint
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

response = client.chat.completions.create(
    model="x-ai/grok-3",
    messages=[
        {"role": "user", "content": "Hello, Grok 3!"}
    ],
)
print(response.choices[0].message.content)

Recommendation
Grok 3 is a strong choice for developers building agentic pipelines, structured-output extraction systems, or multilingual applications where reliability and faithfulness to source material are critical. Its 5/5 scores on agentic planning, faithfulness, structured output, and multilingual make it a credible option for production workflows in these areas. At $15/M output, however, the value case is harder to make: Claude Sonnet 4.6 and GPT-5.2 score higher (4.67 avg) at the same price, and within xAI’s own lineup, Grok 4.1 Fast matches Grok 3’s benchmark average (4.25) at $0.50/M output, 30x cheaper. For users committed to the xAI ecosystem or the Grok API, Grok 3 is the better performer than Grok 4 at this price tier. For creative tasks requiring flexible writing or constrained rewriting, look elsewhere: 3/5 scores in those categories mean there are cheaper and better options.
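For the structured-output extraction use case, the usual pattern is to pin the schema in the system prompt and validate the reply before trusting it. The sketch below is illustrative: the field names, schema hint, and sample reply are our own assumptions, not real Grok 3 output.

```python
import json

# Hypothetical invoice-extraction schema (illustrative field names).
SCHEMA_HINT = (
    "Return ONLY a JSON object with keys: name (string), "
    "date (YYYY-MM-DD string), amount (number)."
)

def build_messages(document: str) -> list[dict]:
    """Assemble a chat request that asks the model for schema-bound JSON."""
    return [
        {"role": "system", "content": SCHEMA_HINT},
        {"role": "user", "content": f"Extract the invoice fields from:\n{document}"},
    ]

def parse_reply(reply: str) -> dict:
    """Validate the model's reply: must be JSON with all required keys."""
    data = json.loads(reply)  # raises on non-JSON output
    missing = {"name", "date", "amount"} - data.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

# Validate a hand-written sample reply (not real model output):
print(parse_reply('{"name": "Acme", "date": "2025-01-15", "amount": 99.5}'))
```

Passing `build_messages(...)` to the `client.chat.completions.create` call shown above and running the reply through `parse_reply` gives a cheap guardrail; retry the request when validation fails.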
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.