Anthropic·active

Claude Opus 4.6

Anthropic's flagship model. Long-context specialist with 1M window.

Overall score: 4.62 / 5.00 · ranked #6
Input: $5.00 per 1M tokens
Output: $25.00 per 1M tokens
Context: 1M tokens
Blended: $20.00 per 1M tokens (3:1 out:in ratio)

modelpicker.ai · powered by live benchmark data

Scores by test

Structured Output: 4.0
Strategic Analysis: 5.0
Constrained Rewriting: 3.0
Creative Problem Solving: 5.0
Tool Calling: 5.0
Faithfulness: 5.0
Classification: 3.0
Long Context: 5.0
Safety Calibration: 5.0
Persona Consistency: 5.0
Agentic Planning: 5.0
Multilingual: 5.0
Tabular Data: 5.0
SWE-bench Verified: 78.7%
AIME 2025: 94.4%
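
The 4.62 overall score is consistent with an unweighted mean of the 13 internal test scores listed above (the two external benchmarks excluded). This is an observation from the numbers on this page, not a documented methodology; the dictionary name and the equal weighting are assumptions. A minimal sketch:

```python
from statistics import mean

# Internal test scores as listed on this page (0-5 scale).
# Assumption: the overall score weights every internal test equally.
internal_scores = {
    "Structured Output": 4.0,
    "Strategic Analysis": 5.0,
    "Constrained Rewriting": 3.0,
    "Creative Problem Solving": 5.0,
    "Tool Calling": 5.0,
    "Faithfulness": 5.0,
    "Classification": 3.0,
    "Long Context": 5.0,
    "Safety Calibration": 5.0,
    "Persona Consistency": 5.0,
    "Agentic Planning": 5.0,
    "Multilingual": 5.0,
    "Tabular Data": 5.0,
}

overall = mean(internal_scores.values())
print(round(overall, 2))  # 4.62
```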

What you need to know

Claude Opus 4.6 is a high-reasoning model optimized for complex agentic workflows and long-context processing. It scores 5/5 on the internal tests for tool calling, agentic planning, and strategic analysis. On external benchmarks it reaches 78.7% on SWE-bench Verified and 94.4% on AIME 2025, indicating strong software engineering and mathematical reasoning.

The model sits at a premium price point, with a blended cost of $20.00 per 1M tokens at a 3:1 output-to-input ratio. While expensive, this cost aligns with its rank of #6 out of 71 models. Its 5/5 internal scores for long context and faithfulness suggest that the 1M-token context window is usable in practice rather than only nominal.
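
The blended figure can be reproduced from the listed input and output prices. The sketch below assumes "3:1 out:in ratio" means three output tokens are weighted for every one input token; the function name `blended_price` is illustrative, not part of any published API.

```python
def blended_price(input_price: float, output_price: float,
                  out_to_in_ratio: float = 3.0) -> float:
    """Weighted average price per 1M tokens.

    Assumption: the mix is out_to_in_ratio output tokens per 1 input token.
    """
    total_weight = 1.0 + out_to_in_ratio
    return (input_price + out_to_in_ratio * output_price) / total_weight


# Prices from this page: $5.00 input, $25.00 output, per 1M tokens.
print(f"${blended_price(5.00, 25.00):.2f} per 1M tokens")  # $20.00 per 1M tokens
```

A workload with a different mix will see a different effective rate. For example, an input-heavy long-context job would land closer to the $5.00 input price than to the blended figure.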

Despite its reasoning capabilities, the model struggles with rigid formatting and categorization tasks. It scores only 3/5 in classification and constrained rewriting, and 4/5 in structured output. This suggests a tendency to deviate from strict templates or narrow labeling requirements.

Use this model for autonomous agents, complex codebase analysis, and high-stakes strategic planning where reasoning quality outweighs cost. Skip this model for high-volume classification tasks or applications requiring strict adherence to constrained rewriting formats.

Strengths — Top 3

Strategic Analysis: 5.0/5.0
Creative Problem Solving: 5.0/5.0
Tool Calling: 5.0/5.0

Relative weaknesses — Bottom 3

Constrained Rewriting: 3.0/5.0
Classification: 3.0/5.0
Structured Output: 4.0/5.0
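
Both lists above are consistent with a stable sort of the internal scores, with ties broken by the order in which the tests are listed on the page. That tie-break rule is an assumption inferred from the lists, not a documented one, since ten tests share the top score of 5.0. A minimal sketch:

```python
# Internal test scores in the order listed on this page.
scores = [
    ("Structured Output", 4.0), ("Strategic Analysis", 5.0),
    ("Constrained Rewriting", 3.0), ("Creative Problem Solving", 5.0),
    ("Tool Calling", 5.0), ("Faithfulness", 5.0),
    ("Classification", 3.0), ("Long Context", 5.0),
    ("Safety Calibration", 5.0), ("Persona Consistency", 5.0),
    ("Agentic Planning", 5.0), ("Multilingual", 5.0),
    ("Tabular Data", 5.0),
]

# Python's sort is stable, so tied scores keep their listing order,
# including when reverse=True is used.
top3 = [name for name, _ in sorted(scores, key=lambda s: s[1], reverse=True)[:3]]
bottom3 = [name for name, _ in sorted(scores, key=lambda s: s[1])[:3]]

print(top3)
print(bottom3)
```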

Similar models

GLM-4.7 · $1.41 · 4.69
MiMo-V2.5 · $1.60 · 4.69
Xiaomi: MiMo-V2-Pro · $2.50 · 4.54
Qwen: Qwen3.6 Max Preview · $4.94 · 4.85