OpenAI·active

GPT-5.2

OpenAI's flagship model. Context window: 400K tokens.

Overall score: 4.69 / 5.00 (ranked #3)
Input: $1.75 per 1M tokens
Output: $14.00 per 1M tokens
Context: 400K tokens
Blended: $10.94 per 1M tokens (3:1 out:in ratio)


Scores by test

Structured Output: 4.0/5.0
Strategic Analysis: 5.0/5.0
Constrained Rewriting: 4.0/5.0
Creative Problem Solving: 5.0/5.0
Tool Calling: 4.0/5.0
Faithfulness: 5.0/5.0
Classification: 4.0/5.0
Long Context: 5.0/5.0
Safety Calibration: 5.0/5.0
Persona Consistency: 5.0/5.0
Agentic Planning: 5.0/5.0
Multilingual: 5.0/5.0
Tabular Data: 5.0/5.0
SWE-bench Verified: 73.8%
AIME 2025: 96.1%

What you need to know

GPT-5.2 is strongest on high-complexity reasoning tasks, scoring 5.0/5.0 in agentic planning, strategic analysis, and creative problem solving. Its results on AIME 2025 (96.1%) and SWE-bench Verified (73.8%) indicate a high ceiling for mathematical reasoning and software engineering automation. With a 400K-token context window and 5.0/5.0 scores for both long context and faithfulness, it is well suited to processing long inputs without losing coherence.

The pricing is steep: a blended cost of $10.94/MTok, with output tokens at $14.00/MTok, eight times the $1.75/MTok input price. This makes it one of the more expensive options available, so the cost is justifiable mainly for high-value outputs where accuracy is critical. While it ranks #3 overall, its relative weaknesses in structured output and classification (4.0/5.0 each) suggest it is less optimized for simple data extraction or rigid formatting than for complex reasoning.
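The blended figure can be reproduced from the listed input and output prices. The sketch below assumes "3:1 out:in ratio" means three output tokens for every input token, weighted as a simple average; the function names `blended_price` and `request_cost` are illustrative and not part of any published API.

```python
def blended_price(input_price: float, output_price: float, out_to_in: float = 3.0) -> float:
    """Weighted average price per 1M tokens.

    Assumes `out_to_in` output tokens are generated per input token
    (an interpretation of the page's "3:1 out:in ratio").
    """
    return (input_price + out_to_in * output_price) / (1.0 + out_to_in)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = 1.75, output_price: float = 14.00) -> float:
    """Dollar cost of one request, given prices per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# (1.75 + 3 * 14.00) / 4 = 10.9375, which rounds to the listed $10.94
print(round(blended_price(1.75, 14.00), 2))

# Hypothetical workload: 100K input tokens, 2K output tokens
print(round(request_cost(100_000, 2_000), 3))
```

Under this reading, input-heavy workloads (long documents, short answers) cost well below the blended rate per token, while generation-heavy workloads approach the $14.00 output price.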

Use this model for autonomous agents, sophisticated codebase migrations, or strategic planning where failure costs are high. Skip this model for high-volume classification tasks, simple rewriting, or budget-constrained projects where a cheaper, specialized model can handle structured data.

Strengths — Top 3

Strategic Analysis: 5.0/5.0
Creative Problem Solving: 5.0/5.0
Faithfulness: 5.0/5.0

Relative weaknesses — Bottom 3

Structured Output: 4.0/5.0
Constrained Rewriting: 4.0/5.0
Tool Calling: 4.0/5.0
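Because many tests tie at 5.0 or 4.0, the top and bottom lists depend on how ties are broken. The sketch below is one guess at the selection rule, not the site's documented method: a stable sort over the scores in their listed order happens to reproduce both lists. It also shows that Classification, at 4.0, ties with the three listed weaknesses.

```python
# Scores in the order the page lists them (5-point tests only).
scores = {
    "Structured Output": 4.0,
    "Strategic Analysis": 5.0,
    "Constrained Rewriting": 4.0,
    "Creative Problem Solving": 5.0,
    "Tool Calling": 4.0,
    "Faithfulness": 5.0,
    "Classification": 4.0,
    "Long Context": 5.0,
    "Safety Calibration": 5.0,
    "Persona Consistency": 5.0,
    "Agentic Planning": 5.0,
    "Multilingual": 5.0,
    "Tabular Data": 5.0,
}

# sorted() is stable, so tied tests keep their listed order
# (this also holds with reverse=True).
top3 = sorted(scores, key=scores.get, reverse=True)[:3]
bottom3 = sorted(scores, key=scores.get)[:3]

# Every test sharing the lowest score, including ones cut from the bottom 3.
lowest = min(scores.values())
tied_lowest = [name for name, s in scores.items() if s == lowest]

print(top3)
print(bottom3)
print(tied_lowest)
```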

Similar models

MiMo-V2.5: $1.60, score 4.69
Qwen: Qwen3.6 Max Preview: $4.94, score 4.85
Claude Sonnet 4.6: $12.00, score 4.69
Qwen: Qwen3.5 Plus 2026-04-20: $1.42, score 4.62