OpenAI · active

GPT-5.4

OpenAI's mid-tier model. A long-context specialist with a 1.1M-token context window.

Overall score: 4.54 / 5.00 · ranked #11
Input: $2.50 per 1M tokens
Output: $15.00 per 1M tokens
Context: 1.1M tokens
Blended: $11.88 per 1M tokens (3:1 out:in ratio)


Scores by test

Structured Output: 5.0
Strategic Analysis: 5.0
Constrained Rewriting: 4.0
Creative Problem Solving: 4.0
Tool Calling: 4.0
Faithfulness: 5.0
Classification: 3.0
Long Context: 5.0
Safety Calibration: 5.0
Persona Consistency: 5.0
Agentic Planning: 5.0
Multilingual: 5.0
Tabular Data: 4.0
SWE-bench Verified: 76.9%
AIME 2025: 95.3%
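
The overall score of 4.54 is consistent with an unweighted mean of the thirteen 5-point test scores listed above. That the site actually computes it this way is an assumption; the sketch below only shows that the arithmetic matches. SWE-bench Verified and AIME 2025 are left out because they are percentages on a different scale.

```python
from statistics import mean

# Internal test scores on the 5-point scale, copied from the list above.
scores = {
    "Structured Output": 5.0,
    "Strategic Analysis": 5.0,
    "Constrained Rewriting": 4.0,
    "Creative Problem Solving": 4.0,
    "Tool Calling": 4.0,
    "Faithfulness": 5.0,
    "Classification": 3.0,
    "Long Context": 5.0,
    "Safety Calibration": 5.0,
    "Persona Consistency": 5.0,
    "Agentic Planning": 5.0,
    "Multilingual": 5.0,
    "Tabular Data": 4.0,
}

# Assumption: the overall score is a plain (unweighted) mean of these tests.
overall = mean(scores.values())
print(f"{overall:.2f}")  # prints 4.54
```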

What you need to know

GPT-5.4 is optimized for high-reliability autonomous workflows, with perfect 5.0/5.0 internal scores in agentic planning, structured output, and faithfulness. Its complex-reasoning performance is supported by a 95.3% score on AIME 2025 and a 76.9% success rate on SWE-bench Verified, making it a strong choice for software engineering and strategic analysis tasks.

The 1.1M-token context window accommodates very large inputs, and the model scores 5.0/5.0 on both long context and persona consistency. It is weaker at classification, where it scores 3.0/5.0, its lowest result across the internal tests.

At a blended cost of $11.88 per 1M tokens (3:1 output-to-input ratio), this is a premium-priced model. The output price of $15.00 per 1M tokens reflects its positioning as a high-intelligence engine rather than a cost-efficient utility for high-volume, simple tasks.
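
One reading of the blended figure that reproduces it is a weighted average of three parts output price to one part input price. The sketch below shows that arithmetic and applies the same list prices to a single request. The function names and the example token counts are hypothetical and chosen only for illustration.

```python
INPUT_PRICE = 2.50    # USD per 1M input tokens (listed price)
OUTPUT_PRICE = 15.00  # USD per 1M output tokens (listed price)


def blended_price(input_price: float, output_price: float, out_to_in: float = 3.0) -> float:
    """Weighted average price per 1M tokens.

    Assumption: "3:1 out:in ratio" means three output tokens are
    weighted for every one input token.
    """
    return (out_to_in * output_price + input_price) / (out_to_in + 1.0)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


# (3 * 15.00 + 2.50) / 4 = 11.875, shown on the page as $11.88.
blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)

# Hypothetical request: 200,000 input tokens and 2,000 output tokens.
example_cost = request_cost(200_000, 2_000)

print(blended)
print(example_cost)
```

Real workloads that read long contexts and emit short answers are input-heavy, so their effective per-token price can sit well below the blended figure.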

Use this model for complex agentic orchestration, large-scale codebase analysis, and tasks requiring strict adherence to structured formats. Skip this model for basic text classification or high-throughput applications where cost efficiency is prioritized over deep reasoning.

Strengths — Top 3

Structured Output: 5.0/5.0
Strategic Analysis: 5.0/5.0
Faithfulness: 5.0/5.0

Relative weaknesses — Bottom 3

Classification: 3.0/5.0
Constrained Rewriting: 4.0/5.0
Creative Problem Solving: 4.0/5.0

Similar models

MiMo-V2.5 · $1.60 · 4.69
Gemini 3.1 Flash Lite Preview · $1.19 · 4.46
MoonshotAI: Kimi K2.6 · $2.80 · 4.62
Google: Gemini 3.1 Flash Lite · $1.19 · 4.38