GPT-5.4 Mini
OpenAI's efficiency model. Context window: 400K tokens.
Scores by test
What you need to know
GPT-5.4 Mini stands out for reliability in structured output, faithfulness, and long-context processing, each scoring 5/5 in internal tests. Its 400K context window is paired with an AIME 2025 score of 87.2%, which indicates strong mathematical reasoning for a mini-tier model.
The pricing is moderate for the performance tier, with a blended cost of $3.56/MTok. While it excels in strategic analysis and multilingual tasks, it has significant deficits in handling tabular data and safety calibration, both scoring 2/5. These gaps suggest the model may struggle with precise data extraction from tables or strict adherence to safety guardrails.
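As a rough illustration of what the blended rate means in practice, the sketch below estimates the cost of a request from its total token count. It assumes the $3.56/MTok figure applies uniformly to every token; real billing prices input and output tokens separately, and the split behind the blended figure is not given here. The function name `estimate_cost_usd` is hypothetical.

```python
# Figures quoted in this review.
BLENDED_USD_PER_MTOK = 3.56        # blended price, USD per million tokens
CONTEXT_WINDOW_TOKENS = 400_000    # context window size


def estimate_cost_usd(total_tokens: int,
                      usd_per_mtok: float = BLENDED_USD_PER_MTOK) -> float:
    """Approximate request cost, treating all tokens at the blended rate.

    Assumption: input and output tokens are priced identically, which is
    only an approximation of how usage is actually billed.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * usd_per_mtok


if __name__ == "__main__":
    # Cost of one request that fills the whole context window.
    full_window = estimate_cost_usd(CONTEXT_WINDOW_TOKENS)
    print(f"Full 400K-token request: ${full_window:.2f}")
```

Under this assumption, a single request that fills the entire 400K window costs about $1.42, so long-document workloads add up quickly when run at volume.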
Use this model if your workflow requires reliable structured output, complex reasoning over large documents, or consistent persona maintenance. Skip it if your application relies heavily on tabular data processing or requires precise safety calibration.
Strengths — Top 3
Relative weaknesses — Bottom 3
Similar models