Updated July 2026 · 13 models evaluated

Best AI for math

Arithmetic, proofs, optimization, and symbolic reasoning.

Math benchmarks are the cleanest signal we have. Answers are verifiable, contamination is detectable, and reasoning-RL fine-tuning produces outsized gains — which is why thinking-style models (o-series, R1, Gemini thinking variants) dominate.

What matters: AIME, MATH Level 5, and GPQA-Diamond. If your application does anything quantitative — finance, logistics, scientific computing — pay the premium for a reasoning model. The gap between a frontier reasoning model and a frontier chat model on hard math problems can be 20+ percentage points.

Our math rank weights the math benchmark (2.5×), reasoning (1.5×), and structured output (0.5×). The structured output weight matters for applications that pipe results into downstream systems — a model that gets the right answer but formats it wrong is still broken.
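The weighting described above can be sketched as a weighted mean. This is a minimal sketch, not the site's actual scoring code: the page gives only the three weights, so the normalization by total weight, the 0-100 scale, the key names, and the example scores are all assumptions made for illustration.

```python
# Weights as stated in the text. Everything else here is an assumption:
# the page does not say how component scores are scaled or normalized.
WEIGHTS = {"math": 2.5, "reasoning": 1.5, "structured_output": 0.5}


def math_rank_score(scores: dict[str, float]) -> float:
    """Weighted mean of component scores, each assumed to be on a 0-100 scale."""
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return weighted_sum / total_weight


# Hypothetical component scores, not taken from the rankings.
example = {"math": 98.0, "reasoning": 96.0, "structured_output": 100.0}
print(round(math_rank_score(example), 1))
```

With these weights the math benchmark carries 2.5 / 4.5, or roughly 56%, of the composite, while structured output carries about 11%. A formatting failure therefore lowers a model's score without outweighing its math results.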

Full rankings

All 13 models, scored for math

Weighted composite · higher is better

#	Model	Provider	Task score	Input ($/MTok)	Output ($/MTok)	Context (tokens)
01	GPT-5	OpenAI	98.1%	$1.25	$10.00	400K
02	GPT-5 Mini	OpenAI	97.8%	$0.250	$2.00	400K
03	o4 Mini	OpenAI	97.8%	$1.10	$4.40	200K
04	o3	OpenAI	97.8%	$2.00	$8.00	200K
05	R1 0528	DeepSeek	96.6%	$0.500	$2.15	164K
06	GPT-5 Nano	OpenAI	95.2%	$0.050	$0.400	400K
07	R1	DeepSeek	93.1%	$0.700	$2.50	164K
08	GPT-4.1 Mini	OpenAI	87.3%	$0.400	$1.60	1.0M
09	GPT-4.1	OpenAI	83.0%	$2.00	$8.00	1.0M
10	GPT-4.1 Nano	OpenAI	70.0%	$0.100	$0.400	1.0M
11	GPT-4o	OpenAI	53.3%	$2.50	$10.00	128K
12	GPT-4o-mini	OpenAI	52.6%	$0.150	$0.600	128K
13	Llama 3.3 70B Instruct	Meta	41.6%	$0.100	$0.320	131K

Pricing — top 5 for math

Model	Blended price	Task score
GPT-5	$7.81/MTok	98.1%
GPT-5 Mini	$1.56/MTok	97.8%
o4 Mini	$3.58/MTok	97.8%
o3	$6.50/MTok	97.8%
R1 0528	$1.74/MTok	96.6%
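The single per-model prices above are consistent with a blend of one part input tokens to three parts output tokens, applied to the input and output prices in the full rankings. The page does not state this ratio. It is inferred from the listed figures, so treat the sketch below as an assumption that happens to reproduce them, not as the site's documented method.

```python
# Assumed blend: 25% input price + 75% output price.
# This ratio is inferred from the listed figures, not stated on the page.
INPUT_SHARE = 0.25
OUTPUT_SHARE = 0.75


def blended_price(input_price: float, output_price: float) -> float:
    """Blended cost in dollars per million tokens under the assumed 1:3 mix."""
    return INPUT_SHARE * input_price + OUTPUT_SHARE * output_price


# (input $/MTok, output $/MTok) from the full rankings.
PRICES = {
    "GPT-5": (1.25, 10.00),
    "GPT-5 Mini": (0.25, 2.00),
    "o4 Mini": (1.10, 4.40),
    "o3": (2.00, 8.00),
    "R1 0528": (0.50, 2.15),
}

# Blended prices as listed in the top-5 pricing section.
LISTED = {
    "GPT-5": 7.81,
    "GPT-5 Mini": 1.56,
    "o4 Mini": 3.58,
    "o3": 6.50,
    "R1 0528": 1.74,
}

for model, (inp, out) in PRICES.items():
    price = blended_price(inp, out)
    print(f"{model}: computed {price:.4f}, listed {LISTED[model]:.2f} $/MTok")
```

If your workload is input-heavy, for example long documents with short answers, changing `INPUT_SHARE` and `OUTPUT_SHARE` will give a different cost ordering than the one shown here.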

modelpicker.ai · powered by live benchmark data
