R1 0528 vs GPT-5 Mini
For most general-purpose and structured-output workloads, GPT-5 Mini is the better pick: it wins the strategic-analysis and structured-output tests and is cheaper per token. R1 0528 is the winner for tool-heavy, agentic workflows and safety-sensitive tasks (it scores 5/5 on tool_calling and agentic_planning in our tests) but it costs more, especially on input tokens.
Model               Input          Output
deepseek R1 0528    $0.500/MTok    $2.15/MTok
openai GPT-5 Mini   $0.250/MTok    $2.00/MTok
Benchmark Analysis
We ran our 12-test suite and report wins and ties from our testing.

R1 0528 wins:
- tool_calling: R1 5 vs GPT-5 Mini 3. R1 is tied for 1st (with 16 others); GPT-5 Mini ranks 47/54.
- safety_calibration: R1 4 vs GPT-5 Mini 3. R1 ranks 6/55; GPT-5 Mini ranks 10/55.
- agentic_planning: R1 5 vs GPT-5 Mini 4. R1 is tied for 1st; GPT-5 Mini ranks 16/54.

GPT-5 Mini wins:
- structured_output: GPT-5 Mini 5 vs R1 4. GPT-5 Mini is tied for 1st (with 24 others); R1 ranks 26/54.
- strategic_analysis: GPT-5 Mini 5 vs R1 4. GPT-5 Mini is tied for 1st; R1 ranks 27/54.

Ties (same numeric score in our testing): constrained_rewriting (4/4), creative_problem_solving (4/4), faithfulness (5/5), classification (4/4), long_context (5/5), persona_consistency (5/5), multilingual (5/5).

Practical implications: R1's 5/5 on tool_calling and agentic_planning means it selects and sequences functions more reliably in our tests, which is critical for multi-step tool-driven agents and automation. GPT-5 Mini's 5/5 on structured_output and strategic_analysis means it better follows strict JSON schemas and handles nuanced tradeoff reasoning in our tests, which matters for APIs that demand exact formats and for financial or analytical prompts.

Note one quirk: R1 0528 has a documented behavior of returning empty responses on some short structured_output, constrained_rewriting, and agentic_planning tasks, and it uses reasoning tokens that consume output budget. Our test results reflect functionality, but you must account for this quirk in production.

On third-party math benchmarks (Epoch AI): MATH Level 5: R1 96.6% vs GPT-5 Mini 97.8%; AIME 2025: R1 66.4% vs GPT-5 Mini 86.7%; GPT-5 Mini also reports 64.7% on SWE-bench Verified. These external scores supplement our internal results and show GPT-5 Mini leads clearly on higher-difficulty contest math (AIME 2025) and slightly on MATH Level 5 in Epoch AI data.
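The empty-response quirk above is straightforward to guard against with a retry wrapper. This is a minimal sketch; `call_with_empty_retry` and the stub responses are illustrative names, not part of any provider SDK, and the caller supplies whatever client call they actually use.

```python
def call_with_empty_retry(call_fn, max_retries=2):
    """Retry a model call that returns an empty string.

    Guards against R1 0528's documented quirk of occasionally
    returning empty responses on short structured-output,
    constrained-rewriting, and agentic-planning tasks.
    call_fn: zero-argument callable returning the model's text output.
    """
    for attempt in range(max_retries + 1):
        text = call_fn()
        if text and text.strip():
            return text
    raise RuntimeError(f"Empty response after {max_retries + 1} attempts")

# Stubbed example: the first call returns empty, the second succeeds.
responses = iter(["", '{"status": "ok"}'])
result = call_with_empty_retry(lambda: next(responses))
# result == '{"status": "ok"}'
```

Remember that each retry re-spends input tokens, and R1's reasoning tokens count against output, so cap `max_retries` low and log empty responses rather than retrying indefinitely.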
Pricing Analysis
Using the listed prices (R1 0528: input $0.50/MTok, output $2.15/MTok; GPT-5 Mini: input $0.25/MTok, output $2.00/MTok), the combined input-plus-output price is $2.65/MTok for R1 and $2.25/MTok for GPT-5 Mini, where 1 MTok = 1 million tokens. At 1,000 MTok of input and 1,000 MTok of output per month, monthly cost would be: R1 = $2,650 vs GPT-5 Mini = $2,250 (difference $400). At 10,000 MTok of each: R1 = $26,500 vs GPT-5 Mini = $22,500 (difference $4,000). At 100,000 MTok of each: R1 = $265,000 vs GPT-5 Mini = $225,000 (difference $40,000). Who should care: any team doing high-volume retrieval or input-heavy workloads should note that R1's input price is double GPT-5 Mini's ($0.50 vs $0.25/MTok). Conversely, if output tokens dominate cost, the gap is smaller (R1 $2.15 vs GPT-5 Mini $2.00/MTok), though R1's reasoning tokens count against output and can inflate the effective output volume.
Bottom Line
Choose R1 0528 if: you build agentic systems or multi-step tool chains where our tests show R1's tool_calling (5/5) and agentic_planning (5/5) advantages and stronger safety calibration (4/5) matter. Accept the higher input costs and account for R1's structured_output quirk.

Choose GPT-5 Mini if: you need strict schema compliance, nuanced strategic analysis, or lower token costs for high-volume general-purpose tasks. GPT-5 Mini scored 5/5 on structured_output and strategic_analysis in our testing and has lower input and output prices ($0.25/MTok in, $2.00/MTok out).
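The decision rule above can be distilled into a routing sketch. The tags and thresholds here are illustrative assumptions layered on our test results, not part of the published benchmark.

```python
def pick_model(workload):
    """Illustrative model router based on the comparison above.

    workload: a set of tags describing the deployment
    (the tag vocabulary is an assumption for this sketch).
    """
    # Strict schema compliance or cost-sensitive volume favors
    # GPT-5 Mini (5/5 structured_output, lower $/MTok).
    if "strict_json" in workload or "high_volume" in workload:
        return "GPT-5 Mini"
    # Tool-heavy agents favor R1 0528 (5/5 tool_calling and
    # agentic_planning in our tests).
    if "tool_calling" in workload or "agentic" in workload:
        return "R1 0528"
    return "GPT-5 Mini"  # cheaper default for general-purpose work

pick_model({"agentic"})                 # 'R1 0528'
pick_model({"agentic", "strict_json"})  # 'GPT-5 Mini'
```

In practice you would weight these tags against your actual token mix and retry budget rather than hard-coding a priority order.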
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.