R1 vs R1 0528
R1 0528 is the practical pick for most teams: it wins 5 of our 12 benchmarks and is cheaper ($0.50 input / $2.15 output per MTok). R1 still beats R1 0528 on strategic analysis and creative problem solving (5 vs 4 on both tests), so choose R1 when those two capabilities are mission-critical despite its higher price (about 16% more per output token and 40% more per input token).
- DeepSeek R1: input $0.70/MTok, output $2.50/MTok
- DeepSeek R1 0528: input $0.50/MTok, output $2.15/MTok
Benchmark Analysis
In our 12-test suite, R1 0528 wins 5 tests, R1 wins 2, and the remaining 5 are ties. Detailed comparison (all scores are from our testing):
- Tool calling: R1 0528 5 vs R1 4. R1 0528 is tied for 1st of 54 on tool calling, which matters for function selection, argument accuracy, and sequencing. Use R1 0528 for tool-driven apps.
- Classification: R1 0528 4 vs R1 2. R1 0528 is tied for 1st of 53; R1 ranks 51 of 53, making it a weak choice for routing and class-label tasks in our tests.
- Long context: R1 0528 5 vs R1 4. R1 0528 is tied for 1st of 55; expect better retrieval and coherence past 30k tokens with R1 0528.
- Safety calibration: R1 0528 4 vs R1 1. R1 0528 ranks 6 of 55 in our tests vs R1's 32 of 55; it is better at refusing harmful requests while permitting legitimate ones.
- Agentic planning: R1 0528 5 vs R1 4. R1 0528 is tied for 1st of 54; it is better at goal decomposition and recovery.
- Strategic analysis: R1 5 vs R1 0528 4. R1 is tied for 1st of 54; prefer R1 when fine-grained tradeoff reasoning with numbers is required.
- Creative problem solving: R1 5 vs R1 0528 4. R1 is tied for 1st; it produces more non-obvious, specific, feasible ideas in our tests.
- Ties (structured output, constrained rewriting, faithfulness, persona consistency, multilingual): both models scored equal; for example, both score 5 on persona consistency and faithfulness.

External math benchmarks (Epoch AI): on MATH Level 5, R1 scores 93.1% vs 96.6% for R1 0528 (R1 0528 ranks 5 of 14, R1 ranks 8 of 14). On AIME 2025, R1 scores 53.3% vs 66.4% for R1 0528 (R1 0528 ranks 16 of 23, R1 ranks 17 of 23). We cite these Epoch AI results as supplementary evidence that R1 0528 is stronger on higher-difficulty math tasks in these external measures.

Overall: R1 0528 is the better fit for tool-driven, long-context, safety-sensitive, and agentic workflows; R1 is the better pick for strategic numeric reasoning and creative ideation in our tests.
Pricing Analysis
Per-token list prices: R1 is $0.70 input / $2.50 output per MTok; R1 0528 is $0.50 input / $2.15 output per MTok. Assuming a 50/50 input:output split, monthly costs are:
- 1B tokens: R1 = $1,600; R1 0528 = $1,325 (save $275/month)
- 10B tokens: R1 = $16,000; R1 0528 = $13,250 (save $2,750/month)
- 100B tokens: R1 = $160,000; R1 0528 = $132,500 (save $27,500/month)

If your app is output-heavy (more output tokens than input), the output-rate gap ($2.50 vs $2.15 per MTok) magnifies savings: R1 costs $2,500 per 1B output tokens vs $2,150 for R1 0528, a $350 difference per 1B output tokens. Teams with large volumes (10B+ tokens/month) or tight margins should prefer R1 0528 for cost efficiency; teams that need the specific strengths where R1 wins may accept the roughly 16% higher output rate ($2.50/$2.15 ≈ 1.16). The sketch in the next section makes the arithmetic explicit.
Real-World Cost Comparison
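To make the numbers above concrete, here is a minimal cost-model sketch in Python. The prices are the list prices quoted in this article; the 50/50 split and the monthly volumes match the table's assumptions, and the names (`PRICES`, `monthly_cost`) are our own illustration, not any vendor API.

```python
# Minimal cost model for the comparison above. Prices are the $/MTok list
# prices from this article; the function and names are illustrative only.
PRICES = {
    "R1": {"input": 0.70, "output": 2.50},
    "R1 0528": {"input": 0.50, "output": 2.15},
}

def monthly_cost(model: str, total_mtok: float, output_share: float = 0.5) -> float:
    """Dollar cost for total_mtok million tokens at the given output share."""
    p = PRICES[model]
    return total_mtok * ((1 - output_share) * p["input"] + output_share * p["output"])

for mtok in (1_000, 10_000, 100_000):  # 1B, 10B, 100B tokens per month
    r1, r1_0528 = monthly_cost("R1", mtok), monthly_cost("R1 0528", mtok)
    print(f"{mtok:>7,} MTok: R1 ${r1:,.0f} vs R1 0528 ${r1_0528:,.0f} "
          f"(save ${r1 - r1_0528:,.0f}/month)")

# Output-heavy workloads widen the gap: at 80% output tokens the savings on
# 1B tokens grow from $275/month (50/50 split) to about $320/month.
heavy = (monthly_cost("R1", 1_000, output_share=0.8)
         - monthly_cost("R1 0528", 1_000, output_share=0.8))
print(f"80% output, 1B tokens: save ${heavy:,.0f}/month")
```

Swapping in your real input:output ratio is the main thing to change; at a 50/50 split the savings understate what an output-heavy app would see.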
Bottom Line
Choose R1 0528 if: you need best-in-class tool calling, long-context coherence (tied for 1st), stronger safety calibration, agentic planning, and lower cost ($0.50 input / $2.15 output per MTok). It also posts higher external math scores (96.6% on MATH Level 5 and 66.4% on AIME 2025, per Epoch AI). Choose R1 if: your product demands top-tier strategic analysis or creative problem solving (R1 scored 5 vs 4 on both in our tests) and you will accept the higher per-token price ($0.70 input / $2.50 output per MTok, about 16% more per output token) for those strengths.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.
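As a purely hypothetical illustration of what such a scoring loop can look like (the rubric text, the `call_judge` stub, and the function names below are our placeholders, not the actual harness; the methodology page has the real details):

```python
# Hypothetical sketch of a 1-5 LLM-judge scoring loop. The judge call is
# stubbed out; the real prompts, judge model, and rubrics live in the
# methodology, not in this sketch.
from statistics import mean

RUBRIC = ("Score the response from 1 (fails the task) to 5 (flawless), "
          "judging only this benchmark's criteria.")

def call_judge(benchmark: str, prompt: str, response: str) -> int:
    """Placeholder: a real harness would send RUBRIC plus the case to a judge model."""
    raise NotImplementedError

def score_benchmark(benchmark: str, cases: list[tuple[str, str]]) -> float:
    # Each case is (prompt, model_response); the benchmark score is the
    # mean of the per-case 1-5 judge scores.
    scores = [call_judge(benchmark, prompt, response) for prompt, response in cases]
    assert all(1 <= s <= 5 for s in scores), "judge must return 1-5"
    return mean(scores)
```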