Claude Haiku 4.5 vs R1 0528 for Constrained Rewriting

R1 0528 is the better choice for Constrained Rewriting in our testing. It scores 4 versus Claude Haiku 4.5's 3 on the constrained_rewriting benchmark and ranks 6th versus 31st among the models we tested. That edge reflects more reliable compression within hard character limits plus better safety calibration (4 vs 2). However, R1 0528 has an operational quirk: it can return empty responses on constrained_rewriting unless configured with a high max completion tokens setting (see the practical notes below). If you cannot or will not tune runtime parameters, Claude Haiku 4.5 is the safer out-of-the-box fallback despite its lower constrained_rewriting score.

anthropic

Claude Haiku 4.5

Overall
4.33/5 (Strong)

Benchmark Scores

Faithfulness
5/5
Long Context
5/5
Multilingual
5/5
Tool Calling
5/5
Classification
4/5
Agentic Planning
5/5
Structured Output
4/5
Safety Calibration
2/5
Strategic Analysis
5/5
Persona Consistency
5/5
Constrained Rewriting
3/5
Creative Problem Solving
4/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
N/A
AIME 2025
N/A

Pricing

Input

$1.00/MTok

Output

$5.00/MTok

Context Window: 200K

modelpicker.net

deepseek

R1 0528

Overall
4.50/5 (Strong)

Benchmark Scores

Faithfulness
5/5
Long Context
5/5
Multilingual
5/5
Tool Calling
5/5
Classification
4/5
Agentic Planning
5/5
Structured Output
4/5
Safety Calibration
4/5
Strategic Analysis
4/5
Persona Consistency
5/5
Constrained Rewriting
4/5
Creative Problem Solving
4/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
96.6%
AIME 2025
66.4%

Pricing

Input

$0.50/MTok

Output

$2.15/MTok

Context Window: 164K


Task Analysis

Constrained Rewriting demands accurate compression into strict character limits while preserving meaning and required content. The key capabilities are faithfulness (staying true to the source), predictable structured output when schemas are required, the ability to compress without dropping mandatory elements, and stable behavior under tight token budgets. In our testing the primary signal is the constrained_rewriting benchmark score (no external benchmark applies to this task). Supporting evidence: both models have top-tier faithfulness (5) and long_context (5) scores, meaning both retain source fidelity and handle long inputs. Two differences explain R1 0528's win: a higher constrained_rewriting score (4 vs 3) and stronger safety_calibration (4 vs 2) in our tests. Operational factors matter too: Claude Haiku 4.5 offers a larger context window (200K tokens) and explicit multimodal support (text+image->text), while R1 0528 requires a high max completion tokens setting to avoid empty outputs on short tasks. Output cost and token behavior also affect real-world throughput and batching costs (see the practical examples).
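The output-cost point can be made concrete with a quick per-batch estimate from the listed prices. A minimal sketch, assuming a hypothetical workload (the batch size and tokens-per-string figures below are illustrative, not from our benchmark):

```python
# Sketch: estimate output cost for a batch compression job from $/MTok prices.
# The listed output prices are $2.15/MTok (R1 0528) and $5.00/MTok
# (Claude Haiku 4.5); the workload numbers are hypothetical.

def output_cost_usd(total_output_tokens: int, price_per_mtok: float) -> float:
    """Cost in USD of generating a given number of output tokens."""
    return total_output_tokens / 1_000_000 * price_per_mtok

# Hypothetical job: 100,000 compressed strings at ~40 output tokens each.
tokens = 100_000 * 40  # 4M output tokens

r1_cost = output_cost_usd(tokens, 2.15)     # R1 0528 output price
haiku_cost = output_cost_usd(tokens, 5.00)  # Claude Haiku 4.5 output price
```

At these rates the same 4M-token batch costs roughly $8.60 on R1 0528 versus $20.00 on Claude Haiku 4.5, which is why output price dominates high-volume compression workloads.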

Practical Examples

  1. High-volume UI text compression (batch subtitles or UI strings): R1 0528 shines. It scored 4 vs 3, and its output cost of $2.15/MTok versus Claude Haiku 4.5's $5.00/MTok lowers per-output expense when compressing many strings. Set a high max completion tokens value to avoid empty responses from R1 0528's empty-response quirk on structured_output and constrained_rewriting.
  2. One-off, long-source compressions (e.g. novel-chapter summaries with strict character limits): Claude Haiku 4.5 is preferable when you want a predictable out-of-the-box run; it has a larger context window (200K tokens) and does not carry R1 0528's empty-response quirk. Expect a lower constrained_rewriting score in our tests (3 vs 4) but fewer runtime adjustments.
  3. Safety-sensitive compression (redacting or compressing user content where refusal calibration matters): R1 0528 scored higher on safety_calibration (4 vs 2). In our testing it handled harmful and edge-case requests with more conservative calibration while still achieving the higher constrained_rewriting score.
  4. Structured output pipelines: both models score 4 on structured_output in our tests, but R1 0528 carries an operational note that it returns empty responses on structured_output unless configured correctly; plan for that in automated pipelines.
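The mitigation described above can be sketched as a request builder that pins a generous max completion tokens budget, plus a validator that treats empty or over-limit replies as failures to retry. This assumes an OpenAI-style chat-completions payload; the model name, token budget, and prompt wording below are illustrative assumptions, not settings taken from our benchmark:

```python
# Sketch: request config + output validation for constrained rewriting.
# Assumes an OpenAI-compatible chat API; "deepseek-reasoner" and the
# 8192-token budget are illustrative, not benchmark-verified values.
from typing import Optional


def build_rewrite_request(source: str, char_limit: int,
                          model: str = "deepseek-reasoner",
                          max_tokens: int = 8192) -> dict:
    """Build a chat-completions payload with a deliberately high
    max_tokens budget to reduce the chance of empty replies."""
    return {
        "model": model,
        "max_tokens": max_tokens,  # generous budget mitigates empty outputs
        "messages": [
            {"role": "system",
             "content": (f"Rewrite the user's text in at most {char_limit} "
                         "characters. Preserve all required facts.")},
            {"role": "user", "content": source},
        ],
    }


def validate_rewrite(reply: Optional[str], char_limit: int) -> Optional[str]:
    """Return the reply if it is non-empty and within the character
    limit; return None so the caller can retry or fall back."""
    text = (reply or "").strip()
    if not text or len(text) > char_limit:
        return None
    return text
```

In an automated pipeline, a None from validate_rewrite would trigger one retry (possibly with a larger max_tokens) and then a fallback model, which covers both the empty-response quirk and over-limit outputs.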

Bottom Line

For Constrained Rewriting, choose R1 0528 if you need the better constrained_rewriting score in our tests (4 vs 3), lower output cost ($2.15 vs $5.00 per MTok), and stronger safety calibration, and you can set a high max completion tokens value to avoid empty responses. Choose Claude Haiku 4.5 if you need a larger context window (200K tokens), multimodal text+image->text input support, or fewer runtime tweaks for guaranteed non-empty outputs, despite the lower constrained_rewriting score.

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.

Frequently Asked Questions