Question 1

Why did Claude Haiku 4.5 win for Writing?

Accepted Answer

In our tests, Claude Haiku 4.5 scores 3.5 on Writing versus 2.5 for Devstral Small 1.1. The decisive factor is a 4 vs 2 advantage on creative_problem_solving, backed by higher persona_consistency (5 vs 2), long_context (5 vs 4), and faithfulness (5 vs 4).

Question 2

Is Devstral Small 1.1 ever the better choice?

Accepted Answer

Yes. Devstral Small 1.1 is much cheaper per million tokens (input $0.10, output $0.30). It suits high-volume, templated copy, and cases where budget constraints outweigh the need for stronger creativity and persona control.
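
To put the quoted rates in concrete terms, here is a minimal Python sketch of the cost arithmetic. It assumes "mTok" means one million tokens. The function name and the example workload (10,000 requests at 2,000 input and 500 output tokens each) are hypothetical and are not figures from our tests.

```python
# Rates quoted above for Devstral Small 1.1, in USD per million tokens.
DEVSTRAL_INPUT_USD_PER_MTOK = 0.10
DEVSTRAL_OUTPUT_USD_PER_MTOK = 0.30


def estimate_cost_usd(input_tokens, output_tokens, input_rate, output_rate):
    """Estimate spend in USD from token counts and per-million-token rates."""
    input_cost = (input_tokens / 1_000_000) * input_rate
    output_cost = (output_tokens / 1_000_000) * output_rate
    return input_cost + output_cost


# Hypothetical templated-copy workload (assumed numbers, for illustration only).
requests = 10_000
input_tokens_per_request = 2_000
output_tokens_per_request = 500

total = estimate_cost_usd(
    requests * input_tokens_per_request,
    requests * output_tokens_per_request,
    DEVSTRAL_INPUT_USD_PER_MTOK,
    DEVSTRAL_OUTPUT_USD_PER_MTOK,
)
print(f"Estimated Devstral Small 1.1 cost: ${total:.2f}")
```

Passing another model's published rates to the same function gives a like-for-like comparison for your own volume.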

Question 3

How do both models handle tight character or headline constraints?

Accepted Answer

They tie on constrained_rewriting (3 vs 3) in our tests, so the two models are roughly equivalent at compressing content to hard limits. Haiku's higher persona_consistency score (5 vs 2) suggests it still preserves voice better across those rewrites.
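
As an arithmetic check, the two subtest scores quoted in these answers (creative_problem_solving at 4 vs 2, constrained_rewriting at 3 vs 3) average to the reported Writing scores of 3.5 and 2.5. The sketch below assumes the Writing score is an unweighted mean of those two subtests. That aggregation rule is an assumption consistent with the numbers, not something the payload confirms, and the function name is hypothetical.

```python
from statistics import mean

# Subtest scores as quoted in the answers above.
subtests = {
    "Claude Haiku 4.5": {"creative_problem_solving": 4, "constrained_rewriting": 3},
    "Devstral Small 1.1": {"creative_problem_solving": 2, "constrained_rewriting": 3},
}


def writing_score(scores):
    """ASSUMPTION: Writing score = unweighted mean of the listed subtests."""
    return mean(scores.values())


for model, scores in subtests.items():
    print(model, writing_score(scores))
```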

Question 4

Does modality or context window affect Writing outcomes?

Accepted Answer

Yes. Haiku accepts text and image input (text+image->text) and has a 200,000-token context window, versus Devstral's text-only input (text->text) and 131,072-token window. In our testing, that helps Haiku with long-form drafts and multimodal briefs.
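
A rough way to see where the window sizes matter is to check whether a given brief fits. This sketch assumes the prompt and the generated output share one context window. The function name and the 150,000-token brief with 8,000 tokens reserved for output are hypothetical.

```python
# Context window sizes quoted above, in tokens.
HAIKU_CONTEXT_TOKENS = 200_000
DEVSTRAL_CONTEXT_TOKENS = 131_072


def fits_in_window(prompt_tokens, reserved_output_tokens, window_tokens):
    """ASSUMPTION: prompt and output tokens count against the same window."""
    return prompt_tokens + reserved_output_tokens <= window_tokens


# Hypothetical long-form brief (assumed sizes, for illustration only).
prompt_tokens = 150_000
reserved_output_tokens = 8_000

print("Haiku fits:", fits_in_window(prompt_tokens, reserved_output_tokens, HAIKU_CONTEXT_TOKENS))
print("Devstral fits:", fits_in_window(prompt_tokens, reserved_output_tokens, DEVSTRAL_CONTEXT_TOKENS))
```

Token counts depend on each model's tokenizer, so measure them with the provider's own tooling instead of estimating from character counts.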

Question 5

Are there external benchmarks backing this Writing verdict?

Accepted Answer

No. externalBenchmark is null in the payload, so this comparison and winner call rely on our internal Writing task scores and subtest metrics.

Claude Haiku 4.5 vs Devstral Small 1.1 for Writing
