Claude Haiku 4.5 vs Devstral Small 1.1 for Writing
Claude Haiku 4.5 is the winner for Writing in our testing. It scores 3.5 vs Devstral Small 1.1's 2.5 on the Writing task in our suite, driven by a 4 vs 2 advantage on creative_problem_solving and stronger supporting abilities (persona_consistency 5 vs 2, long_context 5 vs 4, faithfulness 5 vs 4). Devstral Small 1.1 is lower-cost (input $0.10 / output $0.30 per mTok) and can be suitable for high-volume templated copy, but it trails on the creative and persona-driven aspects required for high-quality blog and marketing content.
anthropic
Claude Haiku 4.5
Benchmark Scores
External Benchmarks
Pricing
Input
$1.00/MTok
Output
$5.00/MTok
modelpicker.net
mistral
Devstral Small 1.1
Benchmark Scores
External Benchmarks
Pricing
Input
$0.100/MTok
Output
$0.300/MTok
modelpicker.net
Task Analysis
Writing (blog posts, marketing copy, content creation) demands creative idea generation, tight constrained rewrites (headlines, length limits), consistent voice/persona across documents, long-context coherence for multi-section drafts, and reliable faithfulness to brief. Our Writing task uses two focused tests: creative_problem_solving and constrained_rewriting. In our testing Claude Haiku 4.5 scores 4 on creative_problem_solving vs Devstral Small 1.1's 2, while both score 3 on constrained_rewriting. Haiku's higher persona_consistency (5 vs 2), long_context (5 vs 4), faithfulness (5 vs 4) and tool_calling (5 vs 4) explain why it produces more coherent, on-brief, and personality-rich drafts. Note also modality and context differences: Haiku supports text+image->text and a 200,000-token window; Devstral Small 1.1 is text->text with a 131,072-token window. There is no third-party external benchmark for Writing included in the payload, so this verdict is based on our internal task scores and supporting subtest metrics.
Practical Examples
High-effort brand storytelling: Choose Claude Haiku 4.5 (creative_problem_solving 4 vs 2) — it better generates original campaign concepts, multi-section blog drafts, and maintains persona across a long draft (persona_consistency 5 vs 2; long_context 5 vs 4). Headline/character-limited rewrites: Both models tie on constrained_rewriting (3 vs 3), so either can meet strict length constraints, but Haiku will better preserve voice. Rapid volume A/B marketing copy: Devstral Small 1.1 is practical for low-cost bulk variants (input $0.10 / output $0.30 per mTok) when creativity demand is moderate. Fact-sensitive product descriptions: Haiku's faithfulness 5 vs 4 reduces editing. Structured outputs and formatting (JSON/article templates) are equal (structured_output 4 vs 4), so both handle schema-based generation similarly.
Bottom Line
For Writing, choose Claude Haiku 4.5 if you need stronger creative idea generation, long-context drafting, and consistent persona (task score 3.5 vs 2.5). Choose Devstral Small 1.1 if you prioritize lower per-token cost (input $0.10 / output $0.30 per mTok) for high-volume, template-driven copy and can accept lower creative and persona fidelity.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.