Claude Haiku 4.5 vs Devstral Small 1.1 for Constrained Rewriting
Winner: Claude Haiku 4.5. In our testing both Claude Haiku 4.5 and Devstral Small 1.1 score 3/5 on Constrained Rewriting (rank 31 of 52), so the task-specific tie is broken by supporting capabilities. Haiku 4.5 delivers higher faithfulness (5 vs 4), long-context handling (5 vs 4), and tool-calling (5 vs 4) in our benchmarks: capabilities that matter for reliably compressing text under hard character limits. Devstral Small 1.1 is materially cheaper (output cost $0.30/MTok vs Haiku's $5.00/MTok) and is a valid choice when budget and throughput trump absolute fidelity. We pick Claude Haiku 4.5 for quality-critical constrained rewriting and Devstral Small 1.1 when cost or scale is the primary constraint.
Pricing
Claude Haiku 4.5 (Anthropic): Input $1.00/MTok, Output $5.00/MTok
Devstral Small 1.1 (Mistral): Input $0.10/MTok, Output $0.30/MTok
Task Analysis
What Constrained Rewriting demands: per our benchmark description, the task is "compression within hard character limits." That requires precise fidelity to the source meaning, predictable structured output, robust handling of long inputs so important context isn't dropped, and sometimes programmatic tool sequencing when output must meet exact constraints.

No external benchmarks cover this task, so our verdict rests on internal scores. In our testing both models score 3/5 on Constrained Rewriting and share the same task rank (31 of 52). To break the tie, we compare supporting capabilities: Claude Haiku 4.5 scores faithfulness 5, long-context 5, tool-calling 5, and structured output 4; Devstral Small 1.1 scores faithfulness 4, long-context 4, tool-calling 4, and structured output 4. Those deltas explain why Haiku is better at preserving meaning and handling long-source compression.

Cost and runtime trade-offs also matter: Haiku has a 200k-token context window and supports up to 64k output tokens, while Devstral Small has a 131k-token context window with no reported output cap, which is worth checking when source size approaches those limits.
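The "programmatic tool sequencing" point above boils down to a validate-and-retry loop that enforces the hard character limit in code rather than trusting the model's first draft. A minimal sketch, assuming a hypothetical `call_model` function that stands in for whichever model API you use (it is stubbed here for illustration):

```python
def call_model(prompt: str) -> str:
    # Stub: a real implementation would send `prompt` to the chosen
    # model (e.g. Haiku or Devstral) and return its text response.
    return "Board approved the merger; closing expected Q3 pending review."

def rewrite_within_limit(text: str, limit: int, max_retries: int = 3) -> str:
    """Ask the model to compress `text`, re-prompting with the measured
    overshoot until the hard character limit is met or retries run out."""
    candidate = text
    prompt = f"Rewrite the following in at most {limit} characters:\n{text}"
    for _ in range(max_retries):
        candidate = call_model(prompt)
        if len(candidate) <= limit:
            return candidate
        prompt = (
            f"Your draft was {len(candidate)} characters; the hard limit "
            f"is {limit}. Shorten it further:\n{candidate}"
        )
    # Last resort: truncate rather than violate the hard limit.
    return candidate[:limit]
```

The loop feeds the measured character count back into the prompt, which in our experience is the reliable way to hit exact limits regardless of which model sits behind `call_model`.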
Practical Examples
1) High-fidelity legal or regulatory summarization that must fit strict character counts: choose Claude Haiku 4.5. In our tests Haiku's faithfulness (5 vs 4) and long-context handling (5 vs 4) reduce the risk of losing critical clauses during compression.
2) Social or marketing copy constrained to exact character counts but produced at scale and low cost: choose Devstral Small 1.1. Both models score 3/5 on the task itself, but Devstral's output cost is $0.30/MTok vs Haiku's $5.00/MTok, making bulk workflows far cheaper.
3) Tool-driven transformations where you call formatting/validation functions to enforce length limits: Claude Haiku 4.5 is preferable (tool-calling 5 vs 4) for more reliable argument selection and sequencing.
4) Long-document compression where context beyond typical windows matters: Haiku's 200,000-token context window and long-context score of 5 give it an edge over Devstral's 131,072-token window and score of 4 in our tests.
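For the tool-driven case in example 3, the validation function you hand the model can be as small as a character counter. A sketch of such a tool, using the common JSON Schema tool-definition shape; the tool name and handler are illustrative assumptions, not a fixed API:

```python
# Illustrative tool definition the orchestration layer registers with the
# model. The JSON Schema `input_schema` shape is widely used for tool
# calling; the name "check_length" is our own choice.
char_count_tool = {
    "name": "check_length",
    "description": (
        "Return the character count of a candidate rewrite and whether "
        "it fits the hard limit."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "candidate": {"type": "string"},
            "limit": {"type": "integer"},
        },
        "required": ["candidate", "limit"],
    },
}

def check_length(candidate: str, limit: int) -> dict:
    # Local handler the orchestration layer runs when the model invokes
    # the tool; the model then revises its draft based on the result.
    return {"length": len(candidate), "fits": len(candidate) <= limit}
```

Stronger tool-calling scores matter here because the model must pass the right arguments and act correctly on the `fits` result over several rounds.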
Bottom Line
For Constrained Rewriting, choose Claude Haiku 4.5 if you need higher fidelity, stronger long-context retention, or more reliable tool-assisted formatting (Haiku: faithfulness 5, long-context 5, tool-calling 5). Choose Devstral Small 1.1 if you must compress at high volume on a tight budget (Devstral output cost $0.30/MTok vs Haiku's $5.00/MTok) and can accept a modest drop in fidelity and context handling.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.