Claude Haiku 4.5 vs Claude Sonnet 4.6 for Constrained Rewriting

Winner: Claude Haiku 4.5. In our constrained-rewriting tests both models score 3/5 and share the same ranked position, but Claude Haiku 4.5 is the practical winner: it delivers equivalent constrained-rewriting quality at roughly one-third the output cost ($5/MTok vs $15/MTok) and is described as faster and more efficient. Choose Claude Sonnet 4.6 when you need stronger safety calibration (5 vs 2), stronger creative problem solving (5 vs 4), or a much larger context window (1,000,000 vs 200,000 tokens).

anthropic

Claude Haiku 4.5

Overall
4.33/5 (Strong)

Benchmark Scores

Faithfulness
5/5
Long Context
5/5
Multilingual
5/5
Tool Calling
5/5
Classification
4/5
Agentic Planning
5/5
Structured Output
4/5
Safety Calibration
2/5
Strategic Analysis
5/5
Persona Consistency
5/5
Constrained Rewriting
3/5
Creative Problem Solving
4/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
N/A
AIME 2025
N/A

Pricing

Input

$1.00/MTok

Output

$5.00/MTok

Context Window: 200K

modelpicker.net

anthropic

Claude Sonnet 4.6

Overall
4.67/5 (Strong)

Benchmark Scores

Faithfulness
5/5
Long Context
5/5
Multilingual
5/5
Tool Calling
5/5
Classification
4/5
Agentic Planning
5/5
Structured Output
4/5
Safety Calibration
5/5
Strategic Analysis
5/5
Persona Consistency
5/5
Constrained Rewriting
3/5
Creative Problem Solving
5/5

External Benchmarks

SWE-bench Verified
75.2%
MATH Level 5
N/A
AIME 2025
85.8%

Pricing

Input

$3.00/MTok

Output

$15.00/MTok

Context Window: 1,000K


Task Analysis

What Constrained Rewriting demands: per our benchmark description, it is primarily "compression within hard character limits." That requires faithful content compression (faithfulness), strict schema and format adherence when truncation rules apply (structured_output), consistent handling of long source material (long_context), and sometimes creative rephrasing to preserve nuance in fewer characters (creative_problem_solving).

In our testing, both Claude Haiku 4.5 and Claude Sonnet 4.6 scored 3/5 on constrained_rewriting and hold the same ranked position ("rank 31 of 53 (22 models share this score)"), so core compression capability is equivalent on this task. Supporting signals reinforce that: both models score 4 on structured_output and 5 on faithfulness and long_context, indicating they preserve source material and handle long inputs well.

The differences that matter are operational: Sonnet 4.6 scores higher on safety_calibration (5 vs 2) and creative_problem_solving (5 vs 4), which helps when compressed outputs must avoid policy issues or require inventive condensation; Haiku 4.5 is described as faster and more cost-efficient, which matters for high-volume batch rewriting.
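To make the "hard character limit" requirement concrete, here is a minimal sketch of the post-processing guard such a pipeline typically needs around any model's rewrite. The model call itself is omitted; `within_limit` and `enforce_limit` are hypothetical helper names, not part of any vendor API, and the word-boundary truncation fallback is one possible policy among many.

```python
def within_limit(text: str, limit: int) -> bool:
    """True when a rewrite respects the hard character limit."""
    return len(text) <= limit


def enforce_limit(text: str, limit: int) -> str:
    """Fallback for an overlong rewrite: cut at the last word
    boundary that fits, then append an ellipsis character."""
    if within_limit(text, limit):
        return text
    # Reserve one character for the ellipsis, then drop the
    # trailing partial word so the cut lands on a boundary.
    cut = text[: limit - 1].rsplit(" ", 1)[0]
    return cut.rstrip() + "…"
```

In practice a pipeline would retry the model with stricter instructions before falling back to truncation, since truncation can discard meaning that a genuine rewrite would preserve.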

Practical Examples

When to pick each model — grounded in our scores and costs:

  • Claude Haiku 4.5 (recommended winner for most constrained-rewrite workloads): batch-compressing product descriptions to a 280-character limit across thousands of SKUs. Both models score 3/5 on the task, but Haiku's output cost is $5/MTok versus Sonnet's $15/MTok, so the per-token budget is ~3x lower while retaining structured_output 4 and faithfulness 5.
  • Claude Sonnet 4.6 (recommended when nuance, safety, or extreme context matter): compressing legal disclaimers or medical summaries where refusal rules and subtle safety tradeoffs matter. Sonnet's safety_calibration is 5 (vs Haiku's 2) and creative_problem_solving is 5 (vs 4), so it is better at safe, inventive condensation. Sonnet also offers a 1,000,000-token context window and 128,000 max output tokens versus Haiku's 200,000 and 64,000, which helps when the rewrite must draw on very long source material.
  • Edge case: single, highly creative marketing lines where the brief must preserve tone and inventiveness. Sonnet's creative_problem_solving advantage helps here; for large-scale, cost-sensitive pipelines, Haiku is the pragmatic choice. Both models tie on constrained_rewriting (3/5) and share the same ranked display in our data ("rank 31 of 53 (22 models share this score)").
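The ~3x cost gap above is easy to sanity-check with back-of-envelope arithmetic using the output prices quoted in the pricing cards ($5/MTok for Haiku 4.5, $15/MTok for Sonnet 4.6). The batch size and per-item token count below are illustrative assumptions, not measured figures.

```python
def batch_output_cost(n_items: int, tokens_per_item: int,
                      usd_per_mtok: float) -> float:
    """Output-token cost in USD for a batch rewrite job."""
    return n_items * tokens_per_item * usd_per_mtok / 1_000_000


# Assumed workload: 10,000 SKUs, ~100 output tokens per rewrite.
haiku_cost = batch_output_cost(10_000, 100, 5.0)    # → $5.00
sonnet_cost = batch_output_cost(10_000, 100, 15.0)  # → $15.00
```

Input-token costs ($1 vs $3 per MTok) scale by the same 3x factor, so the ratio holds for the whole job even though the absolute totals depend on prompt length.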

Bottom Line

For Constrained Rewriting, choose Claude Haiku 4.5 if you need equivalent compression quality at much lower cost and faster throughput ($5/MTok vs $15/MTok output; described as faster and more efficient). Choose Claude Sonnet 4.6 if you must prioritize safety calibration, richer creative condensation, or ultra-long context (safety_calibration 5 vs 2; creative_problem_solving 5 vs 4; context window 1,000,000 vs 200,000 tokens).

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.

Frequently Asked Questions