Claude Sonnet 4.6 vs Gemini 2.5 Pro for Constrained Rewriting
Tie on the core task, but Gemini 2.5 Pro is the practical winner for Constrained Rewriting. In our testing, both Claude Sonnet 4.6 and Gemini 2.5 Pro score 3/5 on the constrained_rewriting test (rank 31 of 52). Where they differ matters for real projects: Gemini scores 5 on structured_output versus Claude Sonnet 4.6's 4, and it is cheaper ($1.25 vs $3.00 per MTok input; $10.00 vs $15.00 per MTok output). Those two differences, stronger format/length adherence and lower pricing, make Gemini 2.5 Pro the better choice for most constrained-rewriting workflows. Choose Claude Sonnet 4.6 only when its stronger safety calibration (5 vs 1) or other Sonnet strengths are explicitly required alongside rewriting.
Pricing
- Claude Sonnet 4.6 (Anthropic): $3.00/MTok input, $15.00/MTok output
- Gemini 2.5 Pro (Google): $1.25/MTok input, $10.00/MTok output

modelpicker.net
Task Analysis
What Constrained Rewriting demands: precise compression under hard character limits, reliable adherence to length and format constraints, faithfulness to the source text, and creative rewording that preserves meaning while shortening. Because no external benchmark covers this task directly, our constrained_rewriting test (one of 12 internal tests) is the primary measure: both models scored 3/5 and rank 31 of 52 in our testing. Supporting signals explain the differences. structured_output measures schema/format compliance (Gemini 2.5 Pro = 5, Claude Sonnet 4.6 = 4); faithfulness is equal (5 each); and long_context is equal (5 each), indicating both preserve source content and context well. The practical edge: Gemini's higher structured_output score suggests it is more reliable at strict length/format enforcement, while Claude's higher safety_calibration (5 vs 1) matters when rewriting sensitive content that requires refusal or careful moderation.
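What strict length enforcement means in practice can be sketched as a small validate-and-retry harness. This is a minimal illustration, not a production pattern; `rewrite` is a hypothetical callable standing in for whichever model API you use, and the toy rewriter below exists only so the sketch runs on its own:

```python
def enforce_limit(rewrite, source, max_chars, attempts=3):
    """Call a rewriting function until its output fits a hard character limit.

    `rewrite` is a hypothetical callable (source, max_chars) -> str standing in
    for a real model API. Returns the first candidate within the limit, or the
    shortest candidate seen (hard-truncated) if every attempt overruns.
    """
    best = None
    for _ in range(attempts):
        candidate = rewrite(source, max_chars).strip()
        if len(candidate) <= max_chars:
            return candidate
        if best is None or len(candidate) < len(best):
            best = candidate
    return best[:max_chars]  # last resort: hard truncate

# Toy stand-in rewriter that just drops trailing words until the text fits:
def toy_rewriter(text, limit):
    words = text.split()
    while words and len(" ".join(words)) > limit:
        words.pop()
    return " ".join(words)

meta = enforce_limit(toy_rewriter, "A long product description " * 10, 155)
```

A harness like this is why a higher structured_output score matters: the more often the model lands inside the limit on the first attempt, the fewer retries (and tokens) you pay for.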
Practical Examples
1) Tight marketing meta descriptions (155-character limit): Gemini 2.5 Pro is preferable. Both models scored 3/5 on constrained_rewriting, but Gemini's structured_output of 5 vs 4 means it is likelier to meet exact character-count requirements without extra prompt engineering.
2) SMS or push-notification compression (≤160 characters): Gemini again has the advantage for reliable schema compliance and is cheaper per token ($10 vs $15 per MTok output), lowering per-message cost at scale.
3) Sensitive content that must be rewritten but also safety-reviewed: choose Claude Sonnet 4.6. Its safety_calibration is 5 vs Gemini's 1 in our testing, so Claude is more likely to refuse or safely transform risky inputs while maintaining faithfulness (5 for both).
4) Long-source summarization that must preserve precise phrasing: both models score 5 on long_context and faithfulness, so either can retain source detail; prefer Gemini when strict output format or cost matters, and Claude when safety handling is required.
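The per-message cost gap in the SMS example is easy to quantify. A back-of-envelope sketch, assuming roughly 4 characters per token (an approximation; real tokenizers vary) and the output prices listed above:

```python
# Back-of-envelope output cost for 160-character SMS rewrites at scale.
# CHARS_PER_TOKEN = 4 is an assumption; real tokenizers vary by text and model.
CHARS_PER_TOKEN = 4
OUTPUT_PRICE_PER_MTOK = {"claude-sonnet-4.6": 15.00, "gemini-2.5-pro": 10.00}

def output_cost(chars, price_per_mtok, messages=1):
    """Estimated output-token cost in dollars for `messages` rewrites."""
    tokens = chars / CHARS_PER_TOKEN
    return tokens / 1_000_000 * price_per_mtok * messages

# Output cost of one million 160-character messages:
claude = output_cost(160, OUTPUT_PRICE_PER_MTOK["claude-sonnet-4.6"], 1_000_000)
gemini = output_cost(160, OUTPUT_PRICE_PER_MTOK["gemini-2.5-pro"], 1_000_000)
# Under these assumptions: claude -> $600.00, gemini -> $400.00
```

Input tokens (the source text being compressed) add to both bills and widen the gap further, since Gemini's input price is also lower.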
Bottom Line
For Constrained Rewriting, choose Claude Sonnet 4.6 if you must prioritize safety calibration and cautious handling of risky input while still getting competent rewriting. Choose Gemini 2.5 Pro if you need stricter format/length enforcement and lower costs: it edges out Sonnet 4.6 in structured_output (5 vs 4) and is cheaper ($1.25/$10.00 vs $3.00/$15.00 per MTok). Both scored 3/5 on the constrained_rewriting test and rank 31 of 52 in our testing, so the decision hinges on format precision and cost versus safety needs.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.