Claude Haiku 4.5 vs DeepSeek V3.1 Terminus for Constrained Rewriting
Winner: Claude Haiku 4.5. In our testing both models score 3/5 on Constrained Rewriting and share the same rank (31 of 52), but Claude Haiku 4.5 is the better choice when factual fidelity and very large input context matter. Haiku leads on faithfulness (5 vs 3), offers a larger context window (200,000 vs 163,840 tokens), and accepts image input. DeepSeek V3.1 Terminus counters with stronger structured-output compliance (5 vs 4) and far lower prices ($0.21/$0.79 vs $1.00/$5.00 per MTok input/output), making it the practical pick for high-volume, strictly formatted, cost-sensitive rewriting. Our verdict narrowly favors Haiku 4.5 because constrained rewriting prioritizes faithful compression under hard limits, and Haiku's higher faithfulness score and larger context window better preserve source meaning under tight constraints.
Pricing
Claude Haiku 4.5 (Anthropic): $1.00/MTok input, $5.00/MTok output
DeepSeek V3.1 Terminus (DeepSeek): $0.21/MTok input, $0.79/MTok output
Task Analysis
What Constrained Rewriting demands: compressing text within hard character limits while preserving key facts, adhering strictly to length and format constraints, and producing deterministic, schema-compliant output when required. The most important capabilities are faithfulness (no dropped or invented facts), structured-output compliance (exact-length or schema-following text), long-context handling (so truncation doesn't lose crucial source material), and predictable behavior under low temperature and regeneration.
In our testing both Claude Haiku 4.5 and DeepSeek V3.1 Terminus score 3/5 on the constrained_rewriting test and tie in rank (31 of 52). Supporting metrics explain the split: Haiku scores 5/5 on faithfulness and 5/5 on long_context, indicating strength in preserving source facts across long inputs, while DeepSeek scores 5/5 on structured_output, showing stricter format and schema compliance. Cost and context-window numbers also matter: Haiku has a 200,000-token window at $1.00 input / $5.00 output per MTok, while DeepSeek has a 163,840-token window at $0.21 input / $0.79 output per MTok. Weigh the faithfulness/structured_output tradeoff against the specific constrained-rewrite need, as in the sketch below.
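A minimal sketch of the "hard limit plus regeneration" workflow, using the Anthropic Python SDK: request the rewrite at temperature 0 and regenerate if the output exceeds the character budget. The model identifier and the 500-character limit are illustrative assumptions, not values from our test harness.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
HARD_LIMIT = 500                # assumed hard character limit for the rewrite

def rewrite_within_limit(source_text: str, max_attempts: int = 3) -> str:
    """Ask for a summary of at most HARD_LIMIT characters, retrying on overruns."""
    prompt = (
        f"Rewrite the following text as a summary of at most {HARD_LIMIT} characters. "
        "Preserve every key fact and add nothing new.\n\n" + source_text
    )
    text = ""
    for _ in range(max_attempts):
        response = client.messages.create(
            model="claude-haiku-4-5",  # assumed model ID; check the provider's current list
            max_tokens=300,
            temperature=0,             # low temperature for more predictable length behavior
            messages=[{"role": "user", "content": prompt}],
        )
        text = response.content[0].text.strip()
        if len(text) <= HARD_LIMIT:
            return text
    return text[:HARD_LIMIT]           # last-resort hard truncation if the model keeps overrunning

The same loop works against any chat-completions-style API; only the client call changes.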
Practical Examples
When Claude Haiku 4.5 shines: rewriting a 40k-token technical spec into a 500-character executive summary while preserving accuracy, where Haiku's 5/5 faithfulness and 5/5 long_context reduce the risk of omitted or hallucinated facts. When the source includes images that must be condensed to tight caption limits, Haiku's text-and-image input and larger window are advantageous.
When DeepSeek V3.1 Terminus shines: converting freeform product descriptions into exact-length JSON fields or 140-character marketing blurbs at scale, where DeepSeek's 5/5 structured_output enforces schema and length reliably and its lower cost ($0.21 input / $0.79 output per MTok) makes batching thousands of rewrites economical.
Concrete score-driven contrasts from our tests: faithfulness 5 vs 3 (Haiku vs DeepSeek) and structured_output 4 vs 5. Both models tie at 3/5 on the constrained_rewriting task itself, so these secondary metrics and the cost/context specifics determine which model is more practical for your scenario. A rough cost comparison follows.
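To put the batching economics in numbers, the arithmetic below applies the per-MTok prices quoted above to a hypothetical batch; the per-rewrite token counts are assumptions for illustration, not measurements.

PRICES = {  # (input $/MTok, output $/MTok) from the pricing section above
    "Claude Haiku 4.5": (1.00, 5.00),
    "DeepSeek V3.1 Terminus": (0.21, 0.79),
}

def batch_cost(model: str, n_rewrites: int, in_tokens: int, out_tokens: int) -> float:
    """Estimated USD cost for n_rewrites calls of in_tokens input / out_tokens output each."""
    in_price, out_price = PRICES[model]
    return n_rewrites * (in_tokens * in_price + out_tokens * out_price) / 1_000_000

# 10,000 blurbs at an assumed ~800 input tokens and ~60 output tokens each:
for name in PRICES:
    print(f"{name}: ${batch_cost(name, 10_000, 800, 60):.2f}")
# Claude Haiku 4.5: $11.00
# DeepSeek V3.1 Terminus: $2.15

At these assumed sizes the gap is roughly 5x, which is what makes DeepSeek attractive for large, strictly formatted batches.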
Bottom Line
For Constrained Rewriting, choose Claude Haiku 4.5 if you must preserve source facts across long inputs, need multimodal (image-to-text) rewriting, or prioritize accuracy (faithfulness 5 vs 3). Choose DeepSeek V3.1 Terminus if you require strict schema/length compliance and low per-call cost at scale (structured_output 5 vs 4; $0.21/$0.79 vs $1.00/$5.00 per MTok). Both score 3/5 on the task and rank identically in our tests, so select based on faithfulness and context needs (Haiku) versus structured-output enforcement and price (DeepSeek).
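Whichever model you pick, strict schema and length compliance is easiest to guarantee with a post-hoc check on each response. A minimal validator sketch, with assumed field names and limits:

import json

FIELD_LIMITS = {"title": 60, "blurb": 140}  # assumed schema: max characters per field

def validate_rewrite(raw_json: str) -> dict:
    """Parse model output and reject it if any field is missing or over its limit."""
    data = json.loads(raw_json)  # raises json.JSONDecodeError on malformed output
    for field, limit in FIELD_LIMITS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if len(str(data[field])) > limit:
            raise ValueError(f"{field} is {len(str(data[field]))} chars, limit {limit}")
    return data

# Typical use: validate each response and re-request only the rewrites that fail.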
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.