Question 1

How large is the performance gap on Persona Consistency?

Accepted Answer

In our testing Claude Haiku 4.5 scores 5 while Devstral Medium scores 3 — a 2-point gap. Haiku ranks 1 of 52 on this task; Devstral ranks 45 of 52.

Question 2

Which internal capabilities explain Haiku's advantage?

Accepted Answer

Haiku's advantage maps to higher long_context (5 vs 4), tool_calling (5 vs 3), and faithfulness (5 vs 4) in our benchmarks — capabilities directly tied to maintaining character and resisting injection.

Question 3

Is Devstral ever the better choice for persona workloads?

Accepted Answer

Yes — if cost per output token is the overriding constraint. Devstral's output cost is 2 per mTok vs Haiku's 5 per mTok. For short-turn, lower-risk personas where occasional drift is acceptable, Devstral may be more cost-effective.

Question 4

Do either model support structured outputs to enforce persona format?

Accepted Answer

Both models list structured_outputs in their supported parameters. In our tests Haiku also shows higher faithfulness, which reduces off-persona content in structured schemas.

Question 5

How does context window affect Persona Consistency here?

Accepted Answer

Haiku has a larger context_window (200,000 tokens) versus Devstral's 131,072 tokens in the payload. Combined with Haiku's long_context score of 5, that larger window supports maintaining persona across longer conversations in our tests.

Claude Haiku 4.5 vs Devstral Medium for Persona Consistency

Claude Haiku 4.5

Devstral Medium

Task Analysis

Practical Examples

Bottom Line

How We Test

Frequently Asked Questions