Claude Haiku 4.5 vs Codestral 2508 for Safety Calibration
Winner: Claude Haiku 4.5. In our safety_calibration test, Claude Haiku 4.5 scores 2/5 versus Codestral 2508's 1/5, placing Haiku 4.5 at rank 12 of 52 and Codestral at rank 31 of 52. That 1-point lead reflects measurably stronger refusal behavior, better classification (4 vs 3), and higher persona consistency (5 vs 3) in our testing, all traits that reduce risky permissiveness. Codestral 2508 is weaker on safety calibration but wins on structured_output (5 vs 4) and is much cheaper ($0.90 vs $5.00 per MTok output), so it may still fit constrained budgets or workflows that prioritize strict schema adherence over conservative refusals.
Claude Haiku 4.5 (Anthropic)
Pricing: $1.00/MTok input, $5.00/MTok output
Codestral 2508 (Mistral)
Pricing: $0.30/MTok input, $0.90/MTok output
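To make the pricing gap concrete, here is a minimal cost-estimate sketch using the per-MTok prices listed above. The token volumes in the example are illustrative assumptions, not benchmark data.

```python
# Estimated spend from the listed per-MTok prices.
# Volumes below (100 MTok in, 20 MTok out) are hypothetical.

PRICES = {  # model: (input $/MTok, output $/MTok)
    "Claude Haiku 4.5": (1.00, 5.00),
    "Codestral 2508": (0.30, 0.90),
}

def monthly_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Return estimated monthly spend in dollars for the given token volumes."""
    in_price, out_price = PRICES[model]
    return input_mtok * in_price + output_mtok * out_price

for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 100, 20):,.2f}")
# Claude Haiku 4.5 comes to $200.00; Codestral 2508 comes to $48.00
```

At these assumed volumes the roughly 4x price gap dominates, which is why Scenario C below favors Codestral for bulk screening.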
Task Analysis
What Safety Calibration demands: per our benchmark description, safety calibration is about refusing harmful requests while allowing legitimate ones. The key capabilities that drive it are accurate classification and intent routing, consistent persona and instruction-following to resist jailbreaks, faithfulness to source context, and the ability to produce clear, structured refusals when required. In our testing, Claude Haiku 4.5 scored 2/5 on safety_calibration and ranks 12/52; Codestral 2508 scored 1/5 and ranks 31/52. The supporting metrics explain the gap: Claude has higher classification (4 vs 3) and persona_consistency (5 vs 3) scores, which help it detect and consistently refuse disallowed prompts. Both models tie on tool_calling (5/5), so tool orchestration is not the differentiator here; Codestral's advantage is structured_output (5 vs 4), which helps it produce schema-compliant refusal messages. No external benchmark data is available for this comparison, so these internal scores are our primary evidence.
Practical Examples
Scenario A — Moderation gateway for user-submitted instructions: Claude Haiku 4.5 (safety 2 vs 1) is more likely in our tests to refuse subtly harmful prompts and to correctly classify borderline cases (classification 4 vs 3). Choose Haiku when false positives are acceptable but false negatives (letting harmful content through) are not.

Scenario B — Automated API that must return a strict JSON refusal object to downstream systems: Codestral 2508 excels at structured output (5 vs 4), so it will more reliably produce schema-compliant refusal payloads even though its overall refusal policy is weaker.

Scenario C — Cost-sensitive bulk filtering: Codestral 2508 is far cheaper ($0.90/MTok output vs Claude Haiku 4.5 at $5.00/MTok). For high-volume, low-risk content screening where occasional permissive answers are tolerable, Codestral may be preferred.

Quantified differences used: safety_calibration 2 vs 1 (Claude Haiku 4.5 vs Codestral 2508), classification 4 vs 3, persona_consistency 5 vs 3, structured_output 4 vs 5, and ranks 12 vs 31 out of 52.
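Whichever model you pick for Scenario B, a downstream gate should still validate the refusal object rather than trust the model's formatting. Below is a minimal validation sketch; the field names ("decision", "reason_code", "message") are hypothetical, not part of either model's API.

```python
import json

# Hypothetical strict refusal schema: every field is required and typed,
# and "decision" is restricted to a closed set of values.
REQUIRED_FIELDS = {"decision": str, "reason_code": str, "message": str}
ALLOWED_DECISIONS = {"allow", "refuse"}

def validate_refusal(payload: str) -> dict:
    """Parse a model response and enforce the refusal schema, raising on any deviation."""
    obj = json.loads(payload)  # raises on malformed JSON
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(obj.get(field), ftype):
            raise ValueError(f"missing or mistyped field: {field}")
    if obj["decision"] not in ALLOWED_DECISIONS:
        raise ValueError(f"invalid decision: {obj['decision']}")
    return obj

ok = validate_refusal(
    '{"decision": "refuse", "reason_code": "policy_violation", '
    '"message": "This request cannot be completed."}'
)
```

A gate like this converts the structured_output difference into an operational metric: the rate of responses rejected by the validator.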
Bottom Line
For Safety Calibration, choose Claude Haiku 4.5 if you need stronger, more consistent refusals, better intent classification, and tighter persona consistency for moderation or compliance. Choose Codestral 2508 if you prioritize lower cost and stricter schema/structured-output generation and can accept weaker refusal behavior.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.