Ministral 3 14B 2512 vs Mistral Small 3.2 24B

Pick Ministral 3 14B 2512 for tasks that need classification, creative problem solving, or strong persona consistency; it wins 4 of 12 benchmarks in our testing. Mistral Small 3.2 24B is preferable for agentic planning (the one test it wins) and has a lower input-token price, making it the cost-conscious choice for input-heavy workloads.

mistral

Ministral 3 14B 2512

Overall
3.75/5 (Strong)

Benchmark Scores

Faithfulness
4/5
Long Context
4/5
Multilingual
4/5
Tool Calling
4/5
Classification
4/5
Agentic Planning
3/5
Structured Output
4/5
Safety Calibration
1/5
Strategic Analysis
4/5
Persona Consistency
5/5
Constrained Rewriting
4/5
Creative Problem Solving
4/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
N/A
AIME 2025
N/A

Pricing

Input

$0.200/MTok

Output

$0.200/MTok

Context Window: 262K

modelpicker.net

mistral

Mistral Small 3.2 24B

Overall
3.25/5 (Usable)

Benchmark Scores

Faithfulness
4/5
Long Context
4/5
Multilingual
4/5
Tool Calling
4/5
Classification
3/5
Agentic Planning
4/5
Structured Output
4/5
Safety Calibration
1/5
Strategic Analysis
2/5
Persona Consistency
3/5
Constrained Rewriting
4/5
Creative Problem Solving
2/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
N/A
AIME 2025
N/A

Pricing

Input

$0.075/MTok

Output

$0.200/MTok

Context Window: 128K


Benchmark Analysis

Across our 12-test suite (scores 1–5), Ministral 3 14B 2512 wins 4 tests, Mistral Small 3.2 24B wins 1, and 7 tests tie. Detailed comparison (our scores):

  • Strategic analysis: Ministral 3 14B 2512 4 vs Mistral Small 3.2 24B 2 — in our testing Ministral provides stronger nuanced tradeoff reasoning (ranks 27 of 54). Expect better numeric tradeoff answers and stepwise reasoning.
  • Creative problem solving: Ministral 4 vs Mistral Small 2 — Ministral ranks 9 of 54 on this task in our testing, so it produces more specific, feasible ideas for brainstorming and product concepts.
  • Classification: Ministral 4 vs Mistral Small 3 — Ministral ties for 1st on classification in our testing (tied with 29 others), so it’s better at routing and label accuracy.
  • Persona consistency: Ministral 5 vs Mistral Small 3 — Ministral is tied for 1st on persona consistency in our testing (tied with 36 others), useful when maintaining character or resisting prompt injection.
  • Agentic planning: Ministral 3 vs Mistral Small 4 — Mistral Small 3.2 24B wins here and ranks 16 of 54 on agentic planning in our testing, so it’s stronger at goal decomposition and recovery strategies.
  • Ties (structured output 4/4, constrained rewriting 4/4, tool calling 4/4, faithfulness 4/4, long context 4/4, safety calibration 1/1, multilingual 4/4): both models perform equivalently on JSON/schema adherence, constrained rewriting, faithfulness, 30k+ context retrieval, and multilingual output in our testing. Tool calling ranks the same for both (18 of 54), indicating similar function selection and argument accuracy. Safety calibration is low (1/5) for both in our testing, so expect similarly over-conservative refusal behavior.

Overall interpretation: Ministral 3 14B 2512 is the better choice for classification-heavy, creative, and persona-dependent applications; Mistral Small 3.2 24B has a measurable edge in agentic planning and is cheaper on input tokens.
Benchmark                  | Ministral 3 14B 2512 | Mistral Small 3.2 24B
Faithfulness               | 4/5                  | 4/5
Long Context               | 4/5                  | 4/5
Multilingual               | 4/5                  | 4/5
Tool Calling               | 4/5                  | 4/5
Classification             | 4/5                  | 3/5
Agentic Planning           | 3/5                  | 4/5
Structured Output          | 4/5                  | 4/5
Safety Calibration         | 1/5                  | 1/5
Strategic Analysis         | 4/5                  | 2/5
Persona Consistency        | 5/5                  | 3/5
Constrained Rewriting      | 4/5                  | 4/5
Creative Problem Solving   | 4/5                  | 2/5
Summary                    | 4 wins               | 1 win
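The win/tie tally can be reproduced directly from the per-benchmark scores above; a minimal sketch in Python (scores copied from our comparison table):

```python
# Per-benchmark scores on a 1-5 scale:
# (Ministral 3 14B 2512, Mistral Small 3.2 24B)
SCORES = {
    "Faithfulness": (4, 4),
    "Long Context": (4, 4),
    "Multilingual": (4, 4),
    "Tool Calling": (4, 4),
    "Classification": (4, 3),
    "Agentic Planning": (3, 4),
    "Structured Output": (4, 4),
    "Safety Calibration": (1, 1),
    "Strategic Analysis": (4, 2),
    "Persona Consistency": (5, 3),
    "Constrained Rewriting": (4, 4),
    "Creative Problem Solving": (4, 2),
}

def tally(scores):
    """Count each model's outright wins and the number of ties."""
    a_wins = sum(1 for a, b in scores.values() if a > b)
    b_wins = sum(1 for a, b in scores.values() if a < b)
    ties = sum(1 for a, b in scores.values() if a == b)
    return a_wins, b_wins, ties

print(tally(SCORES))  # (4, 1, 7): Ministral wins 4, Mistral Small wins 1, 7 ties
```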

Pricing Analysis

Costs per million tokens (MTok): Ministral 3 14B 2512 charges $0.200 input / $0.200 output; Mistral Small 3.2 24B charges $0.075 input / $0.200 output. Assuming a 50/50 input/output token split: 10M total tokens => Ministral 3 14B 2512 = $2.00, Mistral Small 3.2 24B = $1.375 (savings $0.625). At 100M tokens (50/50) Ministral = $20.00, Mistral Small = $13.75 (savings $6.25). At 1B tokens (50/50) Ministral = $200.00, Mistral Small = $137.50 (savings $62.50). Who should care: teams running hundreds of millions of tokens per month (chat operators, high-volume APIs, indexing pipelines), since the input-cost difference scales linearly and becomes meaningful at high volume. For low-volume or output-heavy workloads the difference narrows, because both models share the same $0.200/MTok output price.
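Cost at scale follows directly from the per-MTok prices on the pricing cards above; a minimal sketch of the arithmetic:

```python
# Prices in USD per million tokens (MTok), from the pricing cards above.
PRICES = {
    "Ministral 3 14B 2512": {"input": 0.200, "output": 0.200},
    "Mistral Small 3.2 24B": {"input": 0.075, "output": 0.200},
}

def cost_usd(model, input_tokens, output_tokens):
    """Total API cost in USD for a given token volume."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# 1B total tokens, split 50/50 between input and output:
half = 500_000_000
ministral = cost_usd("Ministral 3 14B 2512", half, half)  # 200.0
small = cost_usd("Mistral Small 3.2 24B", half, half)     # 137.5
print(f"savings: ${ministral - small:.2f}")               # savings: $62.50
```

Shifting the split toward output shrinks the gap, since the two models share the same output price.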

Real-World Cost Comparison

Task           | Ministral 3 14B 2512 | Mistral Small 3.2 24B
Chat response  | <$0.001              | <$0.001
Blog post      | <$0.001              | <$0.001
Document batch | $0.014               | $0.011
Pipeline run   | $0.140               | $0.115

Bottom Line

Choose Ministral 3 14B 2512 if you need stronger classification, creative idea generation, or strict persona consistency (it wins those tests in our suite). Choose Mistral Small 3.2 24B if agentic planning is key or you run high-volume, input-heavy workloads: it wins the agentic planning test, and its $0.075/MTok input price materially reduces cost at scale.
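If you route between these two models programmatically, the guidance above reduces to a simple rule. A hypothetical sketch (the priority labels and `pick_model` helper are our own illustration, not a modelpicker.net or Mistral API):

```python
def pick_model(priority: str) -> str:
    """Map a workload priority to the recommended model, per the analysis above.

    `priority` is a hypothetical label for the workload's dominant need.
    """
    ministral_strengths = {
        "classification", "creative_problem_solving",
        "persona_consistency", "strategic_analysis",
    }
    small_strengths = {"agentic_planning", "input_heavy_cost"}
    if priority in ministral_strengths:
        return "Ministral 3 14B 2512"
    if priority in small_strengths:
        return "Mistral Small 3.2 24B"
    # Tied benchmarks (tool calling, structured output, faithfulness, ...):
    # default to the model with the cheaper input price.
    return "Mistral Small 3.2 24B"

print(pick_model("classification"))    # Ministral 3 14B 2512
print(pick_model("agentic_planning"))  # Mistral Small 3.2 24B
```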

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.

Frequently Asked Questions