Ministral 3 14B 2512 vs Mistral Large 3 2512

No clear performance sweep: Ministral 3 14B 2512 and Mistral Large 3 2512 split our 12-test suite (4 wins each, 4 ties). For most production use cases where cost matters, pick Ministral 3 14B 2512: it matches or outperforms on creative problem solving, constrained rewriting, classification, and persona consistency while charging $0.20/$0.20 per MTok. Choose Mistral Large 3 2512 if you need top-tier structured output, multilingual fidelity, agentic planning, or faithfulness and can justify the roughly 5× higher cost.

Mistral

Ministral 3 14B 2512

Overall
3.75/5 (Strong)

Benchmark Scores

Faithfulness
4/5
Long Context
4/5
Multilingual
4/5
Tool Calling
4/5
Classification
4/5
Agentic Planning
3/5
Structured Output
4/5
Safety Calibration
1/5
Strategic Analysis
4/5
Persona Consistency
5/5
Constrained Rewriting
4/5
Creative Problem Solving
4/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
N/A
AIME 2025
N/A

Pricing

Input

$0.200/MTok

Output

$0.200/MTok

Context Window: 262K

modelpicker.net

Mistral

Mistral Large 3 2512

Overall
3.67/5 (Strong)

Benchmark Scores

Faithfulness
5/5
Long Context
4/5
Multilingual
5/5
Tool Calling
4/5
Classification
3/5
Agentic Planning
4/5
Structured Output
5/5
Safety Calibration
1/5
Strategic Analysis
4/5
Persona Consistency
3/5
Constrained Rewriting
3/5
Creative Problem Solving
3/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
N/A
AIME 2025
N/A

Pricing

Input

$0.500/MTok

Output

$1.50/MTok

Context Window: 262K


Benchmark Analysis

Overview: Across our 12-test suite the two models split wins 4–4, with 4 ties.

Detailed walk-through:
- Constrained rewriting: Ministral 3 14B 2512 4/5 vs Mistral Large 3 2512 3/5. Ministral wins; it ranks 6 of 53 (25 models share this score), so it is relatively strong at tight compression and strict character-limited rewrites.
- Creative problem solving: Ministral 4/5 vs Mistral Large 3/5. Ministral wins and ranks 9 of 54 (21 models share this score), meaning better non-obvious, feasible idea generation.
- Classification: Ministral 4/5 vs Mistral Large 3/5. Ministral wins; its classification score ties for 1st with many models, indicating reliable routing and labeling in our tests.
- Persona consistency: Ministral 5/5 vs Mistral Large 3/5. Ministral wins and is tied for 1st, so it maintains character and resists injection in our runs.
- Structured output: Mistral Large 5/5 vs Ministral 4/5. Mistral Large wins and is tied for 1st (24 other models share this score), making it the safer pick for strict JSON/schema compliance and format adherence.
- Faithfulness: Mistral Large 5/5 vs Ministral 4/5. Mistral Large wins and ties for 1st, so it sticks closer to source material in our tests.
- Agentic planning: Mistral Large 4/5 vs Ministral 3/5. Mistral Large wins (rank 16 of 54, tied with 25 others), showing better goal decomposition and recovery.
- Multilingual: Mistral Large 5/5 vs Ministral 4/5. Mistral Large wins and ties for 1st, so non-English tasks favored it in our evaluation.
- Ties (no clear winner): strategic analysis 4/5 each (both rank 27 of 54), tool calling 4/5 each (both rank 18 of 54), long context 4/5 each (both rank 38 of 55), safety calibration 1/5 each (both rank 32 of 55).

Practical interpretation: Ministral 3 14B 2512 performs better for creative ideation, tight rewrites, classification, and persona-heavy chat at a much lower cost. Mistral Large 3 2512 is the better choice when strict schema output, multilingual equivalence, faithfulness to sources, or multi-step agentic planning is the priority, at substantially higher per-token cost.

Benchmark | Ministral 3 14B 2512 | Mistral Large 3 2512
Faithfulness | 4/5 | 5/5
Long Context | 4/5 | 4/5
Multilingual | 4/5 | 5/5
Tool Calling | 4/5 | 4/5
Classification | 4/5 | 3/5
Agentic Planning | 3/5 | 4/5
Structured Output | 4/5 | 5/5
Safety Calibration | 1/5 | 1/5
Strategic Analysis | 4/5 | 4/5
Persona Consistency | 5/5 | 3/5
Constrained Rewriting | 4/5 | 3/5
Creative Problem Solving | 4/5 | 3/5
Summary | 4 wins | 4 wins
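The 4–4–4 split above can be verified directly from the score table. A minimal sketch (scores copied from this comparison; variable names are illustrative):

```python
# Tally head-to-head results from the benchmark table above.
scores = {  # benchmark: (Ministral 3 14B 2512, Mistral Large 3 2512)
    "Faithfulness": (4, 5), "Long Context": (4, 4), "Multilingual": (4, 5),
    "Tool Calling": (4, 4), "Classification": (4, 3), "Agentic Planning": (3, 4),
    "Structured Output": (4, 5), "Safety Calibration": (1, 1),
    "Strategic Analysis": (4, 4), "Persona Consistency": (5, 3),
    "Constrained Rewriting": (4, 3), "Creative Problem Solving": (4, 3),
}

ministral_wins = sum(a > b for a, b in scores.values())
large_wins = sum(b > a for a, b in scores.values())
ties = sum(a == b for a, b in scores.values())

print(ministral_wins, large_wins, ties)  # → 4 4 4
```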

Pricing Analysis

Per-MTok pricing (input/output): Ministral 3 14B 2512 = $0.20/$0.20; Mistral Large 3 2512 = $0.50/$1.50. Assuming a 50/50 split of input/output tokens, costs at different volumes are:
- 1M tokens: Ministral = $0.20; Mistral Large = $1.00.
- 10M tokens: Ministral = $2.00; Mistral Large = $10.00.
- 100M tokens: Ministral = $20.00; Mistral Large = $100.00.
The Mistral Large bill is 5× larger at any volume. Teams with heavy inference (chatbots, API products, scale deployments) should care about this gap; small pilots, or tasks that require Mistral Large's specific strengths, may justify the higher spend.
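The cost math above can be sketched in a few lines. Rates come from this comparison; the 50/50 input/output split is this page's assumption, and the helper name is illustrative:

```python
# Per-MTok rates from this comparison: (input, output) in dollars.
RATES = {
    "Ministral 3 14B 2512": (0.20, 0.20),
    "Mistral Large 3 2512": (0.50, 1.50),
}

def blended_cost(model: str, total_mtok: float, input_share: float = 0.5) -> float:
    """Dollar cost for total_mtok million tokens, split between input and output."""
    in_rate, out_rate = RATES[model]
    return total_mtok * (input_share * in_rate + (1 - input_share) * out_rate)

for mtok in (1, 10, 100):
    small = blended_cost("Ministral 3 14B 2512", mtok)
    large = blended_cost("Mistral Large 3 2512", mtok)
    print(f"{mtok}M tokens: Ministral ${small:.2f} vs Mistral Large ${large:.2f}")
```

Changing `input_share` models input-heavy workloads (e.g. long-document summarization), where the gap narrows because Mistral Large's input rate is only 2.5× higher than its output rate is.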

Real-World Cost Comparison

Task | Ministral 3 14B 2512 | Mistral Large 3 2512
Chat response | <$0.001 | <$0.001
Blog post | <$0.001 | $0.0033
Document batch | $0.014 | $0.085
Pipeline run | $0.140 | $0.850

Bottom Line

Choose Ministral 3 14B 2512 if:
- You need cost-efficient inference at scale ($0.20/$0.20 per MTok) and strong creative problem solving, constrained rewriting, classification, or persona consistency.
- You run high-volume chatbots, content generation, or interactive apps where token costs dominate.

Choose Mistral Large 3 2512 if:
- Your product requires strict structured outputs (JSON/schema), multilingual parity, or maximum faithfulness and agentic planning, and you can absorb the higher costs ($0.50 input, $1.50 output per MTok).
- You prioritize output precision and format compliance over price for lower-volume, mission-critical workflows.

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.

Frequently Asked Questions