Mistral Medium 3.1 vs Mistral Small 3.2 24B
In our testing, Mistral Medium 3.1 is the better choice for production use that prioritizes accuracy, long-context retrieval, multilingual output, and complex planning: it wins 9 of our 12 benchmarks. Mistral Small 3.2 24B wins none of them but is ~10x cheaper, making it the right pick for high-volume, cost-sensitive deployments or inexpensive prototyping.
Mistral Medium 3.1 (Mistral)
Pricing: input $0.400/MTok, output $2.00/MTok

Mistral Small 3.2 24B (Mistral)
Pricing: input $0.075/MTok, output $0.200/MTok
Benchmark Analysis
Summary of head-to-head results in our 12-test suite (scores on a 1–5 scale; ranks use the pool sizes tested for each benchmark):
- Strategic analysis: Medium 3.1 scores 5 vs Small 3.2's 2. Medium tied for 1st (with 25 others out of 54 tested); Small ranks 44/54. This matters for nuanced tradeoff reasoning with numbers.
- Constrained rewriting: 5 vs 4. Medium tied for 1st; Small ranks 6/53. Medium is better at tight-format compression.
- Creative problem solving: 3 vs 2. Medium ranks 30/54; Small ranks 47/54. Medium generates more feasible, non-obvious ideas.
- Classification: 4 vs 3. Medium tied for 1st (with 29 others out of 53); Small ranks 31/53. Medium is stronger for accurate routing and tagging.
- Long context: 5 vs 4. Medium tied for 1st; Small ranks 38/55. Medium is preferable for 30K+ token retrieval tasks.
- Safety calibration: 2 vs 1. Medium ranks 12/55; Small ranks 32/55. Medium is more reliable at refusing harmful requests while permitting legitimate ones.
- Persona consistency: 5 vs 3. Medium tied for 1st; Small ranks 45/53. Medium better resists prompt injection and stays in character.
- Agentic planning: 5 vs 4. Medium tied for 1st; Small ranks 16/54. Medium shows stronger goal decomposition and failure recovery.
- Multilingual: 5 vs 4. Medium tied for 1st (with 34 others); Small ranks 36/55. Medium delivers higher-quality non-English output.
- Ties (no clear winner): Structured output 4 vs 4 (both rank 26/54), with equal JSON/schema compliance; Tool calling 4 vs 4 (both rank 18/54), with similar function selection and sequencing; Faithfulness 4 vs 4 (both rank 34/55), with both sticking to their sources equally well.

Overall: Medium wins 9 tests, Small wins none, and 3 end in ties. In practice, Medium 3.1 consistently outperforms Small 3.2 24B on high-stakes accuracy, long-context retrieval, multilingual support, planning, and structured or compressed outputs; Small matches Medium on function calling, format adherence, and faithfulness but falls behind on classification and strategic reasoning.
Pricing Analysis
Costs per model, summing the input and output rates (1 MTok = 1 million tokens): Mistral Medium 3.1 = $0.40 + $2.00 = $2.40 per MTok; Mistral Small 3.2 24B = $0.075 + $0.20 = $0.275 per MTok. Illustrative monthly spend at the combined rate (a rough upper bound; actual cost depends on your input/output split):
- 1M tokens/month: Medium 3.1 ≈ $2.40; Small 3.2 24B ≈ $0.28.
- 10M tokens/month: Medium ≈ $24; Small ≈ $2.75.
- 100M tokens/month: Medium ≈ $240; Small ≈ $27.50.
Per token, Medium is roughly 9x more expensive at the combined rate (10x on output tokens, about 5x on input), in line with the ~10x figure quoted above. Who should care: teams with heavy, sustained traffic (10M–100M tokens/month) will find the gap material because it compounds with volume; startups and cost-sensitive apps should prefer Mistral Small 3.2 24B unless Medium's accuracy advantages justify the extra spend.
Real-World Cost Comparison
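The figures above use a single combined rate, but real spend depends on your input/output mix. Below is a minimal sketch of the per-model calculation; the per-MTok rates come from the pricing listed on this page, while the model keys and the 8M-input/2M-output workload are hypothetical examples:

```python
# Estimate monthly spend from the per-MTok rates listed on this page.
# 1 MTok = 1,000,000 tokens; model keys and workload are illustrative only.

RATES_PER_MTOK = {  # model: (input $/MTok, output $/MTok)
    "mistral-medium-3.1": (0.400, 2.00),
    "mistral-small-3.2-24b": (0.075, 0.200),
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return estimated monthly spend in dollars for the given token volumes."""
    in_rate, out_rate = RATES_PER_MTOK[model]
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

# Hypothetical retrieval-heavy workload: 8M input + 2M output tokens per month.
for model in RATES_PER_MTOK:
    print(f"{model}: ${monthly_cost(model, 8_000_000, 2_000_000):,.2f}")
# -> mistral-medium-3.1: $7.20
# -> mistral-small-3.2-24b: $1.00
```

With this input-heavy mix the effective gap is about 7x rather than 10x, because the input-rate ratio (~5x) pulls the blend below the 10x output-rate ratio.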
Bottom Line
Choose Mistral Medium 3.1 if you need higher accuracy for multilingual support, long-context retrieval, strategic reasoning, agentic planning, constrained rewriting, or production classification: our tests show Medium winning 9 of 12 benchmarks and ranking at or near the top in those areas. Choose Mistral Small 3.2 24B if your priority is cost-efficiency at scale or cheap experimentation: it is roughly 10x cheaper per token (≈ $0.28 vs $2.40 per million tokens at the combined rate) and ties Medium on tool calling, structured output, and faithfulness.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.