models/mistral/mistral-small-2603
Mistral·active

Mistral Small 4

Mistral's efficiency model. Context window: 262K tokens.

Overall score: 3.77 / 5.00 · ranked #66
Input: $0.150 per 1M tokens
Output: $0.600 per 1M tokens
Context: 262K tokens
Blended: $0.487 per 1M tokens (3:1 out:in ratio)


modelpicker.ai · powered by live benchmark data

Scores by test

Structured Output: 5.0
Strategic Analysis: 4.0
Constrained Rewriting: 3.0
Creative Problem Solving: 4.0
Tool Calling: 4.0
Faithfulness: 4.0
Classification: 2.0
Long Context: 4.0
Safety Calibration: 2.0
Persona Consistency: 5.0
Agentic Planning: 4.0
Multilingual: 5.0
Tabular Data: 3.0
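As a quick sanity check on the numbers above, the unweighted mean of the 13 per-test scores reproduces the listed overall score of 3.77. The sketch below assumes a plain average; the page does not state how the overall score is actually computed, and the function name `overall_score` is only illustrative.

```python
from statistics import mean

# Per-test scores as listed on the page (each out of 5.0).
scores = {
    "Structured Output": 5.0,
    "Strategic Analysis": 4.0,
    "Constrained Rewriting": 3.0,
    "Creative Problem Solving": 4.0,
    "Tool Calling": 4.0,
    "Faithfulness": 4.0,
    "Classification": 2.0,
    "Long Context": 4.0,
    "Safety Calibration": 2.0,
    "Persona Consistency": 5.0,
    "Agentic Planning": 4.0,
    "Multilingual": 5.0,
    "Tabular Data": 3.0,
}


def overall_score(per_test: dict[str, float]) -> float:
    """Unweighted mean of the per-test scores, rounded to two decimals.

    Assumption: every test carries equal weight.
    """
    return round(mean(per_test.values()), 2)


print(overall_score(scores))  # 3.77
```

If the site weights some tests more heavily, the result would differ; the match here only shows that an equal-weight average is consistent with the published figure.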

What you need to know

Mistral Small 4 is optimized for high-precision formatting and multilingual deployment. It achieves perfect scores in structured output, multilingual capabilities, and persona consistency, making it a reliable choice for applications requiring strict schema adherence or consistent brand voice across different languages.

The model offers a 262K-token context window at a low price: $0.150 per 1M input tokens and $0.600 per 1M output tokens, for a blended cost of $0.487/MTok at a 3:1 output-to-input ratio. This makes it a cost-effective option for processing large documents, though its overall performance rank (#52 of 71) suggests it is a utility model rather than a frontier-class reasoning engine.
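The blended figure can be reproduced from the listed input and output prices. The sketch below assumes the blend is a token-weighted average with three output tokens for every input token, which yields $0.4875 and is consistent with the $0.487 shown on the page. The helper names and the example request size (262,000 input tokens, 1,000 output tokens) are illustrative assumptions, not figures from the page.

```python
INPUT_PRICE = 0.150   # USD per 1M input tokens (from the page)
OUTPUT_PRICE = 0.600  # USD per 1M output tokens (from the page)


def blended_price(input_price: float, output_price: float,
                  out_to_in: float = 3.0) -> float:
    """Weighted average price per 1M tokens.

    Assumption: `out_to_in` output tokens are generated per input token.
    """
    return (input_price + out_to_in * output_price) / (1.0 + out_to_in)


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float = INPUT_PRICE,
                 output_price: float = OUTPUT_PRICE) -> float:
    """Cost in USD of one request, given per-1M-token prices."""
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000


print(f"{blended_price(INPUT_PRICE, OUTPUT_PRICE):.4f}")  # 0.4875

# Hypothetical request: a prompt near the context limit, short answer.
print(f"{request_cost(262_000, 1_000):.4f}")  # 0.0399
```

Note that long-document workloads are input-heavy, so their effective per-token cost sits closer to the input price than to the 3:1 blended figure.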

Performance is uneven across tasks. The model handles agentic planning and strategic analysis well (4.0 each) but scores only 2.0 on both classification and safety calibration. Developers should expect weak results when using it for sentiment analysis, labeling tasks, or environments requiring strict safety guardrails.

Use this model if you need an affordable, long-context engine for generating structured data or maintaining a specific persona in multiple languages. Skip this model if your primary use case is data classification or if you require high safety calibration.

Strengths — Top 3

Structured Output: 5.0/5.0
Persona Consistency: 5.0/5.0
Multilingual: 5.0/5.0

Relative weaknesses — Bottom 3

Classification: 2.0/5.0
Safety Calibration: 2.0/5.0
Constrained Rewriting: 3.0/5.0
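The two lists above can be derived from the per-test scores with a stable sort. This is a minimal sketch, assuming ties are broken by the order in which tests appear in the scores list; the page does not document its tie-breaking rule, and the function names are illustrative.

```python
# Per-test scores in the order the page lists them (each out of 5.0).
scores = {
    "Structured Output": 5.0,
    "Strategic Analysis": 4.0,
    "Constrained Rewriting": 3.0,
    "Creative Problem Solving": 4.0,
    "Tool Calling": 4.0,
    "Faithfulness": 4.0,
    "Classification": 2.0,
    "Long Context": 4.0,
    "Safety Calibration": 2.0,
    "Persona Consistency": 5.0,
    "Agentic Planning": 4.0,
    "Multilingual": 5.0,
    "Tabular Data": 3.0,
}


def top_n(per_test: dict[str, float], n: int = 3) -> list[str]:
    """Highest-scoring tests; ties keep their original listing order."""
    ranked = sorted(per_test.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def bottom_n(per_test: dict[str, float], n: int = 3) -> list[str]:
    """Lowest-scoring tests; ties keep their original listing order."""
    ranked = sorted(per_test.items(), key=lambda kv: kv[1])
    return [name for name, _ in ranked[:n]]


print(top_n(scores))
# ['Structured Output', 'Persona Consistency', 'Multilingual']
print(bottom_n(scores))
# ['Classification', 'Safety Calibration', 'Constrained Rewriting']
```

Tabular Data also scores 3.0, so the third "weakness" slot is a tie with Constrained Rewriting; under this ordering the earlier-listed test is the one shown.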

Similar models

Qwen: Qwen3 235B A22B Instruct 2507 · $0.093 · 4.08
GPT-4.1 Mini · $1.30 · 3.92
OpenAI: gpt-oss-20b · $0.113 · 3.54
Grok 4.3 · $2.19 · 4.15