mistral

Mistral Small 4

Mistral Small 4 (mistralai/mistral-small-2603) is the next major release in Mistral's Small family, described as unifying capabilities from several flagship Mistral models. It targets developers and product teams who need strong schema compliance, consistent persona behavior, and multilingual quality in a compact, high-context model (262,144-token window, text+image→text). In our testing it sits among high-capability small-family models and competes with bracket peers such as Claude Sonnet 4.6 and GPT-5.4, offering a different tradeoff: very strong structured and multilingual outputs at a substantially lower listed output cost ($0.60 per mTok) than some high-end peers.

Performance

In our 12-test suite, Mistral Small 4 shows clear strengths and some weaknesses:

  • Structured output — 5/5, tied for 1st (with 24 other models of 54 tested): excellent JSON/schema compliance.
  • Multilingual — 5/5, tied for 1st (with 34 other models of 55): a top choice for non-English parity.
  • Persona consistency — 5/5, tied for 1st (with 36 other models of 53): useful for character-driven or agentic interfaces.
  • Solid areas: creative problem solving 4/5 (rank 9 of 54) and tool calling 4/5 (rank 18 of 54).
  • Weaknesses: classification 2/5 (rank 51 of 53, near the bottom) and safety calibration 2/5 (rank listed as 12 of 55, with many models sharing that score). Both indicate the model underperforms on categorical routing and scores low on safety calibration in our tests.
  • Long context — 4/5 (rank 38 of 55): the context window is huge (262,144 tokens), but retrieval accuracy at extreme lengths is solid rather than uniquely top-ranked.

Overall, Mistral Small 4 ranks 35 of 52 in our dataset.

Pricing

Mistral Small 4 is priced at $0.15 per mTok (million tokens) for input and $0.60 per mTok for output, per the model entry. What that means in practice, with examples expressed in mTok units:

  • Small prompt + short reply (5 mTok input, 10 mTok output): input $0.75 + output $6.00 = $6.75 total.
  • Medium conversation (20 mTok input, 50 mTok output): input $3.00 + output $30.00 = $33.00 total.
  • Very large generation (100 mTok input, 200 mTok output): input $15.00 + output $120.00 = $135.00 total.

Compared with bracket peers, Small 4's listed output cost ($0.60) is far lower than high-priced peers like Claude Opus 4.6 ($25) or Claude Sonnet 4.6 ($15), but higher than many lower-cost models (for example, Gemma 4 31B at $0.38 output). Use these sample calculations to estimate monthly spend from your own mTok volumes.
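The per-request arithmetic above generalizes to a one-line estimator. A minimal sketch — the rates are the listed prices, and the helper name is ours, not part of any API:

```python
def estimate_cost(input_mtok: float, output_mtok: float,
                  input_rate: float = 0.15, output_rate: float = 0.60) -> float:
    """Estimate USD cost from token volumes given in millions (mTok)."""
    return input_mtok * input_rate + output_mtok * output_rate

# Reproduces the three worked examples above: $6.75, $33.00, $135.00.
for in_m, out_m in [(5, 10), (20, 50), (100, 200)]:
    print(f"{in_m} mTok in, {out_m} mTok out -> ${estimate_cost(in_m, out_m):.2f}")
```

Swap in your own monthly mTok volumes to project spend before committing to a provider.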

Overall
3.83/5 (Strong)

Benchmark Scores

Faithfulness: 4/5
Long Context: 4/5
Multilingual: 5/5
Tool Calling: 4/5
Classification: 2/5
Agentic Planning: 4/5
Structured Output: 5/5
Safety Calibration: 2/5
Strategic Analysis: 4/5
Persona Consistency: 5/5
Constrained Rewriting: 3/5
Creative Problem Solving: 4/5

External Benchmarks

SWE-bench Verified: N/A
MATH Level 5: N/A
AIME 2025: N/A

Pricing

Input: $0.150/MTok
Output: $0.600/MTok

Context Window: 262K

modelpicker.net

Real-World Costs

Chat response: <$0.001
Blog post: $0.0013
Document batch: $0.033
Pipeline run: $0.330

Pricing vs Performance

Output cost per million tokens (log scale) vs average score across our 12 internal benchmarks


Try It

from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint, so the standard
# OpenAI SDK works with only a base_url change.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",  # replace with your OpenRouter API key
)

response = client.chat.completions.create(
    model="mistralai/mistral-small-2603",
    messages=[
        {"role": "user", "content": "Hello, Mistral Small 4!"}
    ],
)

print(response.choices[0].message.content)

Recommendation

Use Mistral Small 4 if you need:

  • Schema-first production flows — structured output 5/5 supports reliable JSON and strict formats (e.g., data extraction, API response generation).
  • Multilingual product UIs or localized content pipelines — multilingual 5/5 means parity across languages in our testing.
  • Persona-driven assistants and role-based prompts — persona consistency 5/5 helps maintain voice and resist injection.

Avoid Small 4 when:

  • You rely on high-accuracy classification and routing — classification 2/5 (rank 51/53) suggests poor performance for intent classification or safety-critical routing.
  • Your app requires the highest overall benchmark averages — our overall rank is 35/52; for maximum aggregate performance, consider bracket peers such as Claude Sonnet 4.6 and GPT-5.2, which have higher average scores in our data.

Practical use cases we recommend: generating validated JSON payloads from unstructured text, multilingual content transformation, and persona-based content generation. Not recommended for high-stakes safety filtering or core classifier components without an additional classification layer.
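For schema-first flows, it is worth validating the model's JSON before it enters your pipeline, even with a 5/5 structured-output score. A minimal sketch: the invoice fields and the `validate_invoice` helper are hypothetical illustrations of ours, not part of any Mistral or OpenRouter API:

```python
import json

# Hypothetical schema for an invoice-extraction task; field names
# are illustrative only.
REQUIRED_FIELDS = {"invoice_id": str, "total": (int, float), "currency": str}

def validate_invoice(raw: str) -> dict:
    """Parse model output and check it against the expected schema.

    Raises ValueError on any deviation, so malformed generations
    never reach the downstream pipeline.
    """
    data = json.loads(raw)
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return data

# A well-formed generation passes; anything else raises.
invoice = validate_invoice(
    '{"invoice_id": "INV-7", "total": 19.99, "currency": "EUR"}'
)
print(invoice["total"])  # 19.99
```

Wiring this between the API response and your database gives you a cheap guardrail regardless of which model produced the JSON.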

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.

Frequently Asked Questions