Mistral·active

Mistral Small 3.1 24B

Mistral's efficiency model. Context window: 128K tokens.

Overall score: 2.77/5.00 (ranked #86)
Input: $0.351 per 1M tokens
Output: $0.555 per 1M tokens
Context: 128K tokens
Blended: $0.504 per 1M tokens (3:1 out:in ratio)

modelpicker.ai · powered by live benchmark data

Scores by test

Structured Output: 4.0
Strategic Analysis: 3.0
Constrained Rewriting: 3.0
Creative Problem Solving: 2.0
Tool Calling: 1.0
Faithfulness: 4.0
Classification: 3.0
Long Context: 5.0
Safety Calibration: 1.0
Persona Consistency: 2.0
Agentic Planning: 3.0
Multilingual: 4.0
Tabular Data: 1.0

What you need to know

Mistral Small 3.1 24B is best suited to high-volume long-context processing and structured data generation. It earns a perfect 5/5 in long context handling and a strong 4/5 in faithfulness, structured output, and multilingual capability. These scores suggest the model is reliable for extracting information from large documents and adhering to specific formatting requirements.

The model's pricing is competitive, with a blended cost of $0.504/MTok at a 3:1 output-to-input ratio, making it an affordable option for high-throughput tasks. However, this low cost comes with significant functional trade-offs. It scores 1/5 in tool calling, tabular data processing, and safety calibration, so it is not a viable candidate for agentic workflows that require external API interactions or precise data manipulation.
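As a quick check on the pricing, the listed blended figure can be reproduced from the input and output prices. The sketch below assumes the blend is a simple weighted average at three output tokens per input token; the function name `blended_price` and the formula itself are assumptions that happen to match the listed number, not the site's documented method.

```python
def blended_price(input_price: float, output_price: float, out_to_in: float = 3.0) -> float:
    """Weighted average price per 1M tokens.

    Assumes `out_to_in` output tokens are generated for every input token
    (hypothetical weighting; the page only states "3:1 out:in ratio").
    """
    return (input_price + out_to_in * output_price) / (1.0 + out_to_in)


# Listed prices for Mistral Small 3.1 24B, in USD per 1M tokens.
price = blended_price(input_price=0.351, output_price=0.555)
print(f"${price:.3f} per 1M tokens")  # prints "$0.504 per 1M tokens"
```

Changing `out_to_in` shows how sensitive the blended cost is to workload shape: input-heavy jobs such as long-document extraction would land closer to the $0.351 input price.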

Overall performance is low: the model's overall score is 2.77/5.00, which ranks it #86 and places it below all four similar models listed here. While it excels at reading and formatting, it struggles with persona consistency and creative problem solving (2/5 each).
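The overall score is consistent with a plain average of the thirteen per-test scores. The sketch below assumes an unweighted mean, which reproduces the listed 2.77; the actual methodology is not shown on this page, so treat the equal weighting as an assumption.

```python
from statistics import mean

# Per-test scores as listed under "Scores by test".
scores = {
    "Structured Output": 4.0,
    "Strategic Analysis": 3.0,
    "Constrained Rewriting": 3.0,
    "Creative Problem Solving": 2.0,
    "Tool Calling": 1.0,
    "Faithfulness": 4.0,
    "Classification": 3.0,
    "Long Context": 5.0,
    "Safety Calibration": 1.0,
    "Persona Consistency": 2.0,
    "Agentic Planning": 3.0,
    "Multilingual": 4.0,
    "Tabular Data": 1.0,
}

# Unweighted mean across all tests (assumed aggregation).
overall = mean(list(scores.values()))
print(f"{overall:.2f}/5.00")  # prints "2.77/5.00"

# Tests sharing the minimum score, sorted alphabetically.
lowest = min(scores.values())
weakest = sorted(name for name, s in scores.items() if s == lowest)
print(weakest)  # prints ['Safety Calibration', 'Tabular Data', 'Tool Calling']
```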

Use this model if you need a low-cost solution for processing long documents, handling multilingual text, or generating structured output. Skip this model if your application requires tool use, analysis of tabular data, or strict safety guardrails.

Strengths — Top 3

Long Context: 5.0/5.0
Structured Output: 4.0/5.0
Faithfulness: 4.0/5.0

Relative weaknesses — Bottom 3

Tool Calling: 1.0/5.0
Safety Calibration: 1.0/5.0
Tabular Data: 1.0/5.0

Similar models

Mistral Small 3.2 24B: $0.169, score 3.00
Devstral Small 1.1: $0.250, score 2.85
Qwen: Qwen3 Coder 30B A3B Instruct: $0.220, score 3.23
Llama 3.3 70B Instruct: $0.265, score 3.46