mistral
Mistral Small 3.1 24B
Mistral Small 3.1 24B is an upgraded variant of the Mistral Small 3 family with 24 billion parameters and multimodal capability (text and image inputs). At $0.35 input / $0.56 output per million tokens, it is modestly priced — but in our 12-benchmark suite, it ranked 52nd out of 52 active models with an average score of 2.92. It does not support tool calling, which significantly limits its utility for agentic workflows. Its strongest result was long context (5/5), but most other benchmarks fell at or below the field median, with several near the bottom of the ranked field.
Performance
Mistral Small 3.1 24B's only standout score in our testing is long context (5/5, tied for 1st with 36 other models out of the 55 tested). Every other benchmark lands at or below the field median. Tool calling scored 1/5 (rank 53 of 54, near last), persona consistency 2/5 (rank 51 of 53), creative problem solving 2/5 (rank 47 of 54), and safety calibration 1/5 (rank 32 of 55). Faithfulness (4/5, rank 34 of 55) and multilingual (4/5, rank 36 of 55) came closest to mid-tier. Overall rank: 52 of 52 active models tested. The model does not support tool calling at all (a documented quirk), so its 1/5 tool calling score is consistent with its capability profile.
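The per-benchmark figures above can be collected into a small table, which makes the below-median pattern easy to verify programmatically. A minimal sketch (the scores, ranks, and field sizes are copied from the paragraph above; the `below_median` helper is an illustrative name, not part of our tooling):

```python
# Per-benchmark results for Mistral Small 3.1 24B, as reported above:
# (score out of 5, rank, number of models ranked on that benchmark)
results = {
    "long context": (5, 1, 55),
    "tool calling": (1, 53, 54),
    "persona consistency": (2, 51, 53),
    "creative problem solving": (2, 47, 54),
    "safety calibration": (1, 32, 55),
    "faithfulness": (4, 34, 55),
    "multilingual": (4, 36, 55),
}

def below_median(rank, field):
    # A model sits in the bottom half when its rank is past the midpoint.
    return rank > field / 2

for name, (score, rank, field) in results.items():
    half = "bottom half" if below_median(rank, field) else "top half"
    print(f"{name}: {score}/5 (rank {rank}/{field}, {half})")
```

Running this flags six of the seven listed benchmarks as bottom-half results, with long context the lone exception.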
Pricing
Mistral Small 3.1 24B costs $0.35 per million input tokens and $0.56 per million output tokens. At 1 million output tokens/month, that is $0.56; at 10 million output tokens, $5.60. While the price is low, the newer Mistral Small 4 (avg 3.83, $0.60/MTok output) offers substantially better benchmark performance at just $0.04 more per MTok output. Ministral 3 14B 2512 ($0.20/MTok output, avg 3.75) and Ministral 3 8B 2512 ($0.15/MTok output, avg 3.67) both score higher at lower prices. In the current mistral model lineup, Mistral Small 3.1 24B is outperformed by several newer models at comparable or lower cost.
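The monthly arithmetic above generalizes to any output volume. A quick sketch using the output prices quoted in this section (the 10M-token monthly volume is a hypothetical example, and this ignores input-token costs):

```python
# Output price per million tokens (MTok), as quoted in this article.
output_price_per_mtok = {
    "Mistral Small 3.1 24B": 0.56,
    "Mistral Small 4": 0.60,
    "Ministral 3 14B 2512": 0.20,
    "Ministral 3 8B 2512": 0.15,
}

def monthly_output_cost(model, output_mtok_per_month):
    """Estimated monthly spend on output tokens alone."""
    return output_price_per_mtok[model] * output_mtok_per_month

for model in output_price_per_mtok:
    cost = monthly_output_cost(model, 10)  # 10M output tokens/month
    print(f"{model}: ${cost:.2f}/month")
```

At that volume, both Ministral variants come in at a fraction of Mistral Small 3.1 24B's output bill while scoring higher in our suite.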
Benchmark Scores
[Chart: per-benchmark scores across our 12 internal benchmarks]
External Benchmarks
[Chart: external benchmark results]
Pricing
Input: $0.350/MTok · Output: $0.560/MTok
Real-World Costs
Pricing vs Performance
[Chart: output cost per million tokens (log scale) vs average score across our 12 internal benchmarks]
Try It
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

response = client.chat.completions.create(
    model="mistralai/mistral-small-3.1-24b-instruct",
    messages=[
        {"role": "user", "content": "Hello, Mistral Small 3.1 24B!"}
    ],
)

print(response.choices[0].message.content)
Recommendation
Mistral Small 3.1 24B is not recommended for most use cases in its current state. It ranks last among active models in our 12-test suite, does not support tool calling, and scores near the bottom on persona consistency and creative problem solving. Its one clear niche is very long-context retrieval (5/5 on long context), but even there, newer models in the same price range perform better overall. Teams evaluating affordable mistral models should prioritize Mistral Small 4 (avg 3.83, $0.60/MTok output) or Ministral 3 14B 2512 (avg 3.75, $0.20/MTok output) instead.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.
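An overall figure like the 2.92 average reported above is, presumably, a simple mean of the twelve per-benchmark judge scores. A minimal sketch of that aggregation (the 12-score list below is made up for illustration, not a real model's results):

```python
def average_score(scores):
    """Mean of per-benchmark judge scores (each on a 1-5 scale)."""
    if len(scores) != 12:
        raise ValueError("expected one score per benchmark in the 12-test suite")
    if any(not 1 <= s <= 5 for s in scores):
        raise ValueError("judge scores must be in the 1-5 range")
    return sum(scores) / len(scores)

# Hypothetical 12-benchmark scorecard:
example = [5, 1, 2, 2, 1, 4, 4, 3, 3, 3, 4, 3]
print(round(average_score(example), 2))
```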