Mistral·active

Devstral Medium

Mistral's efficiency model. Context window: 131K tokens.

Overall score: 3.15 / 5.00 · ranked #81
Input: $0.400 per 1M tokens
Output: $2.00 per 1M tokens
Context: 131K tokens
Blended: $1.60 · 3:1 out:in ratio


Scores by test

Structured Output: 4.0
Strategic Analysis: 2.0
Constrained Rewriting: 3.0
Creative Problem Solving: 2.0
Tool Calling: 3.0
Faithfulness: 4.0
Classification: 4.0
Long Context: 4.0
Safety Calibration: 1.0
Persona Consistency: 3.0
Agentic Planning: 4.0
Multilingual: 4.0
Tabular Data: 3.0
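
The overall score of 3.15 is consistent with an unweighted mean of the thirteen test scores listed above. That is an observation from the numbers, not a documented scoring method, so the sketch below is only a sanity check under that assumption. The names `TEST_SCORES` and `overall_score` are illustrative.

```python
from statistics import mean

# Per-test scores as listed on the page (each out of 5.0).
TEST_SCORES = {
    "Structured Output": 4.0,
    "Strategic Analysis": 2.0,
    "Constrained Rewriting": 3.0,
    "Creative Problem Solving": 2.0,
    "Tool Calling": 3.0,
    "Faithfulness": 4.0,
    "Classification": 4.0,
    "Long Context": 4.0,
    "Safety Calibration": 1.0,
    "Persona Consistency": 3.0,
    "Agentic Planning": 4.0,
    "Multilingual": 4.0,
    "Tabular Data": 3.0,
}

def overall_score(scores):
    """Unweighted mean of the per-test scores (assumed aggregation)."""
    return mean(scores.values())

# 41 points across 13 tests
print(round(overall_score(TEST_SCORES), 2))  # 3.15
```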

What you need to know

Devstral Medium is best suited for technical execution tasks requiring high reliability in output format and data integrity. It performs strongly in structured output, faithfulness, and agentic planning (4.0/5 each), making it a capable choice for pipeline automation and classification tasks. Its 131K-token context window is backed by a long-context score of 4.0/5, which suggests it can work across large inputs without significant loss of accuracy.

The model struggles with high-level cognitive tasks, specifically strategic analysis and creative problem solving (2.0/5 each). It also shows a critical weakness in safety calibration, where its 1.0/5 is the lowest of its test scores and points to weak guardrails. Developers should expect poor performance when the use case requires nuanced reasoning or strict content filtering.

At a blended cost of $1.60/MTok, the model is priced moderately, but its low overall rank suggests poor value relative to the current market. While the input price is low ($0.400/MTok), the performance trade-offs in reasoning and safety make it an expensive option for general-purpose use.
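
The $1.60 blended figure follows from weighting the output and input prices 3:1. The sketch below assumes the blend is a plain weighted average of the two per-1M-token prices. The function names and the sample token counts are illustrative, not taken from the source.

```python
INPUT_PRICE = 0.400   # USD per 1M input tokens (from the price card)
OUTPUT_PRICE = 2.00   # USD per 1M output tokens (from the price card)

def blended_price(input_price, output_price, out_parts=3, in_parts=1):
    """Weighted average price per 1M tokens at an out:in ratio (assumed formula)."""
    total_parts = out_parts + in_parts
    return (out_parts * output_price + in_parts * input_price) / total_parts

def request_cost(input_tokens, output_tokens,
                 input_price=INPUT_PRICE, output_price=OUTPUT_PRICE):
    """Cost in USD of one workload, given token counts and per-1M prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# (3 * 2.00 + 1 * 0.40) / 4
print(f"{blended_price(INPUT_PRICE, OUTPUT_PRICE):.2f}")  # 1.60

# Hypothetical workload: 100K input tokens, 20K output tokens.
print(f"{request_cost(100_000, 20_000):.2f}")  # 0.08
```

Because output tokens cost five times as much as input tokens, the effective price depends heavily on how output-heavy the workload is. The 3:1 blend describes one such mix.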

Use this model if you need a reliable tool for structured data extraction, classification, or agentic workflows within a large context. Skip this model if your application requires creative synthesis, complex strategic planning, or strict safety compliance.

Strengths — Top 3

Structured Output: 4.0/5.0
Faithfulness: 4.0/5.0
Classification: 4.0/5.0

Relative weaknesses — Bottom 3

Safety Calibration: 1.0/5.0
Strategic Analysis: 2.0/5.0
Creative Problem Solving: 2.0/5.0
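
The two lists above are consistent with sorting the test scores and breaking ties by the order in which the tests appear on the page. The sketch below assumes that tie-breaking rule, which is inferred from the numbers rather than stated by the source. The helper names are illustrative.

```python
# Per-test scores in page order; dicts preserve insertion order in Python 3.7+.
TEST_SCORES = {
    "Structured Output": 4.0, "Strategic Analysis": 2.0,
    "Constrained Rewriting": 3.0, "Creative Problem Solving": 2.0,
    "Tool Calling": 3.0, "Faithfulness": 4.0, "Classification": 4.0,
    "Long Context": 4.0, "Safety Calibration": 1.0,
    "Persona Consistency": 3.0, "Agentic Planning": 4.0,
    "Multilingual": 4.0, "Tabular Data": 3.0,
}

def top_n(scores, n=3):
    """Highest-scoring tests; sorted() is stable, so ties keep page order."""
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [name for name, _ in ranked[:n]]

def bottom_n(scores, n=3):
    """Lowest-scoring tests, with the same stable tie-breaking."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:n]]

print(top_n(TEST_SCORES))
print(bottom_n(TEST_SCORES))
```

Six tests share the top score of 4.0, so the "Top 3" is partly an artifact of tie-breaking: Long Context, Agentic Planning, and Multilingual score just as high as the three listed.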

Similar models

Qwen: Qwen3 Coder 30B A3B Instruct · $0.220 · 3.23
Llama 3.3 70B Instruct · $0.265 · 3.46
GPT-4o · $8.13 · 3.46
Llama 4 Scout · $0.245 · 3.31