Meta · active

Llama 4 Scout

Meta's efficiency model. Long-context specialist with 10M window.

Overall score: 3.31 / 5.00 (ranked #78)
Input: $0.080 per 1M tokens
Output: $0.300 per 1M tokens
Context: 10M tokens
Blended: $0.245 per 1M tokens (3:1 out:in ratio)
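
The blended figure can be reproduced from the listed input and output prices. The sketch below assumes "3:1 out:in" means three output tokens for every input token, weighted as a simple average; the function name `blended_price` is hypothetical and not part of any site API.

```python
def blended_price(input_per_mtok: float, output_per_mtok: float,
                  out_to_in: float = 3.0) -> float:
    """Weighted average price per 1M tokens for a given output:input ratio.

    Assumption: the blend weights output tokens `out_to_in` times as
    heavily as input tokens, which matches the listed $0.245.
    """
    return (input_per_mtok + out_to_in * output_per_mtok) / (1.0 + out_to_in)


# Listed prices for Llama 4 Scout, in USD per 1M tokens.
scout_blended = blended_price(0.080, 0.300)
print(f"${scout_blended:.3f} per 1M tokens")
```

Workloads that are input-heavy (for example, long-document analysis with short answers) would see an effective price closer to the $0.080 input rate than to the blended figure.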

modelpicker.ai · powered by live benchmark data

Scores by test

Structured Output: 4.0
Strategic Analysis: 2.0
Constrained Rewriting: 3.0
Creative Problem Solving: 3.0
Tool Calling: 4.0
Faithfulness: 4.0
Classification: 4.0
Long Context: 5.0
Safety Calibration: 2.0
Persona Consistency: 3.0
Agentic Planning: 2.0
Multilingual: 4.0
Tabular Data: 3.0
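
The listed overall score of 3.31 is consistent with an unweighted mean of the 13 per-test scores. The site's actual aggregation method is not stated on this page, so the sketch below is an assumption that happens to match the published number, not a description of the methodology.

```python
# Per-test scores as listed on this page (each out of 5.0).
scores = {
    "Structured Output": 4.0,
    "Strategic Analysis": 2.0,
    "Constrained Rewriting": 3.0,
    "Creative Problem Solving": 3.0,
    "Tool Calling": 4.0,
    "Faithfulness": 4.0,
    "Classification": 4.0,
    "Long Context": 5.0,
    "Safety Calibration": 2.0,
    "Persona Consistency": 3.0,
    "Agentic Planning": 2.0,
    "Multilingual": 4.0,
    "Tabular Data": 3.0,
}

# Assumption: equal weight per test.
overall = sum(scores.values()) / len(scores)
print(f"{overall:.2f} / 5.00")
```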

What you need to know

Llama 4 Scout is primarily a long-context utility model, distinguished by a 5/5 score on the long-context test and a 10M-token context window. Combined with a 4/5 faithfulness score, this makes it effective for retrieval-heavy tasks over large inputs, where accuracy and volume are prioritized over reasoning depth.

The model is highly efficient for structured operational tasks, scoring 4/5 in tool calling, classification, and structured output. These strengths, combined with a low blended cost of $0.245/MTok, make it an economical choice for high-volume pipelines that require strict formatting or multilingual support.

However, the model is weak at higher-order reasoning. With scores of 2/5 in strategic analysis and agentic planning, it is unsuitable for autonomous decision-making or complex multi-step reasoning. Its 2/5 safety calibration score also suggests a need for rigorous external guardrails in production environments.

Use this model if you need a low-cost solution for analyzing very large documents, performing classification, or extracting structured data across multiple languages. Skip this model if your application requires complex planning, strategic reasoning, or high-precision safety alignment.
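
The use/skip guidance above can be sketched as a threshold check against the per-test scores. The `weak_spots` helper and the 4.0 cutoff are illustrative assumptions, not something the site defines; pick a cutoff that reflects your own tolerance.

```python
# Subset of the scores listed on this page (each out of 5.0).
SCOUT_SCORES = {
    "long_context": 5.0,
    "classification": 4.0,
    "structured_output": 4.0,
    "multilingual": 4.0,
    "strategic_analysis": 2.0,
    "agentic_planning": 2.0,
    "safety_calibration": 2.0,
}


def weak_spots(required, scores=SCOUT_SCORES, min_score=4.0):
    """Return the required capabilities that score below `min_score`.

    Hypothetical helper: an empty list means every required capability
    meets the (assumed) cutoff.
    """
    return [name for name in required if scores[name] < min_score]


# A document-classification pipeline fits; an autonomous agent does not.
print(weak_spots(["long_context", "classification", "multilingual"]))
print(weak_spots(["long_context", "agentic_planning", "safety_calibration"]))
```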

Strengths — Top 3

Long Context: 5.0/5.0
Structured Output: 4.0/5.0
Tool Calling: 4.0/5.0

Relative weaknesses — Bottom 3

Strategic Analysis: 2.0/5.0
Safety Calibration: 2.0/5.0
Agentic Planning: 2.0/5.0

Similar models

Qwen: Qwen3 Coder 30B A3B Instruct · $0.220 · 3.23
Devstral Medium · $1.60 · 3.15
OpenAI: gpt-oss-20b · $0.113 · 3.54
GPT-4o · $8.13 · 3.46