Google · active · free tier available

Gemma 4 31B

Google's mid-tier model. Context window: 262K tokens.

Overall score: 4.38/5.00 · ranked #30
Input: $0.120 per 1M tokens
Output: $0.370 per 1M tokens
Context: 262K tokens
Blended: $0.307 (3:1 out:in ratio)
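
The blended figure can be reproduced from the input and output prices. The following is a minimal sketch, assuming "blended" means a weighted average of three parts output tokens to one part input tokens. The function name and parameters are illustrative, not part of any published API.

```python
# Prices from the listing, in USD per 1M tokens.
INPUT_PRICE = 0.120
OUTPUT_PRICE = 0.370


def blended_price(input_price, output_price, out_parts=3, in_parts=1):
    """Weighted average price per 1M tokens for a given output:input mix.

    Assumption: the 3:1 out:in ratio weights output tokens three times
    as heavily as input tokens.
    """
    total_parts = out_parts + in_parts
    return (out_parts * output_price + in_parts * input_price) / total_parts


blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"${blended:.4f} per 1M tokens")
```

Under this assumption the result is about $0.3075, which agrees with the listed $0.307 if the page truncates rather than rounds to three decimals.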

modelpicker.ai · powered by live benchmark data

Scores by test

Structured Output: 5.0
Strategic Analysis: 5.0
Constrained Rewriting: 4.0
Creative Problem Solving: 4.0
Tool Calling: 5.0
Faithfulness: 5.0
Classification: 4.0
Long Context: 4.0
Safety Calibration: 2.0
Persona Consistency: 5.0
Agentic Planning: 5.0
Multilingual: 5.0
Tabular Data: 4.0
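
The overall score of 4.38 is consistent with an unweighted mean of the 13 per-test scores. The sketch below assumes equal weighting. The page does not describe the actual methodology, so treat this as a plausibility check rather than the site's formula.

```python
from statistics import mean

# Per-test scores as listed above (each out of 5.0).
scores = {
    "Structured Output": 5.0,
    "Strategic Analysis": 5.0,
    "Constrained Rewriting": 4.0,
    "Creative Problem Solving": 4.0,
    "Tool Calling": 5.0,
    "Faithfulness": 5.0,
    "Classification": 4.0,
    "Long Context": 4.0,
    "Safety Calibration": 2.0,
    "Persona Consistency": 5.0,
    "Agentic Planning": 5.0,
    "Multilingual": 5.0,
    "Tabular Data": 4.0,
}

# Assumption: every test carries equal weight in the overall score.
overall = mean(scores.values())
print(f"{overall:.2f} / 5.00")
```

The scores sum to 57 across 13 tests, and 57/13 is about 4.385, which the listing shows as 4.38.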

What you need to know

Gemma 4 31B scores highest on precision-oriented technical tasks, earning 5/5 on tool calling, structured output, and strategic analysis. It also scores 5/5 on agentic planning and faithfulness, which makes it a fit for complex workflows where hallucination must be minimized and outputs must adhere strictly to a schema.

The model offers a high performance-to-cost ratio: it ranks #30 of 71 models at a blended cost of $0.307 per million tokens. This makes it a cost-effective option for developers who need 5/5 scores in multilingual support and persona consistency without the premium pricing of larger proprietary models.

The main trade-off is safety calibration, which scores 2/5. This may indicate less restrictive filtering, or a higher tendency to bypass safety guardrails, than other models in its class. The model accepts a 262K-token context window, but its long-context score of 4/5 sits below the 5/5 it earns on its strongest tests.

Use this model for agentic workflows, automated tool integration, and data-heavy strategic analysis. Skip this model if your application requires strict safety alignment or if your primary use case is highly creative, open-ended problem solving.
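
The use/skip guidance above can be expressed as a simple threshold check against the per-test scores. This is an illustrative sketch only: the requirement profiles and their minimum values are hypothetical, chosen to show how a 2/5 safety score would rule out a safety-sensitive deployment.

```python
# Subset of the listing's per-test scores (each out of 5.0).
scores = {
    "Tool Calling": 5.0,
    "Agentic Planning": 5.0,
    "Strategic Analysis": 5.0,
    "Creative Problem Solving": 4.0,
    "Safety Calibration": 2.0,
}


def unmet_requirements(scores, minimums):
    """Return the tests whose score falls below the required minimum."""
    return {
        test: scores[test]
        for test, floor in minimums.items()
        if scores[test] < floor
    }


# Hypothetical requirement profiles (minimum acceptable score per test).
agent_pipeline = {"Tool Calling": 5.0, "Agentic Planning": 5.0}
moderated_chat = {"Safety Calibration": 4.0}

print(unmet_requirements(scores, agent_pipeline))
print(unmet_requirements(scores, moderated_chat))
```

An empty result means every requirement in the profile is met. A non-empty result names the tests that fall short.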

Strengths — Top 3

Structured Output: 5.0/5.0
Strategic Analysis: 5.0/5.0
Tool Calling: 5.0/5.0

Relative weaknesses — Bottom 3

Safety Calibration: 2.0/5.0
Constrained Rewriting: 4.0/5.0
Creative Problem Solving: 4.0/5.0
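
Several tests tie at 5.0 and at 4.0, so the top and bottom three depend on how ties are broken. The sketch below assumes ties are resolved by the order in which the tests appear in the score list. That rule is not stated on the page, but it does reproduce both lists above.

```python
# Per-test scores in the order the listing presents them (each out of 5.0).
scores = {
    "Structured Output": 5.0,
    "Strategic Analysis": 5.0,
    "Constrained Rewriting": 4.0,
    "Creative Problem Solving": 4.0,
    "Tool Calling": 5.0,
    "Faithfulness": 5.0,
    "Classification": 4.0,
    "Long Context": 4.0,
    "Safety Calibration": 2.0,
    "Persona Consistency": 5.0,
    "Agentic Planning": 5.0,
    "Multilingual": 5.0,
    "Tabular Data": 4.0,
}

items = list(scores.items())

# sorted() is stable, including with reverse=True, so tied tests keep
# their listing order. Assumption: the page breaks ties the same way.
top3 = [name for name, _ in sorted(items, key=lambda kv: kv[1], reverse=True)[:3]]
bottom3 = [name for name, _ in sorted(items, key=lambda kv: kv[1])[:3]]

print("Top 3:", top3)
print("Bottom 3:", bottom3)
```

Note that Classification, Long Context, and Tabular Data also score 4.0, so the second and third "weaknesses" are tied with three other tests rather than strictly worse than them.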

Similar models

R1 0528 · $1.74 · 4.46
GPT-5 · $7.81 · 4.54
Qwen 3.7 Max · $3.13 · 4.62
Mistral Medium 3.5 · $6.00 · 4.15