Google·active

Gemini 3.1 Pro Preview

Google's mid-tier model. Long-context specialist with a 1.0M-token context window.

Overall score: 4.38 / 5.00 · ranked #25
Input: $2.00 per 1M tokens
Output: $12.00 per 1M tokens
Context: 1.0M tokens
Blended: $9.50 per 1M tokens (3:1 out:in ratio)

modelpicker.ai · powered by live benchmark data

Scores by test

Structured Output: 5.0
Strategic Analysis: 5.0
Constrained Rewriting: 4.0
Creative Problem Solving: 5.0
Tool Calling: 4.0
Faithfulness: 5.0
Classification: 2.0
Long Context: 5.0
Safety Calibration: 2.0
Persona Consistency: 5.0
Agentic Planning: 5.0
Multilingual: 5.0
Tabular Data: 5.0
AIME 2025: 95.6%
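
The overall score of 4.38 is consistent with a plain unweighted average of the 13 rubric scores listed above (AIME 2025 left out, since it is a percentage rather than a 5-point score). Whether the site actually computes the overall score this way is an assumption; the sketch below only checks that the arithmetic lines up.

```python
# Rubric scores copied from the "Scores by test" list (each out of 5.0).
# AIME 2025 is excluded because it is reported on a percentage scale.
rubric_scores = {
    "Structured Output": 5.0,
    "Strategic Analysis": 5.0,
    "Constrained Rewriting": 4.0,
    "Creative Problem Solving": 5.0,
    "Tool Calling": 4.0,
    "Faithfulness": 5.0,
    "Classification": 2.0,
    "Long Context": 5.0,
    "Safety Calibration": 2.0,
    "Persona Consistency": 5.0,
    "Agentic Planning": 5.0,
    "Multilingual": 5.0,
    "Tabular Data": 5.0,
}


def mean_score(scores):
    """Unweighted mean of the scores, rounded to two decimals.

    Assumption: equal weighting. The page does not state its formula.
    """
    return round(sum(scores.values()) / len(scores), 2)


overall = mean_score(rubric_scores)
print(overall)  # prints 4.38
```

The two 2.0 scores pull the average down by roughly 0.46 points; with those tests at 5.0 the same mean would be about 4.85.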

What you need to know

Gemini 3.1 Pro Preview targets high-complexity reasoning and large-scale data ingestion, with a 1.0M-token context window and an AIME 2025 score of 95.6%. It scores 5/5 on strategic analysis, agentic planning, and creative problem solving, and also 5/5 on faithfulness and tabular data.

The model is a reliable choice for developers who need strict adherence to output formats, scoring 5/5 on structured output. Paired with its 5/5 multilingual score, this makes it a strong candidate for complex, multi-step workflows. However, it scores only 2/5 on both classification and safety calibration, which suggests weaker precision on simple labeling tasks and less dependable safety behavior.

At a blended cost of $9.50/MTok (weighted 3:1 output to input), this model sits in a premium price tier. The input cost is moderate at $2.00/MTok, but the $12.00/MTok output cost makes it expensive for high-volume generative tasks. The price is justified for long-context analysis and reasoning-heavy work, but it is not cost-effective for simple, short requests.
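
The blended figure can be reproduced from the listed prices. The sketch below assumes "3:1 out:in" means three parts output tokens to one part input tokens; the function names and the example token counts are illustrative and do not come from the page.

```python
# Listed prices in USD per 1M tokens.
INPUT_PRICE = 2.00
OUTPUT_PRICE = 12.00


def blended_price(input_price, output_price, out_to_in=3.0):
    """Weighted average price per 1M tokens.

    Assumption: `out_to_in` is the number of output tokens per input token,
    so 3.0 corresponds to the page's "3:1 out:in ratio".
    """
    return (input_price + out_to_in * output_price) / (1.0 + out_to_in)


def request_cost(input_tokens, output_tokens,
                 input_price=INPUT_PRICE, output_price=OUTPUT_PRICE):
    """USD cost of a single request, given raw token counts."""
    return (input_tokens * input_price
            + output_tokens * output_price) / 1_000_000


print(f"{blended_price(INPUT_PRICE, OUTPUT_PRICE):.2f}")  # prints 9.50

# Hypothetical long-context request: fill the 1.0M-token window
# and read back a 2,000-token answer.
print(f"{request_cost(1_000_000, 2_000):.3f}")  # prints 2.024
```

Under these assumptions, an input-heavy job is dominated by the $2.00 input price, so its effective cost per token is far below the blended figure, while output-heavy generation moves toward $12.00/MTok.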

Use this model if you need to process massive datasets, generate precise structured data, or solve complex mathematical and strategic problems. Skip this model if your primary use case is simple text classification or if you require strict safety filtering.

Strengths — Top 3

Structured Output: 5.0/5.0
Strategic Analysis: 5.0/5.0
Creative Problem Solving: 5.0/5.0

Relative weaknesses — Bottom 3

Classification: 2.0/5.0
Safety Calibration: 2.0/5.0
Constrained Rewriting: 4.0/5.0

Similar models

DeepSeek V3.2 · $0.346 · 4.31
Qwen: Qwen3.6 Flash · $0.891 · 4.23
Grok 4.3 · $2.19 · 4.15
R1 · $2.05 · 4.00