R1 vs Gemini 3.1 Pro Preview

Gemini 3.1 Pro Preview outperforms R1 on structured output, long context, agentic planning, and safety calibration in our testing — making it the stronger choice for agentic and document-heavy workflows. R1 ties on eight other benchmarks while costing roughly 80% less on output tokens ($2.50/M vs $12.00/M), so the gap in capability rarely justifies the gap in price for general use. For math-intensive tasks, the AIME 2025 external benchmark tells a clear story: Gemini 3.1 Pro Preview scores 95.6% (rank 2 of 23) vs R1's 53.3% (rank 17 of 23), according to Epoch AI — if advanced math is your core workload, Gemini 3.1 Pro Preview is the decisive winner.

DeepSeek R1

Overall: 4.00/5 (Strong)

Benchmark Scores

Faithfulness: 5/5
Long Context: 4/5
Multilingual: 5/5
Tool Calling: 4/5
Classification: 2/5
Agentic Planning: 4/5
Structured Output: 4/5
Safety Calibration: 1/5
Strategic Analysis: 5/5
Persona Consistency: 5/5
Constrained Rewriting: 4/5
Creative Problem Solving: 5/5

External Benchmarks

SWE-bench Verified: N/A
MATH Level 5: 93.1%
AIME 2025: 53.3%

Pricing

Input: $0.70/MTok
Output: $2.50/MTok

Context Window: 64K tokens

Google Gemini 3.1 Pro Preview

Overall: 4.33/5 (Strong)

Benchmark Scores

Faithfulness: 5/5
Long Context: 5/5
Multilingual: 5/5
Tool Calling: 4/5
Classification: 2/5
Agentic Planning: 5/5
Structured Output: 5/5
Safety Calibration: 2/5
Strategic Analysis: 5/5
Persona Consistency: 5/5
Constrained Rewriting: 4/5
Creative Problem Solving: 5/5

External Benchmarks

SWE-bench Verified: N/A
MATH Level 5: N/A
AIME 2025: 95.6%

Pricing

Input: $2.00/MTok
Output: $12.00/MTok

Context Window: 1,049K tokens (1,048,576)

Benchmark Analysis

Across our 12-test internal suite, R1 wins zero benchmarks outright and ties eight with Gemini 3.1 Pro Preview. Gemini 3.1 Pro Preview wins four.

Where Gemini 3.1 Pro Preview wins:

  • Structured output (5 vs 4): Gemini 3.1 Pro Preview scores 5/5 (tied for 1st among 54 models); R1 scores 4/5 (rank 26 of 54). For JSON schema compliance and format-strict APIs, this difference is operationally significant; see the validation sketch after this list.
  • Long context (5 vs 4): Gemini 3.1 Pro Preview scores 5/5 (tied for 1st among 55 models); R1 scores 4/5 (rank 38 of 55). Gemini 3.1 Pro Preview also carries a 1,048,576-token context window vs R1's 64,000 — over 16x larger. For retrieval across large codebases or documents, this is a hard capability gap, not just a score gap.
  • Agentic planning (5 vs 4): Gemini 3.1 Pro Preview scores 5/5 (tied for 1st among 54 models); R1 scores 4/5 (rank 16 of 54). Better goal decomposition and failure recovery matters for multi-step autonomous workflows.
  • Safety calibration (2 vs 1): Gemini 3.1 Pro Preview scores 2/5 (rank 12 of 55); R1 scores 1/5 (rank 32 of 55). Both sit below the median (p50 = 2), but R1's score places it near the bottom of the field. Neither model should be deployed in safety-critical contexts without guardrails, but R1 requires more attention here.
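
To make the structured-output point concrete, here is a minimal validation sketch in Python. It assumes you already have the model's raw text reply from whichever client you use; the schema, field names, and the parse_structured_reply helper are illustrative, not part of either model's API, and the sketch relies on the third-party jsonschema package.

```python
import json

import jsonschema  # third-party: pip install jsonschema

# Illustrative schema for an order-extraction task; field names are hypothetical.
ORDER_SCHEMA = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "total": {"type": "number"},
        "items": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["order_id", "total", "items"],
    "additionalProperties": False,
}


def parse_structured_reply(raw_reply: str, schema: dict) -> dict:
    """Parse a model's text reply as JSON and enforce the schema.

    Raises ValueError when the reply is not valid JSON or violates the schema,
    so the caller can retry, re-prompt, or escalate to a stricter model.
    """
    try:
        payload = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    try:
        jsonschema.validate(instance=payload, schema=schema)
    except jsonschema.ValidationError as exc:
        raise ValueError(f"reply violates schema: {exc.message}") from exc
    return payload


# `reply` stands in for the raw text returned by whichever model you call.
reply = '{"order_id": "A-1001", "total": 42.5, "items": ["widget", "gasket"]}'
print(parse_structured_reply(reply, ORDER_SCHEMA))
```

A model that scores higher on structured output fails this check less often; with a weaker model you would typically wrap the call in a retry or re-prompt loop triggered by the ValueError.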

Where they tie (8 benchmarks): Both models score 5/5 on multilingual, persona consistency, strategic analysis, faithfulness, and creative problem solving, all at or near the top of our rankings. Both score 4/5 on tool calling and constrained rewriting. Both score 2/5 on classification (rank 51 of 53), a shared weakness worth noting for routing and categorization tasks.

External benchmarks (Epoch AI): On AIME 2025 (math olympiad), Gemini 3.1 Pro Preview scores 95.6% (rank 2 of 23 models) vs R1's 53.3% (rank 17 of 23), a 42-point gap that makes Gemini 3.1 Pro Preview the clear choice for advanced mathematical reasoning. On MATH Level 5 (competition math), R1 scores 93.1% (rank 8 of 14 models with data); no MATH Level 5 result is available for Gemini 3.1 Pro Preview. These are external benchmarks from Epoch AI, not results from our internal testing.

| Benchmark | R1 | Gemini 3.1 Pro Preview |
| --- | --- | --- |
| Faithfulness | 5/5 | 5/5 |
| Long Context | 4/5 | 5/5 |
| Multilingual | 5/5 | 5/5 |
| Tool Calling | 4/5 | 4/5 |
| Classification | 2/5 | 2/5 |
| Agentic Planning | 4/5 | 5/5 |
| Structured Output | 4/5 | 5/5 |
| Safety Calibration | 1/5 | 2/5 |
| Strategic Analysis | 5/5 | 5/5 |
| Persona Consistency | 5/5 | 5/5 |
| Constrained Rewriting | 4/5 | 4/5 |
| Creative Problem Solving | 5/5 | 5/5 |
| Summary | 0 wins | 4 wins |

Pricing Analysis

R1 costs $0.70/M input and $2.50/M output. Gemini 3.1 Pro Preview costs $2.00/M input and $12.00/M output: 2.9x more on input and 4.8x more on output. At real-world volumes, that gap compounds fast:

  • At 1M output tokens/month: $2.50 (R1) vs $12.00 (Gemini 3.1 Pro Preview), a $9.50 difference you might not notice.
  • At 10M output tokens/month: $25 vs $120, a $95/month gap that is meaningful for small teams.
  • At 100M output tokens/month: $250 vs $1,200, a $950/month gap that demands justification.

Given that R1 ties Gemini 3.1 Pro Preview on eight of twelve internal benchmarks, the premium is hard to justify unless you specifically need Gemini 3.1 Pro Preview's wins in structured output, long context, agentic planning, or safety calibration, or its dramatically better AIME 2025 math performance. Developers running high-volume, general-purpose inference should default to R1. Teams building document pipelines, long-context retrieval, or multi-step agents that routinely push past roughly 30K tokens of context have a concrete reason to pay for Gemini 3.1 Pro Preview.
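
The volume math is simple enough to script for your own workload. The sketch below uses the published per-million-token prices quoted above; the 30M-input / 10M-output monthly volume is a hypothetical example, not a measured workload.

```python
# Published per-million-token prices quoted above (USD).
PRICES = {
    "R1": {"input": 0.70, "output": 2.50},
    "Gemini 3.1 Pro Preview": {"input": 2.00, "output": 12.00},
}


def monthly_cost(model: str, input_tokens: float, output_tokens: float) -> float:
    """Monthly spend in USD for a given input/output token volume."""
    price = PRICES[model]
    return (input_tokens / 1e6) * price["input"] + (output_tokens / 1e6) * price["output"]


# Hypothetical workload: 30M input and 10M output tokens per month.
for name in PRICES:
    print(f"{name}: ${monthly_cost(name, 30e6, 10e6):,.2f}/month")
# R1: $46.00/month
# Gemini 3.1 Pro Preview: $180.00/month
```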

Real-World Cost Comparison

| Task | R1 | Gemini 3.1 Pro Preview |
| --- | --- | --- |
| Chat response | $0.0014 | $0.0064 |
| Blog post | $0.0053 | $0.025 |
| Document batch | $0.139 | $0.640 |
| Pipeline run | $1.39 | $6.40 |

Bottom Line

Choose R1 if: You need strong general-purpose reasoning at low cost. R1's $2.50/M output price makes it viable at high volume, and it ties Gemini 3.1 Pro Preview on eight of twelve benchmarks — including multilingual, faithfulness, strategic analysis, and creative problem solving. It's the right call for most API integrations, content pipelines, and chat applications where you're not pushing past 64K context or running complex multi-step agents. Also consider R1 if MATH Level 5 is relevant — it holds a 93.1% score on that external benchmark (Epoch AI).

Choose Gemini 3.1 Pro Preview if: Your workload involves long documents (over 64K tokens), structured data extraction requiring strict JSON compliance, multi-step agentic pipelines, or advanced math reasoning. R1's 64K-token window is a hard blocker for large-document use cases; Gemini 3.1 Pro Preview's 1M+ token context window removes that ceiling. Its 95.6% AIME 2025 score (Epoch AI, rank 2 of 23) makes it the top-tier choice for mathematical reasoning applications. The 4.8x output cost premium is defensible when these specific capabilities drive your use case, but not as a general upgrade from R1.
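
If you route between the two models programmatically, the decision criteria above reduce to a small heuristic. The sketch below is an assumption-laden illustration: the 4-characters-per-token estimate is a rough rule of thumb, and the task labels and model identifier strings are placeholders rather than official API names.

```python
# Rough routing heuristic reflecting the decision criteria above.
# Assumes ~4 characters per token (a crude estimate) and uses placeholder
# task labels and model identifiers, not official API model names.

R1_CONTEXT_TOKENS = 64_000
LONG_CONTEXT_TASKS = {"document_qa", "codebase_retrieval"}
PRECISION_TASKS = {"strict_json_extraction", "math_olympiad", "multi_step_agent"}


def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)


def pick_model(prompt: str, task: str) -> str:
    """Default to the cheaper model; escalate only when its specific wins matter."""
    if estimate_tokens(prompt) > R1_CONTEXT_TOKENS:  # input cannot fit R1's window
        return "gemini-3.1-pro-preview"
    if task in LONG_CONTEXT_TASKS or task in PRECISION_TASKS:
        return "gemini-3.1-pro-preview"
    return "deepseek-r1"


print(pick_model("Summarise this support ticket: ...", "chat"))  # -> deepseek-r1
```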

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.
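
For readers who want to reproduce the general pattern, here is a generic LLM-as-judge sketch. It is not our actual judge prompt or rubric; the prompt text, criterion name, and the ask_judge callable are placeholders you would wire to a real model client.

```python
import re
from typing import Callable

# Placeholder rubric prompt; production judge prompts and criteria will differ.
JUDGE_PROMPT = (
    "Score the following answer from 1 to 5 for {criterion}. "
    "Reply with only the integer.\n\nAnswer:\n{answer}"
)


def judge_score(answer: str, criterion: str, ask_judge: Callable[[str], str]) -> int:
    """Send a 1-5 rubric prompt to a judge model and parse the integer it returns."""
    reply = ask_judge(JUDGE_PROMPT.format(criterion=criterion, answer=answer))
    match = re.search(r"[1-5]", reply)
    if match is None:
        raise ValueError(f"judge reply contained no 1-5 score: {reply!r}")
    return int(match.group())


# Demo with a stand-in judge; swap the lambda for a real model call in practice.
print(judge_score("Paris is the capital of France.", "faithfulness", lambda _: "5"))
```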

Frequently Asked Questions