R1 0528 vs Gemini 3 Flash Preview
Gemini 3 Flash Preview edges out R1 0528 on our internal benchmarks, winning structured output (5 vs 4), strategic analysis (5 vs 4), and creative problem solving (5 vs 4), while the two tie on eight other tests. R1 0528 is the clear choice where safety calibration matters — it scores 4/5 (rank 6 of 55) versus Gemini 3 Flash Preview's 1/5 (rank 32 of 55) in our testing. Input pricing is identical at $0.50/M tokens, but R1 0528's output cost ($2.15/M) is meaningfully lower than Gemini 3 Flash Preview's ($3.00/M), making R1 0528 the better value for output-heavy workloads.
Pricing at a Glance (via modelpicker.net)
- R1 0528 (DeepSeek): $0.50/MTok input, $2.15/MTok output
- Gemini 3 Flash Preview: $0.50/MTok input, $3.00/MTok output
Benchmark Analysis
Both models have scores across all 12 tests in our internal suite, and those scores tell a clear story. Beyond the internal suite, we have external math benchmarks for R1 0528 and external coding and math benchmarks for Gemini 3 Flash Preview.
Where Gemini 3 Flash Preview wins:
- Structured output (5 vs 4): Flash Preview ties for 1st among 54 models; R1 0528 sits at rank 26. This is a consequential gap — JSON schema compliance directly affects reliability in agentic pipelines and API integrations. Notably, R1 0528 has a documented quirk of returning empty responses on structured output tasks when reasoning tokens consume the output budget.
- Strategic analysis (5 vs 4): Flash Preview ties for 1st among 54 models; R1 0528 ranks 27th. For nuanced tradeoff reasoning with real numbers — business decisions, competitive analysis — Flash Preview's extra point represents meaningful quality.
- Creative problem solving (5 vs 4): Flash Preview ties for 1st among 54 models (8 models share this score); R1 0528 ranks 9th. Non-obvious, feasible ideation favors Flash Preview.
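The structured-output quirk noted above (empty responses when reasoning tokens exhaust the output budget) can be handled defensively in a pipeline. A minimal sketch: `call_model` is a hypothetical stand-in for whatever client you use; the validate-and-retry logic is the point, not any specific API.

```python
import json

def parse_structured(raw: str):
    """Return the parsed JSON object, or None if the response is
    empty or invalid (e.g. reasoning tokens consumed the budget)."""
    if not raw or not raw.strip():
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None

def get_structured(call_model, prompt: str, max_retries: int = 2):
    """Call the model, retrying on empty or malformed structured output.

    `call_model` is a hypothetical callable: prompt -> raw string.
    """
    for _ in range(max_retries + 1):
        parsed = parse_structured(call_model(prompt))
        if parsed is not None:
            return parsed
    raise ValueError(f"no valid JSON after {max_retries + 1} attempts")
```

With a wrapper like this, an occasional empty response costs one extra round trip instead of a broken downstream step.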
Where R1 0528 wins:
- Safety calibration (4 vs 1): R1 0528 ranks 6th of 55; Gemini 3 Flash Preview ranks 32nd. This is the sharpest divergence in the dataset. A score of 1/5 puts Flash Preview in the bottom quarter of scores across all models tested (the 25th-percentile score is 1): in our testing it frequently fails to refuse harmful requests or over-refuses legitimate ones. For any application with user-generated inputs or compliance requirements, this is disqualifying.
Where they tie (8 of 12 tests): Both score 5/5 on tool calling (tied for 1st, 17 models), agentic planning (tied for 1st, 15 models), faithfulness (tied for 1st, 33 models), long context (tied for 1st, 37 models), persona consistency (tied for 1st, 37 models), and multilingual (tied for 1st, 35 models). Both score 4/5 on constrained rewriting and classification. Ties dominate this matchup — the models are closely matched across the majority of capabilities we tested.
External benchmarks: On AIME 2025 (Epoch AI), Gemini 3 Flash Preview scores 92.8% (rank 5 of 23) versus R1 0528's 66.4% (rank 16 of 23) — a substantial gap that makes Flash Preview the stronger choice for olympiad-level math reasoning. On MATH Level 5 (Epoch AI), R1 0528 scores 96.6% (rank 5 of 14), but Gemini 3 Flash Preview has no score on this benchmark in our data, so direct comparison isn't possible there. On SWE-bench Verified (Epoch AI), Gemini 3 Flash Preview scores 75.4% (rank 3 of 12), placing it among the top coding models by that external measure; R1 0528 has no SWE-bench score in our data. The external math results favor Flash Preview on competition-level problems; the coding results also favor Flash Preview where data exists.
Pricing Analysis
Both models charge $0.50 per million input tokens, so input cost is a wash at any scale. The divergence is on output: R1 0528 costs $2.15/M output tokens versus Gemini 3 Flash Preview's $3.00/M. Flash Preview's output tokens cost roughly 40% more; put another way, R1 0528's output rate is about 28% lower.
At real-world volumes, that gap compounds quickly:
- 1M output tokens/month: $2.15 vs $3.00 — a $0.85/month difference, negligible for most teams.
- 10M output tokens/month: $21.50 vs $30.00 — an $8.50/month difference, still minor.
- 100M output tokens/month: $215 vs $300 — an $85/month difference, worth tracking.
Developers running high-throughput pipelines — document processing, batch summarization, large-scale content generation — should factor in this gap. At 100M output tokens monthly, R1 0528 saves roughly $1,000 per year at current rates, and the savings scale linearly with volume beyond that. For low-volume or interactive use cases, the $0.85/M difference is unlikely to drive a decision. One important caveat: R1 0528 is a reasoning model that consumes reasoning tokens against its output budget, which can inflate effective output token counts beyond what you might expect from a standard model.
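The arithmetic above generalizes to any volume. A quick sketch of the per-month comparison, using the output rates from the pricing section:

```python
R1_OUTPUT_RATE = 2.15      # $ per million output tokens (R1 0528)
FLASH_OUTPUT_RATE = 3.00   # $ per million output tokens (Gemini 3 Flash Preview)

def monthly_output_cost(millions_of_tokens: float, rate: float) -> float:
    """Output-token cost in dollars for a month's volume."""
    return millions_of_tokens * rate

def monthly_savings_with_r1(millions_of_tokens: float) -> float:
    """Dollars saved per month by choosing R1 0528 over Flash Preview."""
    return (monthly_output_cost(millions_of_tokens, FLASH_OUTPUT_RATE)
            - monthly_output_cost(millions_of_tokens, R1_OUTPUT_RATE))
```

At 100M output tokens/month, `monthly_savings_with_r1(100)` comes to $85, or about $1,020 over a year. Remember the reasoning-token caveat: R1 0528's effective output volume may run higher than a standard model's for the same tasks, which eats into the headline savings.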
Bottom Line
Choose R1 0528 if:
- Safety calibration is a requirement — its 4/5 score (rank 6/55) dwarfs Gemini 3 Flash Preview's 1/5 (rank 32/55) in our testing.
- You're running output-heavy workloads at scale and the $0.85/M output cost difference adds up (saves ~$85/month at 100M output tokens).
- You need transparent reasoning chains — R1 0528 exposes reasoning tokens, which is valuable for debugging and explainability in high-stakes applications.
- Your tasks fall in the tie zone (tool calling, agentic planning, faithfulness, long context, multilingual) where both models perform equally and cost becomes the tiebreaker.
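On the transparent-reasoning point above: to use the reasoning channel for debugging, it has to be separated from the final answer. A minimal sketch, assuming the reasoning arrives inline in `<think>…</think>` tags as R1-style models commonly emit; some APIs return it in a separate response field instead, so treat this as illustrative:

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(raw: str) -> tuple[str, str]:
    """Split an R1-style response into (reasoning, final_answer).

    Assumes inline <think>...</think> tags; returns empty reasoning
    when none are present.
    """
    match = THINK_RE.search(raw)
    if match is None:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    answer = THINK_RE.sub("", raw, count=1).strip()
    return reasoning, answer
```

Logging the reasoning half separately keeps user-facing output clean while preserving the chain of thought for audits.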
Choose Gemini 3 Flash Preview if:
- You need reliable structured output (JSON schema compliance) — R1 0528 has a documented issue where it returns empty responses on structured output tasks.
- Strategic analysis and creative problem solving are central to your workflow — Flash Preview scores 5/5 vs R1 0528's 4/5 on both.
- Math reasoning at competition level matters — Flash Preview scores 92.8% on AIME 2025 vs R1 0528's 66.4% (Epoch AI).
- Coding assistance is a priority — Flash Preview ranks 3rd of 12 on SWE-bench Verified at 75.4% (Epoch AI); R1 0528 has no score on that benchmark.
- You're working with multimodal inputs — Flash Preview supports text, image, file, audio, and video inputs; R1 0528 is text-only.
- You need a very large context window — Flash Preview offers 1,048,576 tokens vs R1 0528's 163,840.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.