Devstral Medium vs Gemini 3.1 Flash Lite Preview
Gemini 3.1 Flash Lite Preview is the stronger general-purpose choice in our testing, winning 9 of 12 benchmarks — including safety calibration (5 vs 1), strategic analysis (5 vs 2), and tool calling (4 vs 3) — while also undercutting Devstral Medium on price. Devstral Medium's only outright win is classification (4 vs 3), where it ties for 1st with 29 other models. The price-vs-quality tradeoff here favors Gemini 3.1 Flash Lite Preview: it costs less and scores better across almost every dimension we tested.
Devstral Medium (Mistral)
Pricing: $0.400/MTok input, $2.00/MTok output

Gemini 3.1 Flash Lite Preview (Google)
Pricing: $0.250/MTok input, $1.50/MTok output
Benchmark Analysis
Across our 12-test suite, Gemini 3.1 Flash Lite Preview wins 9 benchmarks, Devstral Medium wins 1, and 2 are tied.
Where Gemini 3.1 Flash Lite Preview wins clearly:
- Safety calibration: 5 vs 1. This is the largest margin in the comparison. Gemini 3.1 Flash Lite Preview ties for 1st among 55 models tested; Devstral Medium ranks 32nd of 55. For applications where the AI must correctly refuse harmful requests while permitting legitimate ones — content moderation tools, consumer-facing products — this gap is a significant operational risk differentiator.
- Strategic analysis: 5 vs 2. Gemini 3.1 Flash Lite Preview ties for 1st of 54 models; Devstral Medium ranks 44th of 54. This test covers nuanced tradeoff reasoning with real numbers — business analysis, decision support, financial summaries.
- Persona consistency: 5 vs 3. Gemini 3.1 Flash Lite Preview ties for 1st of 53 models; Devstral Medium ranks 45th of 53. Relevant for chatbot personas and roleplay applications where character coherence matters.
- Multilingual: 5 vs 4. Gemini 3.1 Flash Lite Preview ties for 1st of 55 models; Devstral Medium ranks 36th. For non-English deployments, Gemini 3.1 Flash Lite Preview holds a meaningful edge.
- Structured output: 5 vs 4. Gemini 3.1 Flash Lite Preview ties for 1st of 54 models; Devstral Medium ranks 26th. JSON schema compliance and format adherence, critical for API pipelines; a minimal sketch of this kind of check follows this list.
- Faithfulness: 5 vs 4. Gemini 3.1 Flash Lite Preview ties for 1st of 55 models; Devstral Medium ranks 34th. Staying close to source material without hallucinating is essential for summarization and RAG use cases.
- Tool calling: 4 vs 3. Gemini 3.1 Flash Lite Preview ranks 18th of 54; Devstral Medium ranks 47th of 54. Notably, Devstral Medium's score places it near the bottom of the field on function selection and argument accuracy — a significant weakness for agentic workflows.
- Creative problem solving: 4 vs 2. Gemini 3.1 Flash Lite Preview ranks 9th of 54; Devstral Medium ranks 47th of 54. Devstral Medium's score here is near-floor performance on generating non-obvious, feasible ideas.
- Constrained rewriting: 4 vs 3. Gemini 3.1 Flash Lite Preview ranks 6th of 53; Devstral Medium ranks 31st.
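To make the structured output row more concrete, below is a minimal sketch of the kind of schema-compliance check an API pipeline might run on a model's response. The `jsonschema` usage is standard, but the invoice schema and sample outputs are hypothetical illustrations, not part of the benchmark harness described here.

```python
# Minimal sketch of a JSON-schema compliance check on a model response.
# The invoice schema and sample outputs below are hypothetical examples,
# not part of the benchmark suite described in this comparison.
import json
from jsonschema import ValidationError, validate

INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_id": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string", "enum": ["USD", "EUR"]},
    },
    "required": ["invoice_id", "total", "currency"],
    "additionalProperties": False,
}

def is_compliant(raw_model_output: str) -> bool:
    """True if the raw text is valid JSON that also satisfies the schema."""
    try:
        payload = json.loads(raw_model_output)  # format adherence: parseable JSON at all?
        validate(payload, INVOICE_SCHEMA)       # schema compliance: right fields, right types
        return True
    except (json.JSONDecodeError, ValidationError):
        return False

print(is_compliant('{"invoice_id": "INV-7", "total": 129.5, "currency": "USD"}'))   # True
print(is_compliant('Sure, here is the JSON: {"invoice_id": "INV-7"}'))              # False
```

A model that reliably passes gates like this needs fewer retries and less repair logic downstream, which is what the structured output benchmark is getting at.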
Where Devstral Medium wins:
- Classification: 4 vs 3. Devstral Medium ties for 1st of 53 models; Gemini 3.1 Flash Lite Preview ranks 31st. This is a real, meaningful win — accurate categorization and routing tasks favor Devstral Medium.
Tied:
- Long context: Both score 4, both rank 38th of 55 — identical performance at 30K+ token retrieval.
- Agentic planning: Both score 4, both rank 16th of 54 — tied on goal decomposition and failure recovery.
The pattern is clear: Gemini 3.1 Flash Lite Preview is a broadly capable model that performs at or near the top of our tested field on most dimensions. Devstral Medium's strengths are narrow, with classification as its only outright win and competitive scores on structured output and faithfulness — but it trails significantly on safety, reasoning quality, and agentic tool use.
Pricing Analysis
Devstral Medium costs $0.40 per million input tokens and $2.00 per million output tokens. Gemini 3.1 Flash Lite Preview costs $0.25 per million input tokens and $1.50 per million output tokens — 37.5% cheaper on input and 25% cheaper on output.
At 1M output tokens/month, Devstral Medium costs $2.00 vs $1.50 for Gemini 3.1 Flash Lite Preview — a $0.50 difference that's negligible. At 10M output tokens/month, that gap widens to $5.00 ($20.00 vs $15.00). At 100M output tokens/month — a realistic scale for a high-volume API integration or consumer product — Devstral Medium costs $200.00 vs $150.00 for Gemini 3.1 Flash Lite Preview, saving $50.00/month by choosing the Google model.
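The arithmetic above is easy to fold into a budgeting script. The sketch below is a minimal illustration that assumes the list prices quoted in this section and hypothetical monthly volumes; it ignores caching discounts, batch pricing, and anything else a real bill would include.

```python
# Minimal monthly-cost sketch using the list prices quoted above.
# Volumes are hypothetical; real bills also depend on caching, batching, etc.
PRICES_PER_MTOK = {
    "Devstral Medium":               {"input": 0.40, "output": 2.00},
    "Gemini 3.1 Flash Lite Preview": {"input": 0.25, "output": 1.50},
}

def monthly_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in dollars for one month, with volumes given in millions of tokens."""
    p = PRICES_PER_MTOK[model]
    return input_mtok * p["input"] + output_mtok * p["output"]

# Reproduce the output-only figures from the paragraph above (input set to 0).
for volume in (1, 10, 100):  # millions of output tokens per month
    devstral = monthly_cost("Devstral Medium", 0, volume)
    gemini = monthly_cost("Gemini 3.1 Flash Lite Preview", 0, volume)
    print(f"{volume:>3}M output tokens: ${devstral:.2f} vs ${gemini:.2f} "
          f"(save ${devstral - gemini:.2f})")
```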
For developers running inference at scale, the lower cost of Gemini 3.1 Flash Lite Preview compounds meaningfully. Combined with its stronger benchmark performance, there's no cost-quality tradeoff to make here — Gemini 3.1 Flash Lite Preview is both cheaper and higher-scoring across most tasks. The cost difference matters most to teams processing tens of millions of tokens monthly; at low volumes, the gap is minor.
Bottom Line
Choose Gemini 3.1 Flash Lite Preview if you need a general-purpose AI for production workloads: content pipelines, customer-facing assistants, multilingual applications, or any use case requiring reliable safety calibration (5/5 in our testing vs 1/5 for Devstral Medium). It's also the right call for agentic and tool-calling workflows, where it scores 4 to Devstral Medium's 3 (Devstral Medium ranks 47th of 54 models on tool calling). Its 1M-token context window also gives it more headroom for long-document processing, although both models scored identically on our long context retrieval test. At lower cost on both input and output, it's the default recommendation for nearly every workload.
Choose Devstral Medium if classification accuracy is your primary requirement: it ties for 1st of 53 models on categorization and routing in our testing, while Gemini 3.1 Flash Lite Preview scores notably lower here (3/5, ranked 31st). It also accepts only text input, which costs you nothing if multimodal capabilities would go unused, and it may be the natural pick if Mistral's ecosystem is already part of your stack. Note that Devstral Medium has no external benchmark scores in our data beyond the 12 internal tests shown: its description positions it as a code generation and agentic reasoning model, but this comparison includes no external coding benchmarks (e.g., SWE-bench) for either model.
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.
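For readers unfamiliar with LLM-judge scoring, the sketch below shows one generic way a 1-5 rubric can be applied and parsed. It is an illustration only, not the modelpicker.net harness: `call_judge_model` is a hypothetical placeholder for whatever judge API a given setup uses, and the rubric text is invented.

```python
# Generic illustration of 1-5 rubric scoring with an LLM judge.
# call_judge_model() is a hypothetical stand-in for a real judge API,
# and the rubric text is invented, not the modelpicker.net rubric.
import re

RUBRIC = """Score the RESPONSE from 1 to 5 against the TASK.
5 = fully correct and well executed, 1 = fails the task.
Reply with a single integer."""

def call_judge_model(prompt: str) -> str:
    # Hypothetical placeholder: swap in a real API call to your judge model.
    raise NotImplementedError

def judge(task: str, response: str) -> int:
    """Ask the judge model for a 1-5 score and parse the first digit it returns."""
    reply = call_judge_model(f"{RUBRIC}\n\nTASK:\n{task}\n\nRESPONSE:\n{response}")
    match = re.search(r"[1-5]", reply)
    if match is None:
        raise ValueError(f"Judge returned no usable score: {reply!r}")
    return int(match.group())
```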