Claude Haiku 4.5 vs Gemini 2.5 Flash for Creative Writing

Winner: Gemini 2.5 Flash. In our testing on the Creative Writing task, Gemini 2.5 Flash scores 4.333 vs Claude Haiku 4.5's 4.000. That margin, combined with Gemini's higher constrained_rewriting (4 vs 3) and safety_calibration (4 vs 2) scores, larger context window (1,048,576 vs 200,000 tokens), and lower token costs (input $0.30 vs $1.00; output $2.50 vs $5.00 per MTok), makes Gemini 2.5 Flash the better practical choice for most creative-writing workflows. Claude Haiku 4.5 remains valuable when you prioritize strategic_analysis, faithfulness, or planning (Claude scores 5 vs Gemini's 3 on strategic_analysis, and 5 vs 4 on both faithfulness and agentic_planning), and it holds the higher overall score across the full suite (4.33 vs 4.17), but Gemini is the clear winner for Creative Writing in our benchmarks.

anthropic

Claude Haiku 4.5

Overall: 4.33/5 (Strong)

Benchmark Scores

Faithfulness: 5/5
Long Context: 5/5
Multilingual: 5/5
Tool Calling: 5/5
Classification: 4/5
Agentic Planning: 5/5
Structured Output: 4/5
Safety Calibration: 2/5
Strategic Analysis: 5/5
Persona Consistency: 5/5
Constrained Rewriting: 3/5
Creative Problem Solving: 4/5

External Benchmarks

SWE-bench Verified: N/A
MATH Level 5: N/A
AIME 2025: N/A

Pricing

Input: $1.00/MTok
Output: $5.00/MTok

Context Window: 200K tokens

modelpicker.net

google

Gemini 2.5 Flash

Overall: 4.17/5 (Strong)

Benchmark Scores

Faithfulness: 4/5
Long Context: 5/5
Multilingual: 5/5
Tool Calling: 5/5
Classification: 3/5
Agentic Planning: 4/5
Structured Output: 4/5
Safety Calibration: 4/5
Strategic Analysis: 3/5
Persona Consistency: 5/5
Constrained Rewriting: 4/5
Creative Problem Solving: 4/5

External Benchmarks

SWE-bench Verified: N/A
MATH Level 5: N/A
AIME 2025: N/A

Pricing

Input: $0.30/MTok
Output: $2.50/MTok

Context Window: 1049K tokens

Task Analysis

No external benchmark is provided for this task, so our decision is based on internal task scores and component tests. Creative Writing draws on three primary capabilities: persona_consistency (maintaining character and voice), creative_problem_solving (non-obvious, feasible ideas), and constrained_rewriting (compression under tight limits). Secondary capabilities that matter are long_context (managing plot across many tokens), safety_calibration (allowing legitimate creative content while refusing harmful requests), faithfulness (staying true to source material), and modality support (using images, files, audio, or video as creative prompts).

In our testing, Gemini 2.5 Flash leads the overall Creative Writing task (4.333 vs 4.000). Component results show Gemini beats Claude on constrained_rewriting (4 vs 3) and safety_calibration (4 vs 2), while the two tie on persona_consistency (5), creative_problem_solving (4), and long_context (5). Claude Haiku 4.5 wins strategic_analysis (5 vs 3), faithfulness (5 vs 4), classification (4 vs 3), and agentic_planning (5 vs 4), which is useful for rigorous plot reasoning and sticking to source material. Even so, Gemini's cheaper pricing and broader modality support make it the stronger all-around choice for creative-writing projects in our tests.
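The reported task scores are consistent with an unweighted mean of the three primary component scores. That averaging rule is our inference from the published numbers, not a documented formula, so treat the following as a minimal sketch; `COMPONENT_SCORES` and `task_score` are hypothetical names.

```python
# Component scores copied from the benchmark cards in this article.
COMPONENT_SCORES = {
    "Claude Haiku 4.5": {
        "persona_consistency": 5,
        "creative_problem_solving": 4,
        "constrained_rewriting": 3,
    },
    "Gemini 2.5 Flash": {
        "persona_consistency": 5,
        "creative_problem_solving": 4,
        "constrained_rewriting": 4,
    },
}


def task_score(components: dict[str, int]) -> float:
    """Unweighted mean of the component scores (assumed aggregation rule)."""
    return round(sum(components.values()) / len(components), 3)


for model, components in COMPONENT_SCORES.items():
    print(f"{model}: {task_score(components):.3f}")
```

Under this assumption the one-point gap on constrained_rewriting accounts for the entire 0.333 difference between the two task scores.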

Practical Examples

  1. Flash fiction under strict character limits: Gemini 2.5 Flash is preferable. Its constrained_rewriting score of 4 vs 3 means it compresses and preserves tone more reliably in tight forms (e.g., 280-character microflash or tweet-length prose).
  2. Safe-but-provocative prompts: Choose Gemini when you need safer outputs without over-filtering. Its safety_calibration score of 4 vs 2 reduces accidental refusals for edgy but legitimate creative prompts.
  3. Large, multimodal inspiration sessions: Gemini supports text, image, file, audio, and video inputs and has a 1,048,576-token context window, making it better for weaving long-form drafts that reference images, audio cues, or research files.
  4. Multi-part novel planning and source-faithful rewrites: Claude Haiku 4.5 excels when you need deep strategic analysis and exact adherence to source material. Claude scores 5 on strategic_analysis and faithfulness vs Gemini's 3 and 4 respectively, so it is strong for careful plot tradeoffs or faithful adaptations.
  5. Cost-sensitive iterative editing: Gemini costs less per MTok (input $0.30, output $2.50 vs Claude's input $1.00, output $5.00), so running many revision passes is materially cheaper on Gemini at list prices.
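To make the cost difference in the last example concrete, here is a rough sketch that prices a batch of revision passes at the per-MTok rates listed above. The workload figures (20 passes, 6,000 input tokens and 2,000 output tokens per pass) are illustrative assumptions, not measurements, and `revision_cost` is a hypothetical helper.

```python
# List prices from this article, in USD per million tokens: (input, output).
PRICES_PER_MTOK = {
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
}


def revision_cost(model: str, input_tokens: int, output_tokens: int, passes: int) -> float:
    """Total USD cost of `passes` revision rounds of the same size."""
    input_price, output_price = PRICES_PER_MTOK[model]
    per_pass = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return per_pass * passes


# Hypothetical workload: 20 passes over a short story draft.
for model in PRICES_PER_MTOK:
    cost = revision_cost(model, input_tokens=6_000, output_tokens=2_000, passes=20)
    print(f"{model}: ${cost:.3f}")
```

For this assumed workload the totals come to roughly $0.32 for Claude Haiku 4.5 and $0.14 for Gemini 2.5 Flash. The exact ratio depends on the input/output mix, since the output price gap (2x) is smaller than the input price gap (about 3.3x).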

Bottom Line

For Creative Writing, choose Gemini 2.5 Flash if you need stronger constrained rewriting, better safety calibration, broader modality inputs, a much larger context window, and lower token costs; it wins our Creative Writing benchmark 4.333 vs 4.000. Choose Claude Haiku 4.5 if your priority is strategic plot reasoning, strict faithfulness to source material, or agentic planning (Claude scores 5 vs Gemini's 3 on strategic_analysis, and 5 vs 4 on both faithfulness and agentic_planning).

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.
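The overall ratings shown on the model cards (4.33 and 4.17) match an unweighted mean of the 12 per-benchmark scores. The sketch below checks that arithmetic; the aggregation rule is inferred from the published numbers rather than taken from the methodology, and `overall_score` is a hypothetical name.

```python
# Per-benchmark scores in card order: faithfulness, long context, multilingual,
# tool calling, classification, agentic planning, structured output,
# safety calibration, strategic analysis, persona consistency,
# constrained rewriting, creative problem solving.
BENCHMARK_SCORES = {
    "Claude Haiku 4.5": [5, 5, 5, 5, 4, 5, 4, 2, 5, 5, 3, 4],
    "Gemini 2.5 Flash": [4, 5, 5, 5, 3, 4, 4, 4, 3, 5, 4, 4],
}


def overall_score(scores: list[int]) -> float:
    """Unweighted mean of all benchmark scores (assumed aggregation rule)."""
    return round(sum(scores) / len(scores), 2)


for model, scores in BENCHMARK_SCORES.items():
    print(f"{model}: {overall_score(scores):.2f}/5 across {len(scores)} benchmarks")
```

This also shows why the task winner and the overall leader differ: Claude's higher overall mean comes from benchmarks outside the three that make up the Creative Writing task.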

Frequently Asked Questions