Claude Haiku 4.5 vs R1 0528 for Creative Writing
Winner: R1 0528. In our testing R1 0528 posts a higher Creative Writing task score (4.333 vs 4.000) and ranks 5th, against Claude Haiku 4.5 at 28th. The decisive advantages are R1's constrained_rewriting (4 vs 3) and safety_calibration (4 vs 2). Claude Haiku 4.5 remains stronger on strategic_analysis (5 vs 4) and offers a larger context window (200,000 tokens) and explicit multimodal support (text+image→text), but these do not outweigh R1's edge on the core Creative Writing subtests in our suite.
anthropic
Claude Haiku 4.5
Benchmark Scores
External Benchmarks
Pricing
Input
$1.00/MTok
Output
$5.00/MTok
modelpicker.net
deepseek
R1 0528
Benchmark Scores
External Benchmarks
Pricing
Input
$0.50/MTok
Output
$2.15/MTok
Task Analysis
Creative Writing demands: creative_problem_solving (novel ideas and plot beats), persona_consistency (voice and character stability), constrained_rewriting (compression to hard limits), long_context (managing long arcs), tone control, and safety calibration (avoiding harmful content while permitting edgy fiction). External benchmarks are not provided for this task, so we base the verdict on our 12-benchmark proxies and the task subtests listed. In our testing R1 0528 scores 4.333 on Creative Writing vs Claude Haiku 4.5 at 4.000. Supporting evidence from subtests: both models tie on creative_problem_solving (4) and persona_consistency (5), but R1's higher constrained_rewriting (4 vs 3) and safety_calibration (4 vs 2) explain its lead. Claude Haiku contributes strengths useful to writers, namely top strategic_analysis (5) and a larger context window (200,000 tokens) with a 64,000-token max output, which help with complex plot tradeoffs and very long drafts, but they did not shift the task score in our suite.
Practical Examples
Where R1 0528 shines (based on our scores):
- Short-story festival with strict word/character limits: R1's constrained_rewriting 4 vs Claude's 3 yields tighter, higher-quality compressed drafts and edits.
- Working near content boundaries (edgy themes that still need safe handling): R1's safety_calibration 4 vs Claude's 2 reduces refusals and produces safer, allowable phrasing that passes moderation checks in our tests.
- Cost-sensitive iterative drafting: R1's input/output pricing is $0.50/$2.15 per MTok vs Claude Haiku 4.5 at $1.00/$5.00 per MTok, so R1 is materially cheaper across many generations.
Where Claude Haiku 4.5 shines (based on our scores and metadata):
- Complex plot planning and tradeoffs: Claude Haiku's strategic_analysis 5 vs R1's 4 produced better nuance in our planning probes.
- Very long-form serial or multimodal projects: Claude Haiku's 200,000-token context window and 64,000-token max output (vs R1's 163,840-token context and unspecified max output), plus text+image→text multimodality, are advantages for long arcs or image-driven fiction.
- Tooling and function integrations: Both tie on tool_calling (5), so developer workflows that need tool selection behave similarly in our tests.
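To make the pricing gap in the cost-sensitive-drafting bullet concrete, here is a minimal sketch of a per-draft cost comparison. The per-MTok prices come from the article; the token counts and revision count are illustrative assumptions, not measurements.

```python
# Per-million-token prices from the article ($/MTok).
PRICES = {
    "claude-haiku-4.5": {"input": 1.00, "output": 5.00},
    "r1-0528":          {"input": 0.50, "output": 2.15},
}

def draft_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one generation at the listed per-MTok rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Assumed workload: a 2,000-token prompt, a 1,000-token draft, 50 revision passes.
for model in PRICES:
    total = 50 * draft_cost(model, 2_000, 1_000)
    print(f"{model}: ${total:.2f}")
```

Under these assumed numbers the R1 run costs less than half the Claude Haiku run, which is why iterative drafting workflows feel the price difference most.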
Bottom Line
For Creative Writing, choose Claude Haiku 4.5 if you need larger context, multimodal image→text capabilities, or superior strategic plot analysis and can accept higher output cost ($5.00 per MTok). Choose R1 0528 if you need tighter constrained rewrites, stronger safety calibration, a top-5 task rank (5 of 52 in our testing), and lower costs ($0.50 input / $2.15 output per MTok).
How We Test
We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.
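As a rough illustration of how 1-5 judge scores might roll up into a task score like the 4.333 quoted above, here is a sketch assuming a simple mean over subtest scores. The subtest selection and the use of an unweighted mean are assumptions; the article does not specify the exact aggregation.

```python
from statistics import mean

def task_score(subtest_scores: dict[str, int]) -> float:
    """Aggregate per-subtest judge scores (each 1-5) into one task score,
    assuming an unweighted mean rounded to three decimals."""
    return round(mean(subtest_scores.values()), 3)

# Hypothetical subset of R1 0528's Creative Writing subtests (scores from
# the article; which subtests enter the average is an assumption).
r1_subset = {
    "creative_problem_solving": 4,
    "persona_consistency": 5,
    "constrained_rewriting": 4,
}
print(task_score(r1_subset))
```

A mean over this three-subtest subset happens to reproduce 4.333, but with other subtests included the weighting would differ, so treat this purely as a shape-of-the-computation sketch.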