deepseek

DeepSeek: DeepSeek V3 0324

DeepSeek: DeepSeek V3 0324 is an earlier release in the DeepSeek V3 lineage, currently in pending-review status. At $0.20 input / $0.77 output per million tokens, it sits at the low end of the pricing spectrum, but our testing placed it at the very bottom of the ranked field (53rd of 53 active models), with some of the lowest individual benchmark scores in the entire suite. It is a text-to-text model; no modalities or supported parameters are documented for this release. For most use cases, newer DeepSeek models deliver substantially better benchmark performance at comparable or lower prices.

Performance

DeepSeek V3 0324 scored at or near the bottom of the field on most benchmarks in our testing. Long context scored 3/5 (rank 55 of 55, sole last place), persona consistency 1/5 (rank 53 of 53, sole last place), constrained rewriting 2/5 (rank 53 of 53, sole last place), structured output 3/5 (rank 53 of 54), and multilingual 3/5 (rank 54 of 55). Safety calibration scored 1/5 (rank 32 of 55), and creative problem solving 2/5 (rank 47 of 54). Even its strongest dimensions, structured output and strategic analysis (rank 36 of 54), topped out at 3/5, at or below the field median. Overall rank: 53rd of 53 active models tested.

Pricing

DeepSeek V3 0324 costs $0.20 per million input tokens and $0.77 per million output tokens. At 1 million output tokens per month, that is $0.77; at 10 million, $7.70. The pricing is very low, but within the DeepSeek lineup, DeepSeek V3.2 at $0.38/MTok output offers an average benchmark score of 4.25 compared with this model's 2.50, and DeepSeek V3.1 at $0.75/MTok output averaged 3.92 in our tests. The per-token savings do not compensate for the significant quality gap.
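The arithmetic above can be sketched as a small helper using the listed prices (a minimal sketch; the function name is ours, not part of any API):

```python
# Listed prices for DeepSeek V3 0324, in dollars per million tokens.
INPUT_PER_MTOK = 0.20
OUTPUT_PER_MTOK = 0.77

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the combined input + output cost in dollars."""
    return (input_tokens * INPUT_PER_MTOK
            + output_tokens * OUTPUT_PER_MTOK) / 1_000_000

# The figures quoted above: 1M and 10M output tokens, input cost excluded.
print(round(monthly_cost(0, 1_000_000), 2))   # 0.77
print(round(monthly_cost(0, 10_000_000), 2))  # 7.7
```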


Overall
2.50/5 (Usable)

Benchmark Scores

Faithfulness
3/5
Long Context
3/5
Multilingual
3/5
Tool Calling
3/5
Classification
3/5
Agentic Planning
3/5
Structured Output
3/5
Safety Calibration
1/5
Strategic Analysis
3/5
Persona Consistency
1/5
Constrained Rewriting
2/5
Creative Problem Solving
2/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
N/A
AIME 2025
N/A

Pricing

Input

$0.200/MTok

Output

$0.770/MTok

Context Window: 164K


Real-World Costs

Chat response: <$0.001
Blog post: $0.0016
Document batch: $0.043
Pipeline run: $0.425
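As a rough sketch of how such per-task estimates are derived: the token counts below are illustrative assumptions of ours, not figures documented in the table.

```python
# Dollars per token, from the pricing above.
PRICES = {"input": 0.20 / 1e6, "output": 0.77 / 1e6}

def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one task from its token counts."""
    return input_tokens * PRICES["input"] + output_tokens * PRICES["output"]

# Assumed sizes for a blog-post draft: ~500 prompt tokens, ~2,000 output tokens.
print(f"{task_cost(500, 2_000):.4f}")  # 0.0016
```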

Pricing vs Performance

Output cost per million tokens (log scale) vs average score across our 12 internal benchmarks


Try It

from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible endpoint; substitute your own key.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-chat-v3-0324",
    messages=[
        {"role": "user", "content": "Hello, DeepSeek V3 0324!"}
    ],
)

print(response.choices[0].message.content)

Recommendation

DeepSeek V3 0324 is not recommended for production use in its current state. It is pending review, has no documented supported parameters, and scored at the very bottom of our benchmark suite on long context, persona consistency, structured output, constrained rewriting, and multilingual quality. Teams looking for low-cost deepseek performance should instead evaluate DeepSeek V3.2 ($0.38/MTok output, avg 4.25 across our 12-test suite) or DeepSeek V3.1 ($0.75/MTok output, avg 3.92). Both offer dramatically better quality at comparable prices.

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.
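As a sanity check, averaging the 12 scorecard values listed above reproduces the 2.50/5 overall score:

```python
# The 12 internal benchmark scores from the scorecard above.
scores = {
    "Faithfulness": 3, "Long Context": 3, "Multilingual": 3,
    "Tool Calling": 3, "Classification": 3, "Agentic Planning": 3,
    "Structured Output": 3, "Safety Calibration": 1,
    "Strategic Analysis": 3, "Persona Consistency": 1,
    "Constrained Rewriting": 2, "Creative Problem Solving": 2,
}

average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 2.50
```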

Frequently Asked Questions