deepseek

DeepSeek V3.2 Speciale

DeepSeek V3.2 Speciale is a high-compute reasoning variant of the DeepSeek V3.2 architecture, designed for maximum agentic performance. It is currently archived (retired). In our testing via OpenRouter as of April 2026, the model produced empty responses on most benchmarks, a flagged and known routing issue. As a result, we could score it on only 5 of 12 benchmark dimensions, and it ranks 53rd overall, behind all 52 active models tested — the bottom of the field. At $0.40 input / $1.20 output per million tokens, it occupies a mid-tier price point but delivers bottom-tier reliability in current testing. It does not support tool calling and uses reasoning tokens, requiring a compatible client configuration.

Performance

We were able to test DeepSeek V3.2 Speciale on only 5 of 12 benchmarks due to empty-response issues via OpenRouter. Among the scores we obtained, long context (5/5, tied for 1st with 36 other models of the 55 tested) was the sole strength. Faithfulness scored 1/5 (rank 55 of 55, alone in last place), multilingual 2/5 (rank 55 of 55, alone in last place), tool calling 1/5 (rank 53 of 54), and safety calibration 1/5 (rank 32 of 55). The remaining 7 benchmark dimensions returned no scoreable output. Overall, the model ranks 53rd, behind every active model tested. It also uses reasoning tokens and does not support tool calling, which further limits its applicability for agentic workflows.

Pricing

DeepSeek V3.2 Speciale is priced at $0.40 per million input tokens and $1.20 per million output tokens. At 1 million output tokens/month, you pay $1.20; at 10 million output tokens, $12.00. While the price is competitive compared to flagship models, the benchmark reliability issues observed in our testing make it difficult to recommend at any price. Other models in this price range — such as DeepSeek V3.1 at $0.75/MTok output (avg 3.92) — delivered complete, scoreable results across the full benchmark suite.
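At these rates, monthly spend is easy to estimate. A minimal sketch using the listed prices; the token volumes in the usage note are illustrative, not measurements from our testing:

```python
# Cost estimator at DeepSeek V3.2 Speciale's listed OpenRouter rates.
INPUT_PER_MTOK = 0.40   # USD per million input tokens
OUTPUT_PER_MTOK = 1.20  # USD per million output tokens

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost for a month's token volume at these rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PER_MTOK
```

For example, `monthly_cost(0, 1_000_000)` gives the $1.20 figure above, and `monthly_cost(5_000_000, 10_000_000)` estimates a heavier month at $14.00.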

DeepSeek V3.2 Speciale

Overall
2.00/5 (Weak)

Benchmark Scores

Faithfulness
1/5
Long Context
5/5
Multilingual
2/5
Tool Calling
1/5
Safety Calibration
1/5

External Benchmarks

SWE-bench Verified
N/A
MATH Level 5
N/A
AIME 2025
N/A

Pricing

Input

$0.40/MTok

Output

$1.20/MTok

Context Window: 164K


Real-World Costs

Chat response: <$0.001
Blog post: $0.0026
Document batch: $0.068
Pipeline run: $0.680

Pricing vs Performance

Output cost per million tokens (log scale) vs average score across our 12 internal benchmarks


Try It

from openai import OpenAI

# DeepSeek V3.2 Speciale is served through OpenRouter's
# OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-v3.2-speciale",
    messages=[
        {"role": "user", "content": "Hello, DeepSeek V3.2 Speciale!"}
    ],
)

# The model emits reasoning tokens, and in our testing it often returned
# empty content via OpenRouter — check the reply before relying on it.
print(response.choices[0].message.content)
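Given the empty-response behavior we observed, it is worth wrapping calls defensively and falling back to another model when a reply comes back blank. A hedged sketch; `create_fn` mirrors the signature of `client.chat.completions.create`, and any fallback model ID you pass is your own choice, not something this page prescribes:

```python
# Retry across models when a completion comes back empty.
def complete_with_fallback(create_fn, models, messages):
    """Try each model in order and return (model_id, content) for the
    first non-empty reply; raise if every model returns blank output."""
    for model in models:
        resp = create_fn(model=model, messages=messages)
        content = resp.choices[0].message.content
        if content and content.strip():
            return model, content
    raise RuntimeError("all models returned empty responses")
```

With the client above, this would be called as `complete_with_fallback(client.chat.completions.create, ["deepseek/deepseek-v3.2-speciale", "<your-fallback-model>"], messages)`.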

Recommendation

DeepSeek V3.2 Speciale is not recommended for production use in its current state. The model is archived, suffered critical reliability problems producing output via OpenRouter in our testing, and scored at the very bottom of the field on the faithfulness and multilingual benchmarks. Developers who need high-compute performance should evaluate R1 0528 (avg 4.5, $2.15/MTok output) instead. If budget is the primary concern, DeepSeek V3.1 (avg 3.92, $0.75/MTok output) covers the same price tier with complete benchmark coverage.

How We Test

We test every model against our 12-benchmark suite covering tool calling, agentic planning, creative problem solving, safety calibration, and more. Each test is scored 1–5 by an LLM judge. Read our full methodology.

Frequently Asked Questions