Qwen: Qwen3.6 Max Preview
Qwen's flagship model. Context window: 262K tokens.
What you need to know
Qwen3.6 Max Preview is currently the highest-performing model in our dataset, ranking first among 71 evaluated models. Its primary differentiator is a near-perfect internal score of 4.85/5.0, driven by maximum scores in complex logic tasks including agentic planning, strategic analysis, and tool calling. It demonstrates exceptional reliability in maintaining persona consistency and faithfulness across a wide range of prompts.
The model handles large-scale data efficiently with a 262K context window and a perfect 5/5 score for long-context processing. While it excels in structured output and tabular data, it shows a slight relative dip in classification and constrained rewriting, though these remain strong at 4/5.
At a blended cost of $4.94/MTok, this model sits in a premium price tier. However, the cost is justified by its versatility; it performs at a top-tier level across almost every technical category, from multilingual support to creative problem solving, reducing the need to chain multiple specialized models.
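As a rough illustration of what that rate means in practice, the sketch below estimates spend from a token count. It assumes the blended $4.94/MTok figure applies uniformly to every token (the input/output split behind the blend is not given here), and it takes "262K" as 262,000 tokens. The function and constant names are hypothetical, chosen only for this example.

```python
# Blended rate quoted in the review, in USD per million tokens.
BLENDED_USD_PER_MTOK = 4.94


def estimate_cost_usd(total_tokens: int,
                      usd_per_mtok: float = BLENDED_USD_PER_MTOK) -> float:
    """Approximate cost, assuming the blended rate applies to every token."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * usd_per_mtok


# One request that fills the context window (assumed to be 262,000 tokens).
print(f"${estimate_cost_usd(262_000):.2f}")  # prints "$1.29"
```

Under these assumptions, a single full-context request costs a little over a dollar, so workloads that send thousands of long documents are where the premium tier becomes noticeable.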
Use this model if you are building complex autonomous agents, need high-precision structured output, or process very long documents. Skip it if your workload consists primarily of simple classification tasks, where a cheaper, smaller model would suffice.