Open source vs open weight
First, the terminology. Most “open source” LLMs are actually open-weight: you can download and run the model weights, but the training data and process aren't fully open. True open-source models (weights + data + training code) are rare.
The license spectrum matters for commercial use:
- Apache 2.0 / MIT — Fully permissive. Use commercially with only minimal obligations (attribution and license notices). Examples: Mistral's smaller models, some Qwen variants.
- Llama License — Free for commercial use as long as you stay under 700M monthly active users; above that, Meta requires a separate license. Covers Meta's Llama family.
- DeepSeek License — Permissive with some restrictions. Check the specific model version.
- Research-only — Some model variants are restricted to non-commercial research. Always check before deploying.
Why open weight matters
- No vendor lock-in. If your API provider raises prices or shuts down, you can move the same model to another host — or run it yourself.
- Fine-tuning. You can train the model further on your data to specialize it for your domain.
- Data privacy. Self-hosting means your data never leaves your infrastructure.
- Cost at scale. At high volume, self-hosting amortizes to significantly less than API costs.
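To make the cost-at-scale point concrete, here is a back-of-the-envelope break-even sketch in Python. Every number in it (API price, GPU rental cost) is an illustrative assumption, not a quote from any provider:

```python
# Back-of-the-envelope break-even: flat self-hosted GPU bill vs.
# per-token API pricing. All figures are illustrative assumptions.

API_PRICE_PER_M_OUT = 2.00   # $ per 1M output tokens (assumed)
GPU_MONTHLY_COST = 1500.00   # $ per month for a rented GPU node (assumed)

def api_cost(tokens_per_month: int) -> float:
    """Monthly API bill for a given output-token volume."""
    return tokens_per_month / 1_000_000 * API_PRICE_PER_M_OUT

def break_even_tokens() -> float:
    """Monthly output tokens at which self-hosting matches the API bill."""
    return GPU_MONTHLY_COST / API_PRICE_PER_M_OUT * 1_000_000

print(f"Break-even at {break_even_tokens():,.0f} output tokens/month")
```

Under these assumed numbers, self-hosting wins past 750 million output tokens a month; your own break-even shifts with GPU utilization, batching, and the model's actual price.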
Top open-weight models
| # | Model | Provider | Avg Score | Output $/1M tokens | Context |
|---|---|---|---|---|---|
| 01 | Qwen: Qwen3.6 Plus | Qwen | 4.54 | $1.95 | 1M |
| 02 | R1 0528 | DeepSeek | 4.46 | $2.15 | 164K |
| 03 | DeepSeek V3.2 | DeepSeek | 4.31 | $0.378 | 131K |
| 04 | DeepSeek V4 Flash | DeepSeek | 4.23 | $0.280 | 1M |
| 05 | Mistral Medium 3.1 | Mistral | 4.23 | $2.00 | 131K |
| 06 | DeepSeek V4 Pro | DeepSeek | 4.15 | $0.870 | 1M |
| 07 | Qwen: Qwen3 235B A22B Instruct 2507 | Qwen | 4.08 | $0.100 | 262K |
| 08 | R1 | DeepSeek | 4.00 | $2.50 | 64K |
| 09 | DeepSeek V3.1 | DeepSeek | 4.00 | $0.750 | 33K |
| 10 | Qwen: Qwen3.5-9B | Qwen | 4.00 | $0.150 | 262K |
Best for specific tasks
Coding: Codestral 2508 (Mistral) leads with a coding composite of 5.00/5.0.
Reasoning: DeepSeek V4 Flash (DeepSeek) leads with a reasoning composite of 5.00/5.0.
General purpose: Qwen: Qwen3.6 Plus (Qwen) has the highest overall score at 4.54/5.0.
How close to proprietary?
The best open-weight model (Qwen: Qwen3.6 Plus, 4.54/5.0) trails the best proprietary model (Claude Sonnet 4.6, 4.69/5.0) by just 0.15 points. That gap has narrowed significantly, and for many use cases open-weight models are now competitive with proprietary ones.
Hosted open-weight options
You don't need a GPU to use open-weight models. Several inference providers host them, some with generous free tiers:
- Groq — Extremely fast inference on custom LPU hardware. Free tier with rate limits. Best for latency-sensitive applications.
- Together AI — Wide model selection, competitive pricing. Good for production workloads.
- Fireworks AI — Optimized serving with function calling support. Strong developer experience.
- OpenRouter — Unified API that routes to multiple providers. Useful for fallback strategies.
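Most of these providers expose an OpenAI-compatible chat-completions endpoint, so switching hosts is often just a base-URL change. A minimal stdlib-only sketch (the OpenRouter URL and model slug below are examples; check each provider's catalog for current names):

```python
import json
import urllib.request

# Sketch of calling an open-weight model through an OpenAI-compatible
# chat-completions endpoint. The base URL and model slug are examples.

def build_request(base_url: str, model: str, prompt: str,
                  api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions POST request."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Example (requires a real key; the model slug is illustrative):
# req = build_request("https://openrouter.ai/api/v1",
#                     "deepseek/deepseek-chat", "Hello!", "sk-...")
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the payload shape is the same across providers, a fallback strategy can be as simple as rebuilding the request with a different base URL when one host errors out.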
Running locally
For local deployment with Ollama, LM Studio, or llama.cpp, see our dedicated Best Local LLMs for Coding guide — it covers hardware requirements, quantization, and tooling in detail.
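For a quick taste before diving into that guide: a locally running Ollama server listens on port 11434 and answers plain HTTP. A minimal sketch, assuming Ollama is installed and a model (here `llama3`; adjust to whatever you've pulled) is available:

```python
import json
import urllib.request

# Query a local Ollama server (default port 11434). Assumes a model has
# been pulled first, e.g. with `ollama pull llama3`.

def ask_local(prompt: str, model: str = "llama3") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a token stream
    }).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# print(ask_local("Explain quantization in one sentence."))
```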