OpenAI · active · free tier available

OpenAI: gpt-oss-20b

OpenAI's efficiency model. Context window: 131K tokens.

Overall score: 3.62 / 5.00 · ranked #116
Input: $0.030 per 1M tokens
Output: $0.130 per 1M tokens
Context: 131K tokens
Blended: $0.105 per 1M tokens (3:1 out:in ratio)

modelpicker.ai · powered by live benchmark data

Scores by test

Structured Output: 5.0
Strategic Analysis: 4.0
Constrained Rewriting: 3.0
Creative Problem Solving: 3.0
Tool Calling: 4.0
Faithfulness: 4.0
Classification: 3.0
Long Context: 5.0
Safety Calibration: 2.0
Persona Consistency: 4.0
Agentic Planning: 3.0
Multilingual: 4.0
Tabular Data: 3.0
SciCode: 34.4%
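The overall score of 3.62 is consistent with an unweighted mean of the 13 five-point test scores listed above, with SciCode left out because it is reported as a percentage. This is an inference from the numbers, not a documented methodology. A quick check, as a sketch:

```python
from statistics import mean

# Per-test scores copied from the listing (5-point scale).
# SciCode is excluded because it is reported on a different scale.
scores = {
    "Structured Output": 5.0,
    "Strategic Analysis": 4.0,
    "Constrained Rewriting": 3.0,
    "Creative Problem Solving": 3.0,
    "Tool Calling": 4.0,
    "Faithfulness": 4.0,
    "Classification": 3.0,
    "Long Context": 5.0,
    "Safety Calibration": 2.0,
    "Persona Consistency": 4.0,
    "Agentic Planning": 3.0,
    "Multilingual": 4.0,
    "Tabular Data": 3.0,
}

# Assumption: equal weight per test. The site may weight tests differently.
overall = round(mean(list(scores.values())), 2)
print(overall)  # 3.62
```

Because every test counts equally under this assumption, the single 2.0 in Safety Calibration pulls the average down by about as much as one of the 5.0 scores lifts it.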

What you need to know

The primary value of gpt-oss-20b is its reliability in structured data generation and long-context processing. It scores a perfect 5/5 in both Structured Output and Long Context, which suggests it is well suited to extracting data from large documents into precise formats. Combined with a 131K-token context window, that makes it a specialized tool for high-volume data parsing.

Despite these strengths, the model ranks low overall (roughly #115 of 130). Its weakest results are a 2/5 in Safety Calibration and 34.4% on SciCode, which indicate poor guardrail calibration and difficulty with scientific coding tasks. It scores a middling 3/5 in both Classification and Constrained Rewriting, so it is a poor fit for nuanced editorial tasks or high-precision labeling.

At a blended cost of $0.105/MTok, the model is priced as a budget-tier option. The trade-off for the low cost is uneven capability and weak safety calibration. You are paying for a narrow set of strengths, specifically structured output and long-context handling, rather than a balanced assistant.
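The blended figure can be reproduced from the listed input and output prices as a token-weighted average at the stated 3:1 out:in ratio. The sketch below does that and also estimates the cost of a single job. The function names and the example token counts are illustrative assumptions, not part of the listing.

```python
# Prices copied from the listing, in USD per 1M tokens.
INPUT_PRICE = 0.030
OUTPUT_PRICE = 0.130


def blended_price(input_price: float, output_price: float, out_to_in: float = 3.0) -> float:
    """Token-weighted average price per 1M tokens at the given output:input ratio."""
    return (out_to_in * output_price + input_price) / (out_to_in + 1.0)


def job_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one job, with input and output tokens billed separately."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


blended = round(blended_price(INPUT_PRICE, OUTPUT_PRICE), 3)
print(blended)  # 0.105, matching the listing

# Hypothetical extraction job: a 100K-token document in, a 2K-token JSON record out.
cost = job_cost(100_000, 2_000)
effective_per_mtok = cost / (100_000 + 2_000) * 1_000_000
print(f"{cost:.5f} USD per job, {effective_per_mtok:.4f} USD per 1M tokens")
```

Extraction workloads are usually input-heavy, so the effective rate for a job like this (about $0.032 per 1M tokens) sits much closer to the input price than to the 3:1 blended figure.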

Use this model if you need a low-cost engine for transforming large volumes of unstructured text into JSON or other structured formats. Skip it if your application requires creative problem solving, high safety standards, or precise classification.
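Even with a top Structured Output score, it is prudent to validate each response before it enters a pipeline. The following is a minimal sketch using only the Python standard library. The `REQUIRED_FIELDS` schema and the sample string are hypothetical, and the sample stands in for a raw model response; no particular API client is assumed.

```python
import json

# Hypothetical schema for illustration: field name -> expected Python type.
REQUIRED_FIELDS = {"invoice_id": str, "total": float, "currency": str}


def parse_extraction(raw: str) -> dict:
    """Parse a model response as JSON and check required fields and their types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # JSON has a single number type, so accept ints where floats are expected.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"field {field!r} should be {expected.__name__}")
        data[field] = value
    return data


# Stand-in for a raw model response.
sample = '{"invoice_id": "INV-001", "total": 120, "currency": "USD"}'
result = parse_extraction(sample)
print(result["total"])  # 120.0
```

Raising `ValueError` on any mismatch lets the caller decide whether to retry the request, route the document to manual review, or drop it.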

Strengths: Top 3

Structured Output: 5.0/5.0
Long Context: 5.0/5.0
Strategic Analysis: 4.0/5.0

Relative weaknesses: Bottom 3

Safety Calibration: 2.0/5.0
Constrained Rewriting: 3.0/5.0
Creative Problem Solving: 3.0/5.0