
AI token calculator — estimate your LLM costs

Tokens are the currency of LLM APIs. Here's what they are, how to count them, and an interactive calculator to estimate your costs across any model.

What are tokens?

Tokens are subword units — chunks of text that the model processes internally. They're not characters, not words, but something in between. A token is typically 3–4 characters in English, though this varies by language and content type.

Why tokens instead of words? LLMs use tokenizers (like BPE — Byte Pair Encoding) that break text into statistically optimal chunks. Common words like “the” become single tokens. Rare words get split into multiple tokens. “Unbelievable” might become [“Un”, “believ”, “able”].
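To make the idea concrete, here is a toy version of the BPE training loop in Python: repeatedly find the most frequent adjacent symbol pair and merge it into one symbol. This is an illustrative sketch only, not any provider's actual tokenizer.

```python
from collections import Counter

def bpe_train(words, num_merges):
    """Toy BPE: learn `num_merges` merge rules from a list of words."""
    # Represent each word as a tuple of single-character symbols.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the merge everywhere it occurs.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges, vocab

merges, vocab = bpe_train(["the", "the", "the", "then", "them"], num_merges=2)
```

After two merges on this tiny corpus, the frequent word “the” collapses into a single symbol, while the rarer “then” and “them” remain split — exactly the behavior described above.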

Rules of thumb

Content type                     Approx. tokens   Approx. characters
A short email                    ~200             ~800
A page of text                   ~500             ~2,000
A blog post (1,000 words)        ~1,300           ~5,000
A code file (200 lines)          ~500–800         ~3,000
A technical document (10 pages)  ~5,000           ~20,000
A novel (80,000 words)           ~100,000         ~400,000

Quick math: divide your character count by 4 for a rough token estimate. For code, divide by 3.5 (code uses more rare tokens). For non-Latin scripts (Chinese, Japanese, Korean), divide by 2: each character typically maps to one or more tokens, so far fewer characters fit in each token.
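These rules of thumb fit in a one-line estimator. The divisors below are the approximations from the text, not exact tokenizer behavior:

```python
def estimate_tokens(text: str, kind: str = "english") -> int:
    """Rough token estimate from character count using rule-of-thumb divisors."""
    divisors = {"english": 4.0, "code": 3.5, "cjk": 2.0}
    return round(len(text) / divisors[kind])
```

For example, a ~2,000-character page of English text estimates to ~500 tokens, matching the table above. Use a real tokenizer when you need exact counts for billing or context-window limits.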

Why tokenizer choice matters

Different models use different tokenizers. The same text produces different token counts across providers. Anthropic, OpenAI, and Google all use their own tokenizers. This means:

  • A prompt that costs $0.01 on one model might cost $0.012 on another — even at the same $/MTok price.
  • Context window limits (e.g., 128K tokens) hold different amounts of text per model.
  • For precise counting, use each provider's tokenizer tool. For estimates, the 4-chars-per-token rule works across all of them.

Input vs output tokens

Every API call has two token counts: input tokens (your prompt, system message, and any context) and output tokens (the model's response). Output tokens cost 2–6x more because they require more compute — the model generates them one at a time.
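The cost of a single call is just the two token counts weighted by their per-million-token prices. A minimal sketch, using hypothetical prices of $3/MTok in and $15/MTok out:

```python
def call_cost(in_tokens: int, out_tokens: int,
              in_price_per_mtok: float, out_price_per_mtok: float) -> float:
    """Dollar cost of one API call, given prices in $ per million tokens."""
    return (in_tokens / 1e6) * in_price_per_mtok + (out_tokens / 1e6) * out_price_per_mtok

# 1,000 input tokens + 500 output tokens at $3 / $15 per MTok:
cost = call_cost(1_000, 500, 3.00, 15.00)  # $0.003 + $0.0075 = $0.0105
```

Note how the output side dominates even though it is the smaller token count — the 2–6x output multiplier is where most bills come from.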

What 1M tokens costs across models

Live data · cost of 1M tokens (blended 1:3 in/out ratio)

Model                                  $/MTok in   $/MTok out   Blended 1M   Quality
Qwen: Qwen3 235B A22B Instruct 2507    $0.071      $0.100       $0.09        4.08/5.0
Qwen: Qwen3.5-35B-A3B                  $0.163      $1.30        $1.02        3.92/5.0
GPT-5.5                                $5.00       $30.00       $23.75       4.46/5.0
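The blended column is a weighted average at the stated 1:3 input:output ratio. A quick sketch of that arithmetic, checked against the table's GPT-5.5 row:

```python
def blended_price(in_price: float, out_price: float,
                  in_ratio: int = 1, out_ratio: int = 3) -> float:
    """Blended $/MTok assuming a fixed input:output token mix (1:3 by default)."""
    total = in_ratio + out_ratio
    return (in_ratio * in_price + out_ratio * out_price) / total

# $5.00 in / $30.00 out at 1:3 → (5 + 3 × 30) / 4 = $23.75 per blended MTok
blended = blended_price(5.00, 30.00)
```

Adjust the ratio to your own workload: a summarization pipeline (long in, short out) blends far cheaper than a generation-heavy one.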

Calculate your costs

Adjust the sliders to match your expected usage:

Cost calculator

Estimate your monthly cost. Defaults: 500K input tokens and 150K output tokens per day.

Model                  Quality   $ / day   $ / month
OpenAI: gpt-oss-20b    3.54      $0.04     $1.08
Ministral 3 3B 2512    3.31      $0.07     $1.95
Qwen: Qwen3.5-9B       4.00      $0.07     $2.17
Ministral 3 8B 2512    3.38      $0.10     $2.93
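The calculator's daily and monthly figures follow from the same per-token arithmetic, extrapolated over a billing period. A sketch with hypothetical prices ($0.05/MTok in, $0.20/MTok out) and a 30-day month:

```python
def usage_cost(in_tok_per_day: float, out_tok_per_day: float,
               in_price: float, out_price: float, days: int = 30):
    """Return (daily, monthly) dollar cost for a steady daily token volume."""
    daily = (in_tok_per_day / 1e6) * in_price + (out_tok_per_day / 1e6) * out_price
    return daily, daily * days

# 500K input + 150K output tokens/day at $0.05 / $0.20 per MTok:
daily, monthly = usage_cost(500_000, 150_000, 0.05, 0.20)
# daily = $0.025 + $0.030 = $0.055 → monthly ≈ $1.65
```

At these small-model prices, even half a million tokens a day stays under a couple of dollars a month — which is why the cheap models in the table above land in the $1–3/month range.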

Reducing token usage

  • Trim your prompts. Remove unnecessary context, examples, and formatting. Every token in your system prompt is charged on every request.
  • Set max_tokens. Limit response length when you know a short answer is sufficient. Don't let the model ramble.
  • Use prompt caching. Anthropic and OpenAI both offer caching for repeated prompt prefixes, reducing input token costs.
  • Summarize before stuffing. Instead of pasting a 10,000-token document into context, summarize it first with a cheap model.
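One concrete way to apply the first tip is to cap conversation history at a token budget, dropping the oldest messages first. A minimal sketch using the ~4 chars/token rule from earlier (swap in a real tokenizer for billing-accurate counts):

```python
def trim_history(messages: list[str], budget_tokens: int) -> list[str]:
    """Keep the most recent messages whose estimated token total fits the budget.

    Token counts are estimated as len(text) // 4 + 1 — a rough heuristic,
    not a real tokenizer.
    """
    kept, used = [], 0
    for msg in reversed(messages):  # newest first
        est = len(msg) // 4 + 1
        if used + est > budget_tokens:
            break  # oldest remaining messages are dropped
        kept.append(msg)
        used += est
    return list(reversed(kept))  # restore chronological order
```

With a 150-token budget and three ~100-token messages, only the newest survives; every message trimmed here is a token you stop paying for on every subsequent request.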

For a deeper dive into pricing strategies, see our LLM API Pricing Explained guide. For full pricing data across all models, see the pricing comparison page.