How Tokens Are Priced

In AI, “how much does this conversation cost” is measured in tokens, not minutes. This page lays out the mechanics in plain language.


What Is a Token?

A token is the smallest unit the model processes. Not characters, not words — the model segments text into its own pieces.

Rough conversion:

Language   1 Token ≈
────────────────────────────────
English    ~0.75 word (e.g., “apple” = 1 token, “refrigerator” = 2 tokens)
Chinese    ~1 character = 1.5–2 tokens (e.g., “你好” = 3–4 tokens)
Code       0.25–0.5 line (varies by language)

Example

In English:

“Hello, the weather is nice today, let’s go to the park!”

That’s 11 words, which the model typically splits into about 13 tokens.

In Chinese:

“你好,今天天氣真好,我們去公園散步吧!”

That’s 16 Chinese characters (19 counting punctuation), which the model splits into about 28 tokens.

For the same content, Chinese usually takes more tokens than English, so chatting in Chinese often costs 30–50% more than chatting in English.
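The rough conversion above can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: the function name `estimate_tokens` is hypothetical, it only applies the ratios from the table (1 token ≈ 0.75 English word, 1 Chinese character ≈ 1.5–2 tokens), and it ignores punctuation. Actual counts depend on each model's tokenizer.

```python
import re

def estimate_tokens(text: str) -> tuple[float, float]:
    """Return a rough (low, high) token estimate from the table's ratios.

    Assumption: punctuation and whitespace are ignored; this is a
    back-of-the-envelope heuristic, not a tokenizer.
    """
    # Count CJK ideographs (basic block only) and English-like words.
    cjk_chars = len(re.findall(r"[\u4e00-\u9fff]", text))
    words = len(re.findall(r"[A-Za-z0-9']+", text))

    english_tokens = words / 0.75      # 1 token ≈ 0.75 word
    low = english_tokens + cjk_chars * 1.5
    high = english_tokens + cjk_chars * 2.0
    return low, high

print(estimate_tokens("你好"))  # (3.0, 4.0)
print(estimate_tokens("你好,今天天氣真好,我們去公園散步吧!"))  # (24.0, 32.0)
```

The second range brackets the "about 28 tokens" quoted for the Chinese example. For the English example the same heuristic lands a little above the quoted figure of about 13, which shows how approximate these ratios are.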


Cost per Exchange

Formula:

Exchange cost =
  Input tokens × input price
+ Output tokens × output price
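
As a minimal sketch, the formula can be written as a small function. The name `exchange_cost` and its parameters are illustrative, and prices are assumed to be quoted in USD per 1M tokens, as in the pricing tables on this page.

```python
def exchange_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one exchange, with prices given per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: 480 input tokens and 250 output tokens at $2.50 / $10.00 per 1M.
cost = exchange_cost(480, 250, 2.50, 10.00)
print(f"${cost:.4f}")  # $0.0037
```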

Key concept: input includes the conversation history.

Message 1 from you: 100 tokens
AI reply 1: 300 tokens
────────────────────────────────
Message 2 from you: 80 tokens + prior 400 = 480 tokens input
AI reply 2: 250 tokens
────────────────────────────────
Message 3 from you: 60 tokens + prior 730 = 790 tokens input
AI reply 3: 200 tokens

The longer the chat, the more each request costs, because every turn carries history. This is why context compression exists.
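
The accumulation in the example above can be reproduced in a few lines. A minimal sketch, assuming the full history is resent on every turn with no system prompt and no compression; the function name is hypothetical.

```python
def input_tokens_per_turn(turns: list[tuple[int, int]]) -> list[int]:
    """Given (user_tokens, reply_tokens) per turn, return the input
    tokens billed on each turn when all prior history is resent."""
    history = 0
    billed = []
    for user_tokens, reply_tokens in turns:
        billed.append(history + user_tokens)     # new message + everything before it
        history += user_tokens + reply_tokens    # both sides join the history
    return billed

turns = [(100, 300), (80, 250), (60, 200)]
print(input_tokens_per_turn(turns))  # [100, 480, 790]
```

The new messages shrink from 100 to 60 tokens, yet the billed input grows from 100 to 790, because the history dominates.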


Model Pricing (2026-04)

OpenAI

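Model         Input / 1M tokens   Output / 1M tokens   Best For
────────────────────────────────────────────────────────────────────────────────
GPT-4o        $2.50               $10.00               Complex reasoning, creative work
GPT-4o-mini   $0.15               $0.60                Daily chat, support
o1            $15.00              $60.00               Deep reasoning (slow + expensive)
o1-mini       $3.00               $12.00               Mid-tier reasoning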

Anthropic

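Model               Input / 1M tokens   Output / 1M tokens   Best For
────────────────────────────────────────────────────────────────────────────────
Claude 3.5 Sonnet   $3.00               $15.00               Code, long-context
Claude 3.5 Haiku    $0.80               $4.00                Fast responses
Claude 3 Opus       $15.00              $75.00               Flagship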

Google

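Model              Input / 1M tokens   Output / 1M tokens   Best For
────────────────────────────────────────────────────────────────────────────────
Gemini 1.5 Pro     $1.25               $5.00                Long context (2M)
Gemini 1.5 Flash   $0.075              $0.30                Cheap, high volume
Gemini 2.0 Flash   $0.10               $0.40                Newer version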

Others

Model                   Note
────────────────────────────────────────────────────────────────────────────────
Groq (Llama, Mixtral)   Fast and cheap
DeepSeek                Very low price, strong on Chinese
Azure OpenAI            Same models as OpenAI, slightly different pricing

Prices change often — providers are in a price war. The table above reflects 2026-04; confirm current pricing on each vendor’s site.


Example: One Day of Support Conversations

Ada handles 50 customer questions per day. Each conversation averages 5 turns, with ~100 tokens per message on each side (customer and AI).

Total tokens:

  • Message portion: 50 × 5 × 2 × 100 = 50,000 tokens (raw messages)
  • History accumulation: each turn carries prior turns, roughly 150,000 tokens input + 50,000 tokens output
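
The history accumulation can also be counted directly. A minimal sketch, assuming every turn resends the full prior history, every message is exactly 100 tokens, and there is no system prompt (simplifications for illustration; the function name is hypothetical). Under these strict assumptions the count comes out below the figures quoted above, which the text labels as rough.

```python
def conversation_totals(turns: int, tokens_per_message: int) -> tuple[int, int]:
    """Return (input_tokens, output_tokens) billed for one conversation
    when each turn resends all earlier messages."""
    input_total = 0
    output_total = 0
    history = 0
    for _ in range(turns):
        input_total += history + tokens_per_message  # history + new customer message
        output_total += tokens_per_message           # AI reply
        history += 2 * tokens_per_message            # both messages join the history
    return input_total, output_total

per_conv_in, per_conv_out = conversation_totals(turns=5, tokens_per_message=100)
print(50 * per_conv_in, 50 * per_conv_out)  # 125000 25000
```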

By model:

Model           Cost / Day (USD)   Cost / Month (USD, 30 days)
────────────────────────────────────────────────────────────────
GPT-4o-mini     $0.05              $1.50
GPT-4o          $0.88              $26.40
Claude Sonnet   $1.20              $36.00
Gemini Flash    $0.03              $0.90

Takeaway: for the same workload, the most expensive model here costs about 40× as much as the cheapest.
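
The daily figures can be reproduced from the page's rough totals (150,000 input tokens and 50,000 output tokens per day) and the listed prices. A minimal sketch: it assumes "Claude Sonnet" means Claude 3.5 Sonnet and "Gemini Flash" means Gemini 1.5 Flash, which the table does not state explicitly.

```python
# (input price, output price) in USD per 1M tokens, from the pricing tables.
PRICES = {
    "GPT-4o-mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),   # assumed match for "Claude Sonnet"
    "Gemini 1.5 Flash": (0.075, 0.30),    # assumed match for "Gemini Flash"
}

INPUT_TOKENS_PER_DAY = 150_000   # rough total quoted on this page
OUTPUT_TOKENS_PER_DAY = 50_000   # rough total quoted on this page

def daily_cost(model: str) -> float:
    """Daily cost in USD for the example workload on the given model."""
    input_price, output_price = PRICES[model]
    return (INPUT_TOKENS_PER_DAY * input_price
            + OUTPUT_TOKENS_PER_DAY * output_price) / 1_000_000

for model in PRICES:
    print(f"{model}: ${daily_cost(model):.4f} per day")
```

Rounded to cents, these match the daily column. The monthly column multiplies the rounded daily figure by 30.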


Checking Your Own Consumption

In the Admin Panel:

Quickest “gut check”:

  1. Ask the AI any question
  2. Some chat UIs show “this exchange: X tokens” at the bottom
  3. Multiply by the model’s unit price → cost of this exchange
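
When a UI reports only a single total, the split between input and output is unknown, so the gut check gives a range instead of an exact figure. A minimal sketch, with a hypothetical function name and prices assumed to be in USD per 1M tokens:

```python
def cost_bounds(total_tokens: int, input_price_per_m: float,
                output_price_per_m: float) -> tuple[float, float]:
    """Return (lowest, highest) possible cost in USD for an exchange
    when only the total token count is known."""
    cheaper = min(input_price_per_m, output_price_per_m)
    pricier = max(input_price_per_m, output_price_per_m)
    # Lowest: every token billed at the cheaper rate; highest: at the pricier rate.
    return (total_tokens * cheaper / 1_000_000,
            total_tokens * pricier / 1_000_000)

# Example: the UI reports 1,000 tokens; prices are $0.15 / $0.60 per 1M.
low, high = cost_bounds(1_000, 0.15, 0.60)
print(f"between ${low:.5f} and ${high:.5f}")
```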

Estimating Your Bill

Personal use (~20 exchanges/day):

  • GPT-4o-mini → $5–10 / month
  • Claude Haiku → $15–30 / month
  • GPT-4o → $30–60 / month

Small-team support (100 exchanges/day):

  • GPT-4o-mini → $30–60 / month
  • Claude Sonnet → $200–400 / month

Mid-size e-commerce support (500 exchanges/day):

  • GPT-4o-mini → $150–300 / month
  • GPT-4o → $2,000–4,000 / month

Rule of thumb: picking the right model can cut cost by 10–20×. If a typical exchange is you saying “hello” and the AI replying “hi” within 5 seconds, you probably don’t need o1 at $15 per 1M input tokens.