Tokens & Tokenization
How text becomes numbers, how tokens are priced, and how to count them.
Learning objectives
- Define a token and explain why models do not use “words” directly
- Estimate token counts for budgeting
- Relate tokens to API pricing
Tokens are the meter on the pump
Models split text into tokens: chunks that might be a whole word, a syllable, or a punctuation mark. English text averages about 4 characters per token, but the ratio varies:
| Text | Rough tokens |
|---|---|
| Hello | 1 |
| workshopco.ca | 3–5 |
| 500-word FAQ page | ~650–750 |
| Full Book 1 chapter pasted in | Thousands — expensive |
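For quick budgeting, the ~4 characters per token rule of thumb can be turned into a small helper. This is a minimal sketch, not a real tokenizer: the function name `estimate_tokens` and the constant `CHARS_PER_TOKEN` are made up for illustration, and the result can be off noticeably for short strings, URLs, or code.

```python
import math

# Assumption: ~4 characters per token, the rough English average quoted above.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate from character count (rounded up)."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


# A stand-in for a 500-word page: 500 short "words" of 4 letters plus a space.
sample_page = "word " * 500
print(estimate_tokens(sample_page))  # 2,500 characters / 4 = 625
```

Use an estimate like this for planning only; a vendor's tokenizer gives the count that is actually billed.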
Input vs output tokens
APIs bill separately:
- Input tokens: system prompt + conversation history + user message + retrieved docs
- Output tokens: the model’s reply (often costs more per token than input)
Example pricing shape (illustrative — check vendor):
Input: $0.15 / 1M tokens
Output: $0.60 / 1M tokens
One FAQ reply: 800 input + 200 output ≈ $0.00024 (a fraction of a cent)
1,000 such chats/month ≈ $0.24, as long as prompts stay small
Worked example — Workshop Co. monthly estimate
Assumptions: 200 FAQ chats/month, 1,200 input + 300 output tokens each.
Input: 200 × 1,200 = 240,000 tokens
Output: 200 × 300 = 60,000 tokens
At $0.15 / $0.60 per 1M (illustrative):
Input cost ≈ $0.036
Output cost ≈ $0.036
Total API ≈ $0.07/month + engineering time
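The same arithmetic can be kept in a small script so the estimate is easy to redo when volumes or prices change. This is a sketch under the illustrative prices above; the function name `api_cost_usd` and its parameters are hypothetical, and real prices should come from the vendor's pricing page.

```python
def api_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float = 0.15,   # illustrative USD per 1M input tokens
    output_price_per_m: float = 0.60,  # illustrative USD per 1M output tokens
) -> float:
    """Return the API cost in USD, billing input and output tokens separately."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Workshop Co. assumptions: 200 chats/month, 1,200 input + 300 output tokens each.
chats_per_month = 200
monthly_cost = api_cost_usd(chats_per_month * 1_200, chats_per_month * 300)
print(f"${monthly_cost:.3f} per month")  # $0.072 per month
```

Changing `chats_per_month` or the per-chat token counts shows how quickly the total grows once prompts stop being small.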
Bills spike when someone pastes entire log files into the chat widget. Common causes:
- Pasting 50 KB nginx logs → roughly 12,500 tokens or more per message at ~4 characters per token
- Sending the full conversation history on every turn → use summarization or window limits
- Repeating a huge system prompt every turn → cache prompts where the vendor supports it
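One way to apply a window limit is to keep only the most recent messages that fit a token budget. The sketch below assumes messages are plain strings and uses the rough 4-characters-per-token estimate instead of a real tokenizer; `trim_history` and `estimate_tokens` are hypothetical helper names, not part of any vendor SDK.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4) -> int:
    """Rough token estimate; an assumption, not a real tokenizer."""
    return math.ceil(len(text) / chars_per_token) if text else 0


def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the newest messages whose estimated total stays within max_tokens."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that would overflow.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["a" * 400, "b" * 200, "c" * 100]  # about 100, 50, and 25 tokens
trimmed = trim_history(history, max_tokens=80)
print(len(trimmed))  # 2: the oldest message is dropped
```

Dropping old turns loses context, so a production version might replace the dropped messages with a short summary instead of discarding them.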
Try it yourself
Use a tokenizer tool (OpenAI Tokenizer, Hugging Face) on Workshop Co.’s system prompt draft:
“You are Workshop Co.’s FAQ assistant. Answer only from the provided class schedule JSON. Never invent dates. If unsure, say contact support@workshopco.ca.”
Count the tokens. Then add a 400-token JSON schedule. What is the input total for one request?
Ballpark
System prompt ~45–60 tokens + JSON ~400 + user question ~20 ≈ ~500 input tokens per turn before history.
Quick quiz
Why is code often more token-heavy than prose?
Answer
Symbols, indentation, and rare identifiers split into many tokens; long URLs and base64 explode counts.