What is a token?
A token is the unit of text a language model actually sees. Modern LLMs do not read characters or words; they read sequences of tokens drawn from a fixed vocabulary, typically built with byte-pair encoding (BPE) or SentencePiece.
A token can be a whole word ("the"), a fragment ("token", "ization"), a punctuation mark, or even a single byte. The same string can split into different tokens depending on the model, because each model ships its own vocabulary.
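To see why the split depends on the vocabulary, here is a toy sketch: a greedy longest-match tokenizer run over two small hypothetical vocabularies (real BPE tokenizers apply learned merge rules, but the effect on the output is the same).

```python
# Toy illustration, not a real model vocabulary: the same string splits
# differently under two hypothetical vocabularies.

def tokenize(text, vocab):
    """Greedily match the longest vocabulary entry at each position,
    falling back to a single character when nothing matches."""
    tokens = []
    i = 0
    while i < len(text):
        match = next(
            (text[i:i + n] for n in range(len(text) - i, 0, -1)
             if text[i:i + n] in vocab),
            text[i],  # fallback: one character per token
        )
        tokens.append(match)
        i += len(match)
    return tokens

vocab_a = {"token", "ization"}
vocab_b = {"tok", "en", "iza", "tion"}

print(tokenize("tokenization", vocab_a))  # ['token', 'ization']
print(tokenize("tokenization", vocab_b))  # ['tok', 'en', 'iza', 'tion']
```

With an empty vocabulary the fallback kicks in and every character becomes its own token, which is roughly what happens when a model meets text far outside its training distribution.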
Exact vs. approximate counts
OpenAI ships tiktoken as open source, so we can run the real tokenizer in your browser via a JavaScript port. The counts you see for GPT-4o, GPT-4, GPT-3.5, o1, and o3 match what the OpenAI API will charge you for.
Anthropic, Google, Meta, and Mistral do not ship browser-friendly tokenizers. For those models we use each provider's published average chars-per-token ratio. The estimate is typically within 5–10% of the real count, which is fine for budgeting and prompt design.
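The ratio-based estimate boils down to one division. A minimal sketch, assuming a characters-per-token ratio of 4.0 (a commonly cited figure for English; real ratios vary by model and language, so substitute the provider's published value):

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Estimate the token count by dividing the character count
    by an average characters-per-token ratio, rounding up.

    The 4.0 default is an assumed, commonly cited figure for
    English text, not any provider's official number.
    """
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)

print(estimate_tokens("a" * 100))  # 25
```

Rounding up keeps the estimate conservative, which is usually what you want when budgeting against a context window.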
If you need an exact count for Claude, hit Anthropic's messages/count_tokens endpoint. For Llama or Mistral, run the model's tokenizer.json through Hugging Face's tokenizers library locally.
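For the Claude route, a stdlib-only sketch of the call is below. The endpoint path, headers, and `input_tokens` response field follow Anthropic's public API documentation; the model name is an example and may need updating, and `ANTHROPIC_API_KEY` is assumed to be set in your environment.

```python
import json
import os
import urllib.request

def build_count_tokens_request(model: str, text: str) -> dict:
    """Build the JSON body for POST /v1/messages/count_tokens."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
    }

def count_tokens(model: str, text: str) -> int:
    """Send the request and return the exact input token count."""
    body = json.dumps(build_count_tokens_request(model, text)).encode()
    req = urllib.request.Request(
        "https://api.anthropic.com/v1/messages/count_tokens",
        data=body,
        headers={
            # Assumes the API key is available in the environment.
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["input_tokens"]

# Example payload (no network call); model name is illustrative.
payload = build_count_tokens_request("claude-3-5-sonnet-20241022", "Hello!")
print(json.dumps(payload, indent=2))
```

The official anthropic Python SDK wraps the same endpoint as `client.messages.count_tokens(...)` if you would rather not build the request by hand.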