How LLMs Generate Text: Token by Token

Watch how language models generate text one token at a time by sampling from probability distributions. The model doesn't always pick the most likely token; it samples randomly according to the probabilities.

Input Tokens: "The", "scientist", "discovered" (3 tokens)

These tokens are fed into the LLM, a neural network with billions of parameters, which turns them into a probability distribution over the next token.

Next Token Probabilities:
  • that: 35%
  • a: 28%
  • new: 18%
  • how: 10%
  • evidence: 6%
  • proof: 2%
  • something: 1%
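To make the sampling step concrete, here is a minimal Python sketch (not code from the demo itself) that draws one next token from the distribution above using weighted random sampling; the `next_token_probs` dictionary is just a hand-copied version of those percentages:

```python
import random

# Next-token distribution from the example above ("The scientist discovered").
next_token_probs = {
    "that": 0.35,
    "a": 0.28,
    "new": 0.18,
    "how": 0.10,
    "evidence": 0.06,
    "proof": 0.02,
    "something": 0.01,
}

# Weighted random sampling: the most likely token ("that") is chosen most
# often, but any token with nonzero probability can come out.
tokens = list(next_token_probs)
weights = list(next_token_probs.values())
next_token = random.choices(tokens, weights=weights, k=1)[0]

print("Context: The scientist discovered")
print("Sampled next token:", next_token)
```

Run it a few times and the sampled token will change from run to run, which is exactly the probabilistic behavior described below.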

Key Concepts

  • ✓ Autoregressive: One token at a time
  • ✓ Probabilistic: Random sampling, not always top choice
  • ✓ Context matters: Previous tokens influence next distribution
  • ✓ No "thinking ahead": The model commits to each token before it knows what comes next
Generated Sentence (so far):
The scientist discovered

Each sampled token is appended to this sentence and fed back in as context for the next step (see the sketch below).
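The key concepts above boil down to one loop: score the possible next tokens given the current context, sample one, append it, and repeat with the longer context. The sketch below illustrates that loop under stated assumptions; `next_token_distribution` is a hypothetical toy stand-in for a real neural network, not an actual LLM API.

```python
import random

def next_token_distribution(context: list[str]) -> dict[str, float]:
    """Toy stand-in for the LLM: map the current context to next-token
    probabilities. A real model computes this with billions of parameters."""
    if context[-1] == "discovered":
        return {"that": 0.35, "a": 0.28, "new": 0.18, "how": 0.10,
                "evidence": 0.06, "proof": 0.02, "something": 0.01}
    # Arbitrary fallback so the toy example always has something to return.
    return {"the": 0.5, "a": 0.3, ".": 0.2}

def generate(context: list[str], steps: int) -> list[str]:
    """Autoregressive generation: sample one token at a time and feed it back."""
    for _ in range(steps):
        probs = next_token_distribution(context)
        next_token = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
        context = context + [next_token]  # the new token becomes part of the context
    return context

print(" ".join(generate(["The", "scientist", "discovered"], steps=3)))
```

Each pass through the loop only sees the context built so far, which is why the model has no way to "think ahead" about how the sentence will end.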