How LLMs Generate Text: Token by Token

Watch how language models generate text one token at a time by sampling from probability distributions. The model doesn't always pick the most likely token; it samples randomly according to the probabilities.

Input Tokens: "The", "scientist", "discovered" (3 tokens)

These tokens are fed into the LLM, a neural network with billions of parameters, which turns them into a probability distribution over the next token.

Next Token Probabilities:
  • that: 35%
  • a: 28%
  • new: 18%
  • how: 10%
  • evidence: 6%
  • proof: 2%
  • something: 1%
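To make the sampling step concrete, here is a minimal Python sketch (not code from the demo itself) that draws one next token from the distribution above using weighted random sampling; the `next_token_probs` dictionary is just a hand-copied version of those percentages:

```python
import random

# Next-token distribution from the example above ("The scientist discovered").
next_token_probs = {
    "that": 0.35,
    "a": 0.28,
    "new": 0.18,
    "how": 0.10,
    "evidence": 0.06,
    "proof": 0.02,
    "something": 0.01,
}

# Weighted random sampling: the most likely token ("that") is chosen most
# often, but any token with nonzero probability can come out.
tokens = list(next_token_probs)
weights = list(next_token_probs.values())
next_token = random.choices(tokens, weights=weights, k=1)[0]

print("Context: The scientist discovered")
print("Sampled next token:", next_token)
```

Run it a few times and the sampled token will change from run to run, which is exactly the probabilistic behavior described below.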

Key Concepts

  • ✓ Autoregressive: One token at a time
  • ✓ Probabilistic: Random sampling, not always top choice
  • ✓ Context matters: Previous tokens influence next distribution
  • ✓ No "thinking ahead": The model commits to each token before it knows what comes next
Generated Sentence (so far):
The scientist discovered

Each sampled token is appended to this sentence and fed back in as context for the next step (see the sketch below).
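The key concepts above boil down to one loop: score the possible next tokens given the current context, sample one, append it, and repeat with the longer context. The sketch below illustrates that loop under stated assumptions; `next_token_distribution` is a hypothetical toy stand-in for a real neural network, not an actual LLM API.

```python
import random

def next_token_distribution(context: list[str]) -> dict[str, float]:
    """Toy stand-in for the LLM: map the current context to next-token
    probabilities. A real model computes this with billions of parameters."""
    if context[-1] == "discovered":
        return {"that": 0.35, "a": 0.28, "new": 0.18, "how": 0.10,
                "evidence": 0.06, "proof": 0.02, "something": 0.01}
    # Arbitrary fallback so the toy example always has something to return.
    return {"the": 0.5, "a": 0.3, ".": 0.2}

def generate(context: list[str], steps: int) -> list[str]:
    """Autoregressive generation: sample one token at a time and feed it back."""
    for _ in range(steps):
        probs = next_token_distribution(context)
        next_token = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
        context = context + [next_token]  # the new token becomes part of the context
    return context

print(" ".join(generate(["The", "scientist", "discovered"], steps=3)))
```

Each pass through the loop only sees the context built so far, which is why the model has no way to "think ahead" about how the sentence will end.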