Autoregressive
Module: fundamentals
What it is
Autoregressive generation means the model produces output one token at a time, with each new token conditioned on the prompt plus all previously generated tokens. The model generates token 1, then uses that to generate token 2, then uses both to generate token 3, and so on until it emits a stop token or reaches a length limit.
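A minimal sketch of that loop in Python (the `next_token` function and the toy phrase model are made-up stand-ins for illustration, not a real library API): at every step the model picks one more token given everything generated so far, appends it, and stops at an end-of-sequence marker.

```python
from typing import Callable, List

def generate(prompt: List[str],
             next_token: Callable[[List[str]], str],
             max_new_tokens: int = 20,
             eos: str = "<eos>") -> List[str]:
    """Produce tokens one at a time, each conditioned on everything before it."""
    tokens = list(prompt)                 # start from the prompt
    for _ in range(max_new_tokens):
        new = next_token(tokens)          # token i depends on tokens 0..i-1
        tokens.append(new)                # commit to it; it is never revised
        if new == eos:                    # stop when the model emits end-of-sequence
            break
    return tokens

# Toy stand-in for a trained model: always continues a fixed phrase.
phrase = ["the", "cat", "sat", "on", "the", "mat", "<eos>"]
toy_next = lambda tokens: phrase[min(len(tokens) - 1, len(phrase) - 1)]

print(generate(["<bos>"], toy_next))
# ['<bos>', 'the', 'cat', 'sat', 'on', 'the', 'mat', '<eos>']
```

A real language model replaces `next_token` with a neural network that outputs a probability distribution over the whole vocabulary and samples from it, but the outer loop looks essentially like this.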
Why it matters
Autoregressive generation explains why streamed AI responses appear word by word and why early mistakes can cascade: once the model starts down a path, every later token is conditioned on that path. It also explains why models can't easily "go back". Each token, once generated, becomes part of the context and shapes everything after it.
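A toy illustration of the cascade (the `follow` table and the helper `continue_from` are invented for this example; a real model conditions on the full prefix and uses probabilities rather than a lookup table): two different first tokens send generation down entirely different paths, and nothing in the loop ever revisits an earlier choice.

```python
# Toy next-token rule: each token is chosen by looking at the last committed token.
follow = {
    "<bos>": ["I", "We"],
    "I": ["think"], "We": ["disagree"],
    "think": ["so"], "disagree": ["strongly"],
    "so": ["<eos>"], "strongly": ["<eos>"],
}

def continue_from(first_choice: str, max_len: int = 10) -> list:
    tokens = ["<bos>", first_choice]
    while tokens[-1] != "<eos>" and len(tokens) < max_len:
        tokens.append(follow[tokens[-1]][0])   # each step builds on the last commit
    return tokens

print(continue_from("I"))    # ['<bos>', 'I', 'think', 'so', '<eos>']
print(continue_from("We"))   # ['<bos>', 'We', 'disagree', 'strongly', '<eos>']
```

The single early choice ("I" versus "We") determines the entire continuation; correcting it afterward would mean throwing away the tokens already generated and starting again from that point.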