Sampling

Module: fundamentals

What it is

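Sampling is how a language model chooses its next token. At each step the model computes a probability distribution over every token in its vocabulary, then draws one token from that distribution. Settings such as temperature, top-k, and top-p shape the draw: temperature sharpens or flattens the distribution, top-k restricts the choice to the k most probable tokens, and top-p restricts it to the smallest set of tokens whose combined probability reaches p. Together they trade predictable, high-probability output against more varied output.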
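
The selection step can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not how any particular model or library implements it. The four-word vocabulary, the logit values, and the function names (softmax, filter_top_k, filter_top_p, sample_token) are all made up for the example.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def renormalize(probs, kept):
    """Zero out tokens not in `kept`, then rescale so the rest sum to 1."""
    filtered = [p if i in kept else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

def filter_top_k(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, set(order[:k]))

def filter_top_p(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, kept)

def sample_token(vocab, logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Apply temperature, then optional top-k and top-p filters, then draw one token."""
    probs = softmax(logits, temperature)
    if top_k is not None:
        probs = filter_top_k(probs, top_k)
    if top_p is not None:
        probs = filter_top_p(probs, top_p)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical vocabulary and scores, for illustration only.
vocab = ["cat", "dog", "bird", "fish"]
logits = [2.0, 1.5, 0.5, -1.0]

print(sample_token(vocab, logits, temperature=0.7, top_k=3, top_p=0.9))
```

With top_k=1 the draw always returns the most probable token, while a higher temperature or a larger top_p lets less likely tokens through more often.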

Why it matters

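Sampling explains why the same prompt can produce different outputs. With typical settings, the model does not always pick the single most probable next token; it draws at random from the probable options, so each run can take a different path. This is why regenerating a response gives different results and why AI output can read as creative rather than identical every time. When the settings force the most probable token at every step (greedy decoding), the same prompt gives far more repeatable output.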
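
A small experiment makes the difference concrete. The sketch below compares always taking the most probable token with drawing at random from the same distribution. The tokens and probabilities are invented for illustration and stand in for one step of a real model's output.

```python
import random

# Hypothetical next-token distribution for a prompt like "The sky is".
tokens = ["blue", "clear", "cloudy", "falling"]
probs = [0.5, 0.3, 0.15, 0.05]

def greedy(tokens, probs):
    """Always return the single most probable token."""
    return tokens[probs.index(max(probs))]

def sample(tokens, probs, rng):
    """Draw one token at random, weighted by its probability."""
    return rng.choices(tokens, weights=probs, k=1)[0]

rng = random.Random()  # unseeded, so results differ between runs

greedy_runs = [greedy(tokens, probs) for _ in range(5)]
sampled_runs = [sample(tokens, probs, rng) for _ in range(5)]

print("greedy: ", greedy_runs)   # the same token five times
print("sampled:", sampled_runs)  # usually a mix of tokens
```

The greedy list is identical on every run, while the sampled list changes from run to run. That is the same effect as regenerating a response.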