Transformer
Module: fundamentals
What it is
The Transformer is a neural network architecture introduced in the 2017 paper "Attention Is All You Need", and it now underlies most large language models. Attention mechanisms existed before it; the Transformer's key innovation was to build the whole model around self-attention, which lets it weigh the relationships between all parts of an input at once rather than processing the input one step at a time. The 'T' in GPT stands for Transformer.
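To make the idea of attention concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions, not how any real model is implemented: the function names (softmax, self_attention) and the toy input are made up for illustration, and it leaves out the learned query/key/value projections, multiple heads, and positional information that a real Transformer uses. It only shows the core step, where every token scores every other token and then takes a weighted average of them.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Simplified self-attention over a list of equal-length token vectors.

    Assumption: each vector acts as its own query, key, and value.
    Real Transformers first apply learned projections to get these.
    """
    d = len(vectors[0])
    outputs, all_weights = [], []
    for q in vectors:
        # Score this token against every token (scaled dot product).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in vectors
        ]
        weights = softmax(scores)
        # New representation: weighted average of all token vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, vectors))
            for j in range(d)
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Three toy "tokens", each a 2-dimensional vector (illustrative values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)

for i, row in enumerate(weights):
    print(f"token {i} attends with weights {[round(w, 3) for w in row]}")
```

Each row of weights shows how much one token draws on every other token, including itself. Nothing in the loop depends on a previous iteration's result, which is why this computation can be done for all tokens at once.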
Why it matters
Transformers made the current generation of large AI models practical. Earlier language models, such as recurrent neural networks, processed words one at a time and tended to lose track of earlier context, so they struggled with long text. A Transformer can attend directly to every token in its context window, and it can process those tokens in parallel during training, which is why modern AI handles long conversations and documents so much better.