Architecture (Model Architecture)
Deeply understand LLM architecture design
Overview
The Architecture module dives into the core components of LLM architecture, from foundational attention mechanisms to advanced memory-augmented modules.
This module assumes you already know the basics of deep learning. We recommend starting with Attention before exploring advanced architectures.
Chapters
Attention Mechanisms
Deeply understand attention in Transformers, including multi-head attention (MHA), causal attention, grouped-query attention (GQA), and multi-query attention (MQA)
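The attention variants above differ mainly in how many key/value heads the query heads share. A minimal numpy sketch (all names and shapes here are illustrative, not from the chapter itself) shows the idea: with `n_kv_heads == n_heads` you get MHA, with `n_kv_heads == 1` you get MQA, and anything in between is GQA, combined with a causal mask:

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """Toy causal attention with shared KV heads.

    q: (n_heads, seq, d); k, v: (n_kv_heads, seq, d).
    n_kv_heads == n_heads -> MHA, == 1 -> MQA, in between -> GQA.
    """
    n_heads, seq, d = q.shape
    group = n_heads // n_kv_heads
    # Each group of `group` query heads shares one K/V head.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Causal mask: position i may only attend to positions <= i.
    mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))   # 8 query heads
k = rng.standard_normal((2, 4, 16))   # 2 shared KV heads (GQA)
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v, n_kv_heads=2)
print(out.shape)  # (8, 4, 16)
```

Fewer KV heads means a smaller KV cache at inference time, which is the practical motivation for GQA and MQA.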
Position Encoding and RoPE
From sinusoidal PE to rotary position embeddings: math, implementation, and length extrapolation
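The defining property of rotary position embeddings is that they encode position by rotating pairs of dimensions, so the dot product between a rotated query and key depends only on their relative offset. A minimal sketch (function name and shapes are illustrative, not the chapter's implementation):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Apply rotary position embedding to vector x at position `pos`.

    Each even/odd pair (x[2i], x[2i+1]) is rotated by the angle
    pos * base**(-2i/d), so q.k after rotation depends only on the
    relative distance between the two positions.
    """
    d = x.shape[-1]
    i = np.arange(d // 2)
    theta = pos * base ** (-2 * i / d)
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

rng = np.random.default_rng(0)
q, k = rng.standard_normal(64), rng.standard_normal(64)
# Shifting both positions by the same amount leaves q.k unchanged:
a = rope(q, 3) @ rope(k, 7)
b = rope(q, 103) @ rope(k, 107)
print(np.allclose(a, b))  # True
```

This relative-position property is what makes RoPE a natural starting point for the length-extrapolation techniques the chapter covers.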