CookLLM Docs

Data Preparation

Download the Fineweb-Edu-Chinese data and learn how Parquet storage and the data inspection workflow work.

👨‍🍳

Content is cooking...

We're preparing high-quality content for you. Stay tuned!

Table of Contents

1. Data Directory
2. Using the Download Script
3. Directory Structure
4. Parquet Format
5. Data Inspection
6. Resampling from the Raw Data