CookLLM Docs

Tokenizer Training

Train a BPE tokenizer with RustBPE and export it as a tiktoken encoding.

👨‍🍳

Content is cooking...

We're preparing high-quality content for you. Stay tuned!

Table of Contents

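1. Inputs and outputs
2. Training command
3. Training pipeline
4. Pre-tokenization rules
5. Special tokens
6. Post-training checks
7. Choosing the vocabulary size
8. Relationship to the training configuration
Further reading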