pre-training | LLM Learning Hub 3/5

Jun 30, 2024 | 4 min read

BPE Dropout

BPE Dropout: stochastic subword segmentation. Randomly drops merges during tokenization, which improves model robustness and generalization.

Jun 28, 2024 | 11 min read

WordPiece Tokenization: subword segmentation for NLP. Builds a vocabulary from frequent subwords and handles rare words.

Jun 22, 2024 | 12 min read

Explore the evolution of Masked Language Modeling from BERT to XLNet, and learn how to choose the right technique for your NLP tasks.
