Jun 30 | 4 min read | Tokenization | BPE Dropout | BPE Dropout: Stochastic subword segmentation. Applies dropout to merges during tokenization. Improves model robustness and generalization.
Jun 28 | 11 min read | Tokenization | WordPiece Tokenization: A BPE Variant | WordPiece Tokenization: Subword segmentation for NLP. Builds a vocabulary from frequent subwords and handles rare words.
Jun 22 | 12 min read | Masked Language Modelling | Mastering Masked Language Models: Techniques, Comparisons, and Best Practices | Explore the evolution of Masked Language Modeling from BERT to XLNet, and learn how to choose the right technique for your NLP tasks.