Unsupervised text tokenizer focused on computational efficiency
Updated Mar 29, 2024 - C++
Fast and customizable text tokenization library with BPE and SentencePiece support
R package for Byte Pair Encoding based on YouTokenToMe