
v0.2.1

@xhluca released this 22 Sep 18:07
1e636a9
  • Add Tokenizer.save_vocab and Tokenizer.load_vocab methods to save/load the vocabulary to/from a JSON file (vocab.tokenizer.json by default)
  • Add Tokenizer.save_stopwords and Tokenizer.load_stopwords methods to save/load the stopwords to/from a JSON file (stopwords.tokenizer.json by default)
  • Add TokenizerHF class to allow saving to and loading from the Hugging Face Hub
    • New functions: load_vocab_from_hub, save_vocab_to_hub, load_stopwords_from_hub, save_stopwords_to_hub
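
The save/load round trip described above can be sketched as follows. This is a minimal, standard-library-only illustration and not the library's implementation: `MiniTokenizer`, its `word_to_id` attribute, the `save_dir` parameter, and the whitespace tokenization are all assumptions made for the example. Only the four method names and the two default file names come from the notes above.

```python
import json
import tempfile
from pathlib import Path


class MiniTokenizer:
    """Hypothetical stand-in for illustration; NOT the library's Tokenizer."""

    def __init__(self, stopwords=()):
        self.stopwords = tuple(stopwords)
        self.word_to_id = {}  # assumed vocabulary layout: word -> integer ID

    def tokenize(self, texts):
        """Lowercase, split on whitespace, drop stopwords, return token IDs."""
        stop = set(self.stopwords)
        all_ids = []
        for text in texts:
            doc_ids = []
            for word in text.lower().split():
                if word in stop:
                    continue
                # Unseen words get the next free ID; known words reuse theirs.
                doc_ids.append(self.word_to_id.setdefault(word, len(self.word_to_id)))
            all_ids.append(doc_ids)
        return all_ids

    def save_vocab(self, save_dir, vocab_name="vocab.tokenizer.json"):
        Path(save_dir).mkdir(parents=True, exist_ok=True)
        with open(Path(save_dir) / vocab_name, "w", encoding="utf-8") as f:
            json.dump(self.word_to_id, f)

    def load_vocab(self, save_dir, vocab_name="vocab.tokenizer.json"):
        with open(Path(save_dir) / vocab_name, encoding="utf-8") as f:
            self.word_to_id = json.load(f)

    def save_stopwords(self, save_dir, stopwords_name="stopwords.tokenizer.json"):
        Path(save_dir).mkdir(parents=True, exist_ok=True)
        with open(Path(save_dir) / stopwords_name, "w", encoding="utf-8") as f:
            json.dump(list(self.stopwords), f)

    def load_stopwords(self, save_dir, stopwords_name="stopwords.tokenizer.json"):
        with open(Path(save_dir) / stopwords_name, encoding="utf-8") as f:
            # JSON has no tuple type, so convert the loaded list back.
            self.stopwords = tuple(json.load(f))


# Round trip: build a vocabulary, persist it, and restore it in a fresh object.
with tempfile.TemporaryDirectory() as tmp_dir:
    original = MiniTokenizer(stopwords=["the", "a"])
    original.tokenize(["The cat sat", "a dog sat"])
    original.save_vocab(tmp_dir)
    original.save_stopwords(tmp_dir)
    saved_files = sorted(p.name for p in Path(tmp_dir).iterdir())

    restored = MiniTokenizer()
    restored.load_vocab(tmp_dir)
    restored.load_stopwords(tmp_dir)

# The restored tokenizer maps known words to the same IDs as the original.
restored_ids = restored.tokenize(["the dog cat"])
```

Persisting the stopwords alongside the vocabulary matters because token IDs are only reproducible if the same words are filtered out before IDs are looked up.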

New tests and examples were added (see examples/index_to_hf.py and examples/tokenizer_class.py)