crates.io: kitoken
Fast and versatile tokenizer for language models, supporting BPE, Unigram and WordPiece tokenization
purl: pkg:cargo/kitoken
Keywords: bpe, nlp, tokenizer, unigram, wordpiece, nodejs, python, rust, sentencepiece, web, word-segmentation
License: BSD-2-Clause
Latest release: 7 months ago
First release: 8 months ago
Downloads: 6,026 total
Stars: 26 on GitHub
Forks: 0 on GitHub
See more repository details: repos.ecosyste.ms
Funding links: https://github.com/sponsors/Systemcluster
Last synced: 10 days ago