An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

crates.io : kitoken

Fast and versatile tokenizer for language models, supporting BPE, Unigram and WordPiece tokenization

purl: pkg:cargo/kitoken
Keywords: bpe, nlp, tokenizer, unigram, wordpiece, nodejs, python, rust, sentencepiece, web, word-segmentation
License: BSD-2-Clause
Latest release: 7 months ago
First release: 8 months ago
Downloads: 6,026 total
Stars: 26 on GitHub
Forks: 0 on GitHub
See more repository details: repos.ecosyste.ms
Funding links: https://github.com/sponsors/Systemcluster
Last synced: 10 days ago
