pypi.org "sentencepiece" keyword
View the packages on the pypi.org package registry that are tagged with the "sentencepiece" keyword.
Top 1.7% on pypi.org
konoha 5.5.6 💰
Add your description here
28 versions - Latest release: 11 months ago - 3 dependent packages - 134 dependent repositories - 101 thousand downloads last month - 241 stars on GitHub - 1 maintainer
kitoken 0.10.1 💰
Fast and versatile tokenizer for language models, supporting BPE, Unigram and WordPiece tokenization
2 versions - Latest release: 4 months ago - 533 downloads last month - 16 stars on GitHub - 1 maintainer
escape-unk 1.0.1
Escape unknown symbols in SentencePiece vocabularies
8 versions - Latest release: over 2 years ago - 315 downloads last month - 0 stars on GitHub - 2 maintainers
Top 2.4% on pypi.org
tf-sentencepiece 0.1.92
SentencePiece Encode/Decode ops for TensorFlow
15 versions - Latest release: almost 5 years ago - 1 dependent package - 30 dependent repositories - 5.39 thousand downloads last month - 9,462 stars on GitHub - 1 maintainer
Top 3.6% on pypi.org
pyonmttok 1.37.1
Fast and customizable text tokenization library with BPE and SentencePiece support
66 versions - Latest release: about 2 years ago - 3 dependent packages - 103 dependent repositories - 28.8 thousand downloads last month - 302 stars on GitHub - 4 maintainers
Top 8.0% on pypi.org
tiny-tokenizer 3.4.0 💰
🌿 An easy-to-use Japanese Text Processing tool, which makes it possible to switch tokenizers with...
19 versions - Latest release: over 4 years ago - 16 dependent repositories - 428 downloads last month - 214 stars on GitHub - 1 maintainer
rs-bpe 0.1.0
A ridiculously fast Python BPE (Byte Pair Encoder) implementation written in Rust
1 version - Latest release: about 1 month ago - 1.91 thousand downloads last month - 1 star on GitHub - 1 maintainer
nepalitokenizers 0.0.2
Pre-trained Tokenizers for the Nepali language with an interface to HuggingFace's tokenizers libr...
2 versions - Latest release: almost 2 years ago - 123 downloads last month - 2 stars on GitHub - 1 maintainer
Related Keywords
natural-language-processing (6)
nlp (4)
tokenizer (4)
text-processing (3)
wordpiece (3)
huggingface (2)
transformers (2)
tokenization (2)
NLP (2)
neural-machine-translation (2)
word-segmentation (2)
rust (2)
python (2)
unigram (2)
janome (2)
bpe (2)
sudachi (2)
mecab (2)
kytea (2)
japanese (2)
tiktoken-alternative (1)
tiktoken (1)
openai (1)
linguistics (1)
research (1)
data-science (1)
embeddings (1)
text-generation (1)
generative-ai (1)
large-language-models (1)
llm (1)
ai (1)
artificial-intelligence (1)
deep-learning (1)
machine-learning (1)
accelerated (1)
nepali (1)
backtracking (1)
bpe-dropout (1)
tool (1)
package (1)
library (1)
stable (1)
production-ready (1)
nlp-engineers (1)
machine-learning-engineers (1)
data-scientists (1)
researchers (1)
scientists (1)
developers (1)
tokenizers-library (1)
huggingface-tokenizers (1)
tiktoken-compatible (1)
vocabulary (1)
subword-units (1)
subword-tokenization (1)
byte-pair-encoding (1)
machine-translation (1)
icu (1)
cpp (1)
subword (1)
unicode (1)
opennmt (1)
segmentation (1)
learning (1)
machine (1)
tensorflow (1)
escaping (1)
web (1)
nodejs (1)
rapid (1)
blazing-fast (1)
speed (1)
optimized (1)
efficient (1)
high-performance (1)
performance (1)
fast (1)
cross-platform (1)
pypi (1)
python-package (1)
python-library (1)
cpython-extension (1)
rust-extension (1)
string-encoding (1)
text-encoding (1)
vocab (1)