An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

crates.io "tokenization" keyword

View the packages on the crates.io package registry that are tagged with the "tokenization" keyword.

colorblast-cli 0.0.1
Syntax highlighting CLI for various programming languages, markup languages and various other for...
1 version - Latest release: about 2 years ago - 1.3 thousand downloads total - 0 stars on GitHub - 1 maintainer
wordpieces 0.6.1
Split tokens into word pieces
10 versions - Latest release: almost 3 years ago - 3 dependent packages - 3 dependent repositories - 19.7 thousand downloads total - 5 stars on GitHub - 1 maintainer
classi-cine 0.4.2
A tool that builds smart video playlists by learning your preferences through Bayesian classifica...
10 versions - Latest release: 2 months ago - 9.93 thousand downloads total - 4 stars on GitHub - 1 maintainer
tokenizer-lib 1.6.0
Tokenization utilities for building parsers in Rust
15 versions - Latest release: about 1 year ago - 2 dependent packages - 1 dependent repository - 21.6 thousand downloads total - 2 stars on GitHub - 1 maintainer
build-trie 0.1.1
Procedural macro for generating match and state code representing a trie structure
2 versions - Latest release: over 4 years ago - 2.69 thousand downloads total - 3 stars on GitHub - 1 maintainer
libtqsm 0.6.1
Sentence segmenter that supports ~300 languages
1 version - Latest release: over 1 year ago - 1 dependent package - 1.75 thousand downloads total - 2 stars on GitHub - 1 maintainer
crossandra 0.0.2 💰
A straightforward tokenization library for seamless text processing.
2 versions - Latest release: 7 months ago - 1.54 thousand downloads total - 8 stars on GitHub - 1 maintainer
sentence 0.0.2
Sentence tokenizes English language sentences for use in TTS applications.
2 versions - Latest release: about 5 years ago - 2.66 thousand downloads total - 2 stars on GitHub - 1 maintainer
vaporetto_rules 0.6.5
Rule-base filters for Vaporetto
12 versions - Latest release: 4 months ago - 1 dependent package - 1 dependent repository - 55.8 thousand downloads total - 238 stars on GitHub - 1 maintainer
vaporetto 0.6.5
Vaporetto: a pointwise prediction based tokenizer
18 versions - Latest release: 4 months ago - 3 dependent packages - 1 dependent repository - 113 thousand downloads total - 238 stars on GitHub - 1 maintainer
vaporetto_tantivy 0.24.0
Vaporetto Tokenizer for Tantivy
15 versions - Latest release: about 1 month ago - 17.9 thousand downloads total - 238 stars on GitHub - 1 maintainer
vibrato 0.5.2
Vibrato: viterbi-based accelerated tokenizer
12 versions - Latest release: 4 months ago - 1 dependent package - 1 dependent repository - 37.2 thousand downloads total - 360 stars on GitHub - 2 maintainers
agrocrypto-core 0.1.0
The core engine of AgroCrypto: a blockchain-native asset tokenization and settlement layer.
1 version - Latest release: 4 months ago - 530 downloads total - 1 maintainer
bpetok 0.1.2
A simple CLI for tokenizing text input using Byte Pair Encoding (BPE).
3 versions - Latest release: 10 months ago - 2.5 thousand downloads total - 1 maintainer
strizer 0.1.0
Minimal and fast library for text tokenization
1 version - Latest release: over 4 years ago - 1.69 thousand downloads total - 1 star on GitHub - 1 maintainer
rustrawi 0.1.2 💰
Rust port of the original PHP Sastrawi
3 versions - Latest release: over 2 years ago - 3.14 thousand downloads total - 0 stars on GitHub - 1 maintainer
fern-tokenization 0.0.0
Empty crate, used only to reserve the name.
1 version - Latest release: almost 3 years ago - 1.41 thousand downloads total - 14 stars on GitHub - 1 maintainer
tuck5 0.2.0
A pragmatic lexer/parser generator
4 versions - Latest release: over 1 year ago - 4.2 thousand downloads total - 0 stars on GitHub - 1 maintainer
pretok 0.1.0
A string pre-tokenizer for C-like syntaxes.
1 version - Latest release: almost 5 years ago - 1 dependent repository - 1.43 thousand downloads total - 0 stars on GitHub - 1 maintainer
blex 0.2.2
A lightweight lexing framework
4 versions - Latest release: over 2 years ago - 1 dependent package - 4.83 thousand downloads total - 0 stars on GitHub - 1 maintainer
derive-finite-automaton 0.3.0
Procedural macro for generating finite automaton
6 versions - Latest release: 25 days ago - 1 dependent package - 1 dependent repository - 10.7 thousand downloads total - 2 stars on GitHub - 1 maintainer
derive-finite-automaton-derive 0.3.0
Procedural macro for generating finite automaton
6 versions - Latest release: 25 days ago - 1 dependent package - 1 dependent repository - 11.1 thousand downloads total - 2 stars on GitHub - 1 maintainer
unscanny 0.1.0 💰
Painless string scanning.
1 version - Latest release: over 3 years ago - 8 dependent packages - 28 dependent repositories - 3.48 million downloads total - 55 stars on GitHub - 1 maintainer
any-lexer 0.0.3
Lexers for various programming languages and formats
3 versions - Latest release: about 2 years ago - 2 dependent packages - 5.12 thousand downloads total - 0 stars on GitHub - 1 maintainer
colorblast 0.0.3
Syntax highlighting library for various programming languages, markup languages and various other...
3 versions - Latest release: about 2 years ago - 1 dependent package - 3.68 thousand downloads total - 0 stars on GitHub - 1 maintainer
vtext 0.2.0
NLP with Rust
4 versions - Latest release: about 5 years ago - 3 dependent repositories - 12.1 thousand downloads total - 150 stars on GitHub - 1 maintainer
text-scanner 0.0.3
A UTF-8 char-oriented, zero-copy, text and code scanning library
3 versions - Latest release: about 2 years ago - 1 dependent package - 4.87 thousand downloads total - 0 stars on GitHub - 1 maintainer