Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
crates.io "tokenizer" keyword
nlpo3 1.3.2
Thai natural language processing library, with Python and Node bindings
7 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 4.6 thousand downloads total - 30 stars on GitHub - 2 maintainers
nlpo3-cli 0.2.0
Command line interface for nlpO3, a Thai natural language processing library
3 versions - Latest release: almost 3 years ago - 1.12 thousand downloads total - 30 stars on GitHub - 2 maintainers
sqlite3-parser 0.12.0
SQL parser (as understood by SQLite)
11 versions - Latest release: 6 months ago - 3 dependent packages - 2 dependent repositories - 59.3 thousand downloads total - 41 stars on GitHub - 1 maintainer
fuzzy-pickles 0.1.1
A low-level parser of Rust source code with high-level visitor implementations
2 versions - Latest release: over 3 years ago - 1 dependent repository - 1.59 thousand downloads total - 7 stars on GitHub - 1 maintainer
erl_tokenize 0.6.1 💰
Erlang source code tokenizer
28 versions - Latest release: 3 months ago - 5 dependent packages - 3 dependent repositories - 22.3 thousand downloads total - 8 stars on GitHub - 1 maintainer
pgn-lexer 0.1.1
A lexer for PGN files for chess. Provides an iterator over the tokens from a byte stream.
3 versions - Latest release: over 6 years ago - 1.92 thousand downloads total - 1 star on GitHub - 1 maintainer
Top 8.3% on crates.io
html5gum 0.5.7
A WHATWG-compliant HTML5 tokenizer and tag soup parser.
14 versions - Latest release: 10 months ago - 2 dependent packages - 220 dependent repositories - 639 thousand downloads total - 146 stars on GitHub - 1 maintainer
svgrtypes 0.42.0
SVG types parser.
15 versions - Latest release: 1 day ago - 2 dependent packages - 5.06 thousand downloads total - 1 maintainer
c-lexer-stable 0.1.4
C lexer
4 versions - Latest release: about 3 years ago - 17 thousand downloads total - 2 stars on GitHub - 1 maintainer
Top 2.5% on crates.io
tokenizers 0.19.1
Provides an implementation of today's most used tokenizers, with a focus on performances and vers...
25 versions - Latest release: 23 days ago - 20 dependent packages - 281 dependent repositories - 741 thousand downloads total - 8,474 stars on GitHub - 3 maintainers
indentation_flattener 0.1.0
From indented input, generate plain output with indentation PUSH and POP codes.
1 version - Latest release: about 7 years ago - 935 downloads total - 0 stars on GitHub - 1 maintainer
json-parser 1.0.2
JSON parser
3 versions - Latest release: almost 5 years ago - 2.92 thousand downloads total - 4 stars on GitHub - 1 maintainer
indent_tokenizer 0.4.0
Generate tokens based on indentation
4 versions - Latest release: over 6 years ago - 2.9 thousand downloads total - 1 star on GitHub - 1 maintainer
rustpostal 0.3.0
Rust bindings to libpostal
4 versions - Latest release: about 2 years ago - 1.68 thousand downloads total - 14 stars on GitHub - 1 maintainer
lox-scanner 0.1.0
lexical scanner for Lox
3 versions - Latest release: over 2 years ago - 1.07 thousand downloads total - 0 stars on GitHub - 1 maintainer
condex 1.0.0 💰
Extract tokens by simple condition expression.
1 version - Latest release: over 2 years ago - 431 downloads total - 2 stars on GitHub - 1 maintainer
Top 6.5% on crates.io
svgtypes 0.15.1
SVG types parser.
23 versions - Latest release: 3 days ago - 24 dependent packages - 532 dependent repositories - 1.99 million downloads total - 66 stars on GitHub - 1 maintainer
lexers 0.1.4
Tools for tokenizing and scanning
11 versions - Latest release: about 2 years ago - 7 dependent packages - 4 dependent repositories - 14.4 thousand downloads total - 64 stars on GitHub - 1 maintainer
absolution 0.1.1
'Freedom from `syn`'. A lightweight Rust lexer designed for use in bang-style proc macros.
3 versions - Latest release: about 4 years ago - 1 dependent package - 2.14 thousand downloads total - 107 stars on GitHub - 1 maintainer
text-splitter 0.13.1
Split text into semantic chunks, up to a desired chunk size. Supports calculating length by chara...
30 versions - Latest release: 3 days ago - 1 dependent repository - 37.3 thousand downloads total - 135 stars on GitHub - 1 maintainer
Top 4.8% on crates.io
xmlparser 0.13.6
Pull-based, zero-allocation XML parser.
24 versions - Latest release: 7 months ago - 31 dependent packages - 2,453 dependent repositories - 16.8 million downloads total - 128 stars on GitHub - 2 maintainers
alpino-tokenize 0.4.0
Wrapper around the Alpino tokenizer for Dutch
5 versions - Latest release: 6 months ago - 2.19 thousand downloads total - 3 stars on GitHub - 1 maintainer
alpino-tokenizer-sys 0.2.1
Low-level wrapper around the Alpino tokenizer for Dutch
3 versions - Latest release: almost 4 years ago - 1 dependent package - 1 dependent repository - 2.38 thousand downloads total - 3 stars on GitHub - 1 maintainer
alpino-tokenizer 0.4.0
Wrapper around the Alpino tokenizer for Dutch
4 versions - Latest release: 6 months ago - 1 dependent package - 2 dependent repositories - 3.34 thousand downloads total - 3 stars on GitHub - 1 maintainer
vaporetto 0.6.3
Vaporetto: a pointwise prediction based tokenizer
16 versions - Latest release: about 1 year ago - 3 dependent packages - 1 dependent repository - 70.8 thousand downloads total - 213 stars on GitHub - 1 maintainer
vaporetto_tantivy 0.20.0
Vaporetto Tokenizer for Tantivy
10 versions - Latest release: 11 months ago - 3.14 thousand downloads total - 213 stars on GitHub - 1 maintainer
vaporetto_rules 0.6.3
Rule-base filters for Vaporetto
10 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 33.2 thousand downloads total - 213 stars on GitHub - 1 maintainer
parsit 0.2.0
very simple lib, the parsing combinators, recursive descendent that uses logos as lexer
17 versions - Latest release: 10 months ago - 1 dependent package - 1 dependent repository - 5.28 thousand downloads total - 7 stars on GitHub - 1 maintainer
bracoxide 0.1.3
A feature-rich library for brace pattern combination, permutation generation, and error handling.
4 versions - Latest release: 8 months ago - 1 dependent package - 6 dependent repositories - 89.4 thousand downloads total - 1 star on GitHub - 1 maintainer
libsql-sqlite3-parser 0.11.1
SQL parser (as understood by SQLite) (libsql fork)
2 versions - Latest release: 2 months ago - 15.5 thousand downloads total - 1 maintainer
luther-derive 0.1.0
The proc macro generator for the Luther lexer generator.
1 version - Latest release: almost 6 years ago - 1.3 thousand downloads total - 5 stars on GitHub - 1 maintainer
cang-jie 0.18.0
A Chinese tokenizer for tantivy
20 versions - Latest release: 6 months ago - 6 dependent packages - 13 dependent repositories - 25 thousand downloads total - 68 stars on GitHub - 1 maintainer
bleuscore 0.1.2
A fast(not yet :) bleu score calculator
3 versions - Latest release: 11 days ago - 581 downloads total - 0 stars on GitHub - 1 maintainer
c_lexer 0.1.1
C lexer
2 versions - Latest release: about 5 years ago - 1 dependent package - 1 dependent repository - 2.22 thousand downloads total - 6 stars on GitHub - 1 maintainer
bytepiece 0.2.0
Rust version of bytepiece tokenizer
2 versions - Latest release: 8 months ago - 527 downloads total - 9 stars on GitHub - 1 maintainer
tokengeex 1.0.0
TokenGeeX is an efficient tokenizer for code based on UnigramLM and TokenMonster.
9 versions - Latest release: 13 days ago - 2.84 thousand downloads total - 1 maintainer
lindera-tantivy 0.27.1 💰
Lindera Tokenizer for Tantivy.
40 versions - Latest release: 5 months ago - 5 dependent packages - 7 dependent repositories - 21.7 thousand downloads total - 46 stars on GitHub - 4 maintainers
tantivy-stemmers 0.2.0
A collection of Tantivy stemmer tokenizers
2 versions - Latest release: 7 days ago - 185 downloads total - 0 stars on GitHub - 2 maintainers
Top 6.2% on crates.io
logos-codegen 0.14.0 💰
Create ridiculously fast Lexers
2 versions - Latest release: 3 months ago - 2 dependent packages - 28 dependent repositories - 1.42 million downloads total - 2,632 stars on GitHub - 2 maintainers
logos2 💰
Create ridiculously fast Lexers
6 versions - Latest release: 7 days ago - 1.51 thousand downloads total - 2,632 stars on GitHub - 1 maintainer
logos-derive2 💰
Create ridiculously fast Lexers
6 versions - Latest release: 7 days ago - 1.54 thousand downloads total - 2,632 stars on GitHub - 1 maintainer
Top 2.8% on crates.io
logos 0.14.0 💰
Create ridiculously fast Lexers
50 versions - Latest release: 3 months ago - 199 dependent packages - 606 dependent repositories - 5.75 million downloads total - 2,632 stars on GitHub - 2 maintainers
logos-codegen2 💰
Create ridiculously fast Lexers
6 versions - Latest release: 7 days ago - 1.55 thousand downloads total - 2,632 stars on GitHub - 1 maintainer
Top 3.5% on crates.io
logos-derive 0.14.0 💰
Create ridiculously fast Lexers
46 versions - Latest release: 3 months ago - 7 dependent packages - 539 dependent repositories - 5.75 million downloads total - 2,632 stars on GitHub - 2 maintainers
logos-cli2 💰
Create ridiculously fast Lexers
6 versions - Latest release: 7 days ago - 1.5 thousand downloads total - 2,632 stars on GitHub - 1 maintainer
logos-cli 0.14.0 💰
Create ridiculously fast Lexers
2 versions - Latest release: 3 months ago - 608 downloads total - 2,632 stars on GitHub - 2 maintainers
tantivy-czech-stemmer 0.2.1
Czech stemmer as Tantivy tokenizer
2 versions - Latest release: 9 days ago - 281 downloads total - 0 stars on GitHub - 2 maintainers
pkl_fast 0.1.1
A library aiming to easily and efficiently work with Apple's PKL format.
2 versions - Latest release: 3 months ago - 620 downloads total - 3 stars on GitHub - 1 maintainer
luther 0.1.0
The runtime components of the Luther lexer generator.
1 version - Latest release: almost 6 years ago - 1 dependent package - 1 dependent repository - 1.84 thousand downloads total - 5 stars on GitHub - 1 maintainer
Top 7.0% on crates.io
charabia 0.8.10
A simple library to detect the language, tokenize the text and normalize the tokens
19 versions - Latest release: 10 days ago - 3 dependent packages - 33 dependent repositories - 215 thousand downloads total - 211 stars on GitHub - 2 maintainers
rustfst-ffi 1.0.1
Library for constructing, combining, optimizing, and searching weighted finite-state transducers ...
13 versions - Latest release: 8 days ago - 3.3 thousand downloads total - 138 stars on GitHub - 1 maintainer
rustfst 1.0.1
Library for constructing, combining, optimizing, and searching weighted finite-state transducers ...
47 versions - Latest release: 8 days ago - 3 dependent packages - 1 dependent repository - 282 thousand downloads total - 138 stars on GitHub - 1 maintainer
rust_transformers 0.2.0
High performance tokenizers for Rust
2 versions - Latest release: about 4 years ago - 1 dependent package - 1.01 thousand downloads total - 270 stars on GitHub - 1 maintainer
Top 6.3% on crates.io
rust_tokenizers 8.1.1
High performance tokenizers for Rust
34 versions - Latest release: 7 months ago - 7 dependent packages - 225 dependent repositories - 164 thousand downloads total - 270 stars on GitHub - 1 maintainer
fileql 0.3.0 💰
A tool to run SQL-like query on local files using GitQL SDK
3 versions - Latest release: 11 days ago - 919 downloads total - 55 stars on GitHub - 1 maintainer
xxcalc 0.2.1
Embeddable or standalone robust floating-point polynomial calculator
4 versions - Latest release: over 7 years ago - 1 dependent repository - 8.17 thousand downloads total - 13 stars on GitHub - 1 maintainer
html5tokenizer 0.5.2
An HTML5 tokenizer with code span support.
7 versions - Latest release: 8 months ago - 1 dependent repository - 2.2 thousand downloads total - 1 maintainer
tele_tokenizer 0.2.0
A CSS tokenizer
2 versions - Latest release: about 2 years ago - 3 dependent packages - 1 dependent repository - 1.69 thousand downloads total - 199 stars on GitHub - 1 maintainer
Top 7.2% on crates.io
tiktoken-rs 0.5.8
Library for encoding and decoding with the tiktoken library in Rust
27 versions - Latest release: 5 months ago - 24 dependent packages - 73 dependent repositories - 290 thousand downloads total - 198 stars on GitHub - 1 maintainer
another-tiktoken-rs 0.1.2
Library for encoding and decoding with the tiktoken library in Rust
3 versions - Latest release: 8 months ago - 768 downloads total - 198 stars on GitHub - 1 maintainer
bytepiece_rs 0.2.2
The Bytepiece Tokenizer Implemented in Rust
7 versions - Latest release: 6 months ago - 1 dependent package - 2.07 thousand downloads total - 14 stars on GitHub - 1 maintainer
svgparser 0.8.1
Featureful, pull-based, zero-allocation SVG parser.
21 versions - Latest release: about 6 years ago - 4 dependent packages - 98 dependent repositories - 101 thousand downloads total - 22 stars on GitHub - 1 maintainer
Top 5.7% on crates.io
lindera-decompress 0.30.0 💰
A morphological analysis library.
35 versions - Latest release: 27 days ago - 11 dependent packages - 41 dependent repositories - 283 thousand downloads total - 349 stars on GitHub - 1 maintainer
Top 6.2% on crates.io
lindera-cc-cedict-builder 0.30.0 💰
A Chinese morphological dictionary builder for CC-CEDICT.
37 versions - Latest release: 27 days ago - 5 dependent packages - 40 dependent repositories - 279 thousand downloads total - 349 stars on GitHub - 1 maintainer
Top 6.2% on crates.io
lindera-ko-dic-builder 0.30.0 💰
A Korean morphological dictionary builder for ko-dic.
42 versions - Latest release: 27 days ago - 5 dependent packages - 40 dependent repositories - 281 thousand downloads total - 349 stars on GitHub - 4 maintainers
Top 6.2% on crates.io
lindera-unidic-builder 0.30.0 💰
A Japanese morphological dictionary builder for UniDic.
47 versions - Latest release: 27 days ago - 5 dependent packages - 40 dependent repositories - 282 thousand downloads total - 349 stars on GitHub - 4 maintainers
tusk_lexer 0.4.7
The lexical analysis component of Tusk.
21 versions - Latest release: almost 3 years ago - 1 dependent package - 6.9 thousand downloads total - 1 maintainer
lexical_scanner 0.1.18
A simple lexer which creates over 115+ various tokens based on the rust programming language. Thi...
19 versions - Latest release: about 2 years ago - 7.25 thousand downloads total - 2 stars on GitHub - 1 maintainer
Top 5.4% on crates.io
lindera 0.30.0 💰
A morphological analysis library.
72 versions - Latest release: 27 days ago - 10 dependent packages - 124 dependent repositories - 182 thousand downloads total - 349 stars on GitHub - 4 maintainers
Top 5.2% on crates.io
lindera-dictionary 0.30.0 💰
A Japanese morphological dictionary.
52 versions - Latest release: 27 days ago - 10 dependent packages - 238 dependent repositories - 306 thousand downloads total - 349 stars on GitHub - 4 maintainers
Top 5.6% on crates.io
lindera-ipadic-builder 0.30.0 💰
A Japanese morphological dictionary builder for IPADIC.
54 versions - Latest release: 27 days ago - 6 dependent packages - 235 dependent repositories - 299 thousand downloads total - 349 stars on GitHub - 4 maintainers
Top 4.9% on crates.io
lindera-core 0.30.0 💰
A morphological analysis library.
54 versions - Latest release: 27 days ago - 28 dependent packages - 239 dependent repositories - 313 thousand downloads total - 349 stars on GitHub - 4 maintainers
Top 6.1% on crates.io
lindera-ipadic 0.30.0 💰
A Japanese morphological dictionary for IPADIC.
59 versions - Latest release: 27 days ago - 4 dependent packages - 126 dependent repositories - 193 thousand downloads total - 349 stars on GitHub - 4 maintainers
Top 7.9% on crates.io
plex 0.3.0
A syntax extension for writing lexers and parsers.
10 versions - Latest release: 7 months ago - 4 dependent packages - 20 dependent repositories - 35 thousand downloads total - 396 stars on GitHub - 1 maintainer
rye-grain 0.0.1
A Python to Rust translator
1 version - Latest release: over 1 year ago - 401 downloads total - 1 star on GitHub - 1 maintainer
vibrato 0.5.1
Vibrato: viterbi-based accelerated tokenizer
11 versions - Latest release: 12 months ago - 1 dependent package - 1 dependent repository - 11.6 thousand downloads total - 292 stars on GitHub - 2 maintainers
azul-simplecss 0.1.1
A very simple CSS 2.1 tokenizer.
2 versions - Latest release: almost 5 years ago - 1 dependent package - 4 dependent repositories - 17.3 thousand downloads total - 29 stars on GitHub - 1 maintainer
chat-splitter 0.1.1 💰
Never exceed OpenAI's chat models' maximum number of tokens when using the async_openai Rust crate
2 versions - Latest release: 8 months ago - 550 downloads total - 2 stars on GitHub - 1 maintainer
ast-rs 0.0.1
AST Toolkit for Rust
1 version - Latest release: almost 2 years ago - 345 downloads total - 0 stars on GitHub - 1 maintainer
rustrawi 0.1.2 💰
Rust port of the original PHP Sastrawi
3 versions - Latest release: about 1 year ago - 713 downloads total - 0 stars on GitHub - 1 maintainer
sana 0.1.1
Create lexers easily
2 versions - Latest release: over 3 years ago - 1 dependent repository - 992 downloads total - 1 maintainer
sentencepiece-model 0.1.0 💰
Sentencepiece model parser
1 version - Latest release: 6 months ago - 311 downloads total - 0 stars on GitHub - 1 maintainer
tokenizer-lib 1.5.1
Tokenization utilities for building parsers in Rust
14 versions - Latest release: 8 months ago - 2 dependent packages - 1 dependent repository - 6.55 thousand downloads total - 2 stars on GitHub - 1 maintainer
blex 0.2.2
A lightweight lexing framework
4 versions - Latest release: about 1 year ago - 1 dependent package - 1.17 thousand downloads total - 0 stars on GitHub - 1 maintainer
basic_lexer 0.2.1
Basic lexical analyzer for parsing and compiling
6 versions - Latest release: over 2 years ago - 1.79 thousand downloads total - 0 stars on GitHub - 1 maintainer
pretok 0.1.0
A string pre-tokenizer for C-like syntaxes.
1 version - Latest release: over 3 years ago - 1 dependent repository - 457 downloads total - 0 stars on GitHub - 1 maintainer
sana_derive 0.1.1
The derive macro for Sana
2 versions - Latest release: over 3 years ago - 1 dependent package - 1 dependent repository - 1.45 thousand downloads total - 1 maintainer
blingfire 1.0.0
Wrapper for the BlingFire tokenization library
5 versions - Latest release: almost 4 years ago - 47.8 thousand downloads total - 16 stars on GitHub - 1 maintainer
blingfire-sys 1.0.1
Bindings to the BlingFire C++ library
5 versions - Latest release: almost 4 years ago - 1 dependent package - 48.5 thousand downloads total - 16 stars on GitHub - 1 maintainer
aleph-alpha-tokenizer 0.3.1
A fast implementation of a wordpiece-inspired tokenizer
4 versions - Latest release: almost 4 years ago - 1.86 thousand downloads total - 6 stars on GitHub - 1 maintainer
nnsplit 0.5.9
A tool to split text using a neural network. For sentence boundary detection, compound splitting ...
29 versions - Latest release: about 1 year ago - 1 dependent repository - 12.4 thousand downloads total - 476 stars on GitHub - 1 maintainer
token 1.0.0-rc1
A simple string-tokenizer (and sentence splitter) Note: If you find that you would like to use t...
1 version - Latest release: about 9 years ago - 1 dependent repository - 1.95 thousand downloads total - 5 stars on GitHub - 1 maintainer
regex-tokenizer 0.1.1
A regex tokenizer
2 versions - Latest release: about 1 year ago - 529 downloads total - 0 stars on GitHub - 2 maintainers
sana_core 0.1.1
The core of Sana
2 versions - Latest release: over 3 years ago - 2 dependent packages - 1 dependent repository - 1.77 thousand downloads total - 1 maintainer
jayce 12.1.0
jayce is a tokenizer 🌌
34 versions - Latest release: about 2 months ago - 8.91 thousand downloads total - 1 star on GitHub - 1 maintainer
javascript_lexer 0.1.8
Javascript lexer
9 versions - Latest release: almost 4 years ago - 4.43 thousand downloads total - 8 stars on GitHub - 1 maintainer
lang_pt 0.1.2
A parser tool to generate recursive descent top down parser.
3 versions - Latest release: over 1 year ago - 829 downloads total - 6 stars on GitHub - 1 maintainer
libsimple 0.1.0
Rust bindings to simple, a SQLite3 fts5 tokenizer which supports Chinese and PinYin.
2 versions - Latest release: 26 days ago - 136 downloads total - 1 maintainer
tantivy-tokenizer-tiny-segmenter 0.3.0
A Japanese tokenizer for Tantivy, based on TinySegmenter.
3 versions - Latest release: over 4 years ago - 1 dependent repository - 2.08 thousand downloads total - 12 stars on GitHub - 1 maintainer
char-lex 1.0.5
Create easy enum based lexers
11 versions - Latest release: about 4 years ago - 3.81 thousand downloads total - 1 maintainer
Related Keywords
lexer (37)
parser (37)
rust (33)
analyzer (23)
morphological (22)
library (20)
nlp (19)
multilingual (19)
parsing (17)
japanese (11)
dictionary (10)
lexical (10)
scanner (10)
no_std (9)
analysis (9)
lexer-generator (9)
tokenization (8)
token (7)
text (6)
generator (6)
python (6)
tantivy (6)
builder (5)
machine-learning (5)
deep-learning (4)
ai (4)
morphological-analysis (4)
bpe (4)
rust-lang (4)
natural-language-processing (4)
sql (4)
openai (4)
segmentation (4)
ipadic (4)
rust-crate (3)
thai (3)
sentence (3)
regex (3)
lex (3)
text-processing (3)
cli (3)
alpino (3)
svg (3)
chinese (3)
dutch (3)
stemmer (3)
language (3)
gpt (3)
html (3)
kaldi-asr (2)
javascript (2)
word-segmentation (2)
kaldi (2)
string (2)
fsts (2)
finite-state-transducers (2)
dfa (2)
finite-state-acceptors (2)
composition (2)
automata (2)
asr (2)
rust-wrapper (2)
blingfire (2)
transducer (2)
acceptor (2)
fst (2)
graph (2)
chatgpt (2)
hacktoberfest (2)
nodejs (2)
thai-language (2)
parser-generator (2)
css (2)
whatwg (2)
html5 (2)
xml (2)
c (2)
cc-cedict (2)
ko-dic (2)
language-model (2)
neologd (2)
korean (2)
indentation (2)
processing (2)
unidic (2)
openfst (2)
splitter (2)
transformer (2)
wfst (2)
transducers (2)
speech-recognition (2)
split (2)
sqlite (2)
shortest-path (2)
gpt-4 (1)
chat (1)
artificial-intelligence (1)
search (1)
translator (1)
scanlex (1)