Ecosyste.ms: Packages
An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.
crates.io "tokenizer" keyword
rust_transformers 0.2.0
High performance tokenizers for Rust
2 versions - Latest release: over 4 years ago - 1 dependent package - 1.12 thousand downloads total - 273 stars on GitHub - 1 maintainer
rust_tokenizers 8.1.1
Top 6.3% on crates.io
High performance tokenizers for Rust
34 versions - Latest release: 8 months ago - 11 dependent packages - 225 dependent repositories - 171 thousand downloads total - 273 stars on GitHub - 1 maintainer
xxcalc 0.2.1
Embeddable or standalone robust floating-point polynomial calculator
4 versions - Latest release: over 7 years ago - 1 dependent repository - 8.37 thousand downloads total - 13 stars on GitHub - 1 maintainer
charabia 0.8.10
Top 7.0% on crates.io
A simple library to detect the language, tokenize the text and normalize the tokens
20 versions - Latest release: about 1 month ago - 3 dependent packages - 33 dependent repositories - 230 thousand downloads total - 222 stars on GitHub - 2 maintainers
html5tokenizer 0.5.2
An HTML5 tokenizer with code span support.
7 versions - Latest release: 8 months ago - 1 dependent repository - 2.53 thousand downloads total - 1 maintainer
fileql 0.3.0 💰
A tool to run SQL-like query on local files using GitQL SDK
3 versions - Latest release: about 1 month ago - 1.12 thousand downloads total - 56 stars on GitHub - 1 maintainer
tele_tokenizer 0.2.0
A CSS tokenizer
2 versions - Latest release: about 2 years ago - 3 dependent packages - 1 dependent repository - 1.83 thousand downloads total - 198 stars on GitHub - 1 maintainer
bytepiece_rs 0.2.2
The Bytepiece Tokenizer Implemented in Rust
7 versions - Latest release: 7 months ago - 1 dependent package - 2.36 thousand downloads total - 14 stars on GitHub - 1 maintainer
svgparser 0.8.1
Featureful, pull-based, zero-allocation SVG parser.
21 versions - Latest release: about 6 years ago - 4 dependent packages - 98 dependent repositories - 104 thousand downloads total - 22 stars on GitHub - 1 maintainer
tokenizers 0.19.1
Top 2.5% on crates.io
Provides an implementation of today's most used tokenizers, with a focus on performances and vers...
25 versions - Latest release: about 1 month ago - 60 dependent packages - 281 dependent repositories - 798 thousand downloads total - 8,543 stars on GitHub - 3 maintainers
lindera-analyzer 0.30.0 💰
A morphological analysis library.
10 versions - Latest release: about 2 months ago - 2 dependent packages - 1 dependent repository - 12.9 thousand downloads total - 349 stars on GitHub - 1 maintainer
lindera-ko-dic 0.30.0 💰
Top 7.2% on crates.io
A Japanese morphological dictionary for ko-dic.
38 versions - Latest release: about 2 months ago - 3 dependent packages - 24 dependent repositories - 183 thousand downloads total - 349 stars on GitHub - 1 maintainer
lindera-cc-cedict 0.30.0 💰
Top 10.0% on crates.io
A Japanese morphological dictionary for CC-CEDICT.
37 versions - Latest release: about 2 months ago - 2 dependent packages - 2 dependent repositories - 40.7 thousand downloads total - 349 stars on GitHub - 1 maintainer
lindera-dictionary 0.30.0 💰
Top 5.2% on crates.io
A Japanese morphological dictionary.
53 versions - Latest release: about 2 months ago - 10 dependent packages - 238 dependent repositories - 341 thousand downloads total - 349 stars on GitHub - 4 maintainers
lindera 0.30.0 💰
Top 5.4% on crates.io
A morphological analysis library.
73 versions - Latest release: about 2 months ago - 10 dependent packages - 124 dependent repositories - 193 thousand downloads total - 349 stars on GitHub - 4 maintainers
lindera-filter 0.30.0 💰
Top 8.4% on crates.io
Character and token filters for Lindera.
14 versions - Latest release: about 2 months ago - 3 dependent packages - 16 dependent repositories - 29.2 thousand downloads total - 349 stars on GitHub - 1 maintainer
lindera-cli 0.30.0 💰
A morphological analysis command line interface.
72 versions - Latest release: about 2 months ago - 25.5 thousand downloads total - 349 stars on GitHub - 4 maintainers
lindera-cc-cedict-builder 0.30.0 💰
Top 6.2% on crates.io
A Chinese morphological dictionary builder for CC-CEDICT.
38 versions - Latest release: about 2 months ago - 6 dependent packages - 40 dependent repositories - 313 thousand downloads total - 349 stars on GitHub - 1 maintainer
lindera-tokenizer 0.30.0 💰
Top 7.1% on crates.io
A morphological analysis library.
10 versions - Latest release: about 2 months ago - 12 dependent packages - 3 dependent repositories - 117 thousand downloads total - 349 stars on GitHub - 1 maintainer
bleuscore 0.1.2
A fast bleu score calculator
4 versions - Latest release: about 1 month ago - 739 downloads total - 0 stars on GitHub - 1 maintainer
tusk_lexer 0.4.7
The lexical analysis component of Tusk.
21 versions - Latest release: almost 3 years ago - 1 dependent package - 7.51 thousand downloads total - 1 maintainer
lexical_scanner 0.1.18
A simple lexer which creates over 115+ various tokens based on the rust programming language. Thi...
19 versions - Latest release: about 2 years ago - 8.17 thousand downloads total - 2 stars on GitHub - 1 maintainer
plex 0.3.0
Top 7.9% on crates.io
A syntax extension for writing lexers and parsers.
10 versions - Latest release: 7 months ago - 4 dependent packages - 20 dependent repositories - 36.5 thousand downloads total - 399 stars on GitHub - 1 maintainer
rye-grain 0.0.1
A Python to Rust translator
1 version - Latest release: over 1 year ago - 455 downloads total - 1 star on GitHub - 1 maintainer
vibrato 0.5.1
Vibrato: viterbi-based accelerated tokenizer
11 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 12.7 thousand downloads total - 295 stars on GitHub - 2 maintainers
azul-simplecss 0.1.1
A very simple CSS 2.1 tokenizer.
2 versions - Latest release: almost 5 years ago - 1 dependent package - 4 dependent repositories - 17.7 thousand downloads total - 32 stars on GitHub - 1 maintainer
tokenizer-lib 1.5.1
Tokenization utilities for building parsers in Rust
15 versions - Latest release: 9 months ago - 2 dependent packages - 1 dependent repository - 7.39 thousand downloads total - 2 stars on GitHub - 1 maintainer
chat-splitter 0.1.1 💰
Never exceed OpenAI's chat models' maximum number of tokens when using the async_openai Rust crate
2 versions - Latest release: 9 months ago - 646 downloads total - 2 stars on GitHub - 1 maintainer
ast-rs 0.0.1
AST Toolkit for Rust
1 version - Latest release: almost 2 years ago - 406 downloads total - 0 stars on GitHub - 1 maintainer
vaporetto_tantivy 0.20.0
Vaporetto Tokenizer for Tantivy
10 versions - Latest release: 12 months ago - 3.27 thousand downloads total - 215 stars on GitHub - 1 maintainer
vaporetto_rules 0.6.3
Rule-base filters for Vaporetto
10 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 33.9 thousand downloads total - 215 stars on GitHub - 1 maintainer
vaporetto 0.6.3
Vaporetto: a pointwise prediction based tokenizer
16 versions - Latest release: about 1 year ago - 3 dependent packages - 1 dependent repository - 72.6 thousand downloads total - 215 stars on GitHub - 1 maintainer
rustrawi 0.1.2 💰
Rust port of the original PHP Sastrawi
3 versions - Latest release: over 1 year ago - 830 downloads total - 0 stars on GitHub - 1 maintainer
sqlite3-parser 0.12.0
SQL parser (as understood by SQLite)
11 versions - Latest release: 7 months ago - 3 dependent packages - 2 dependent repositories - 89.7 thousand downloads total - 42 stars on GitHub - 1 maintainer
sentencepiece-model 0.1.0 💰
Sentencepiece model parser
1 version - Latest release: 7 months ago - 360 downloads total - 0 stars on GitHub - 1 maintainer
sana 0.1.1
Create lexers easily
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 1.08 thousand downloads total - 1 maintainer
libsimple 0.2.1
Rust bindings to simple, a SQLite3 fts5 tokenizer which supports Chinese and PinYin.
4 versions - Latest release: about 2 months ago - 639 downloads total - 1 maintainer
basic_lexer 0.2.1
Basic lexical analyzer for parsing and compiling
6 versions - Latest release: over 2 years ago - 2.02 thousand downloads total - 0 stars on GitHub - 1 maintainer
blex 0.2.2
A lightweight lexing framework
4 versions - Latest release: about 1 year ago - 1 dependent package - 1.34 thousand downloads total - 0 stars on GitHub - 1 maintainer
sana_derive 0.1.1
The derive macro for Sana
2 versions - Latest release: almost 4 years ago - 1 dependent package - 1 dependent repository - 1.56 thousand downloads total - 1 maintainer
pretok 0.1.0
A string pre-tokenizer for C-like syntaxes.
1 version - Latest release: over 3 years ago - 1 dependent repository - 506 downloads total - 0 stars on GitHub - 1 maintainer
blingfire 1.0.0
Wrapper for the BlingFire tokenization library
5 versions - Latest release: almost 4 years ago - 49.6 thousand downloads total - 15 stars on GitHub - 1 maintainer
blingfire-sys 1.0.1
Bindings to the BlingFire C++ library
5 versions - Latest release: almost 4 years ago - 1 dependent package - 50.3 thousand downloads total - 15 stars on GitHub - 1 maintainer
aleph-alpha-tokenizer 0.3.1
A fast implementation of a wordpiece-inspired tokenizer
4 versions - Latest release: almost 4 years ago - 2.05 thousand downloads total - 6 stars on GitHub - 1 maintainer
nnsplit 0.5.9
A tool to split text using a neural network. For sentence boundary detection, compound splitting ...
29 versions - Latest release: about 1 year ago - 1 dependent repository - 13.6 thousand downloads total - 489 stars on GitHub - 1 maintainer
regex-tokenizer 0.1.1
A regex tokenizer
2 versions - Latest release: about 1 year ago - 606 downloads total - 0 stars on GitHub - 2 maintainers
token 1.0.0-rc1
A simple string-tokenizer (and sentence splitter) Note: If you find that you would like to use t...
1 version - Latest release: over 9 years ago - 1 dependent repository - 2.01 thousand downloads total - 5 stars on GitHub - 1 maintainer
tokengeex 1.0.1
TokenGeeX is an efficient tokenizer for code based on UnigramLM and TokenMonster.
10 versions - Latest release: 15 days ago - 2.99 thousand downloads total - 3 stars on GitHub - 1 maintainer
javascript_lexer 0.1.8
Javascript lexer
9 versions - Latest release: about 4 years ago - 4.78 thousand downloads total - 8 stars on GitHub - 1 maintainer
sana_core 0.1.1
The core of Sana
2 versions - Latest release: almost 4 years ago - 2 dependent packages - 1 dependent repository - 1.91 thousand downloads total - 1 maintainer
jayce 12.1.0
jayce is a tokenizer 🌌
34 versions - Latest release: 3 months ago - 10.3 thousand downloads total - 1 star on GitHub - 1 maintainer
lang_pt 0.1.2
A parser tool to generate recursive descent top down parser.
3 versions - Latest release: over 1 year ago - 950 downloads total - 6 stars on GitHub - 1 maintainer
giron 0.1.2
ECMAScript parser which outputs ESTree JSON.
3 versions - Latest release: over 4 years ago - 1 dependent repository - 1.4 thousand downloads total - 20 stars on GitHub - 1 maintainer
tantivy-tokenizer-tiny-segmenter 0.3.0
A Japanese tokenizer for Tantivy, based on TinySegmenter.
3 versions - Latest release: over 4 years ago - 1 dependent repository - 2.23 thousand downloads total - 12 stars on GitHub - 1 maintainer
char-lex 1.0.5
Create easy enum based lexers
11 versions - Latest release: about 4 years ago - 4.24 thousand downloads total - 1 maintainer
another-tiktoken-rs 0.1.2
Library for encoding and decoding with the tiktoken library in Rust
3 versions - Latest release: 9 months ago - 849 downloads total - 200 stars on GitHub - 1 maintainer
tiktoken-rs 0.5.9
Top 7.2% on crates.io
Library for encoding and decoding with the tiktoken library in Rust
28 versions - Latest release: 17 days ago - 39 dependent packages - 73 dependent repositories - 319 thousand downloads total - 200 stars on GitHub - 1 maintainer
simple-cursor 0.1.1
A super simple character cursor implementation geared towards lexers/tokenizers.
2 versions - Latest release: 11 months ago - 629 downloads total - 0 stars on GitHub - 1 maintainer
gpt_tokenizer 0.1.0
Rust BPE Encoder Decoder (Tokenizer) for GPT-2 / GPT-3
1 version - Latest release: about 1 year ago - 1 dependent package - 478 downloads total - 12 stars on GitHub - 1 maintainer
lindera-ipadic-neologd 0.30.0 💰
A Japanese morphological dictionary for IPADIC NEologd.
8 versions - Latest release: about 2 months ago - 1 dependent package - 4.11 thousand downloads total - 349 stars on GitHub - 1 maintainer
punkt 1.0.5
An implementation of a Punkt sentence tokenizer
8 versions - Latest release: over 5 years ago - 3 dependent packages - 3 dependent repositories - 11.5 thousand downloads total - 34 stars on GitHub - 1 maintainer
lindera-ipadic 0.30.0 💰
Top 6.1% on crates.io
A Japanese morphological dictionary for IPADIC.
59 versions - Latest release: about 2 months ago - 4 dependent packages - 126 dependent repositories - 203 thousand downloads total - 349 stars on GitHub - 4 maintainers
lindera-core 0.30.0 💰
Top 4.9% on crates.io
A morphological analysis library.
54 versions - Latest release: about 2 months ago - 28 dependent packages - 239 dependent repositories - 338 thousand downloads total - 349 stars on GitHub - 4 maintainers
lindera-unidic 0.30.0 💰
Top 8.7% on crates.io
A Japanese morphological dictionary for UniDic.
36 versions - Latest release: about 2 months ago - 2 dependent packages - 3 dependent repositories - 87.3 thousand downloads total - 349 stars on GitHub - 1 maintainer
lindera-unidic-builder 0.30.0 💰
Top 6.2% on crates.io
A Japanese morphological dictionary builder for UniDic.
47 versions - Latest release: about 2 months ago - 6 dependent packages - 40 dependent repositories - 307 thousand downloads total - 349 stars on GitHub - 4 maintainers
lindera-compress 0.30.0 💰
Top 7.0% on crates.io
A morphological analysis library.
35 versions - Latest release: about 2 months ago - 5 dependent packages - 21 dependent repositories - 164 thousand downloads total - 349 stars on GitHub - 1 maintainer
lindera-decompress 0.30.0 💰
Top 5.7% on crates.io
A morphological analysis library.
35 versions - Latest release: about 2 months ago - 11 dependent packages - 41 dependent repositories - 307 thousand downloads total - 349 stars on GitHub - 1 maintainer
lindera-ipadic-builder 0.30.0 💰
Top 5.6% on crates.io
A Japanese morphological dictionary builder for IPADIC.
54 versions - Latest release: about 2 months ago - 7 dependent packages - 235 dependent repositories - 324 thousand downloads total - 349 stars on GitHub - 4 maintainers
lindera-ko-dic-builder 0.30.0 💰
Top 6.2% on crates.io
A Korean morphological dictionary builder for ko-dic.
42 versions - Latest release: about 2 months ago - 6 dependent packages - 40 dependent repositories - 305 thousand downloads total - 349 stars on GitHub - 4 maintainers
lindera-ipadic-neologd-builder 0.30.0 💰
Top 8.0% on crates.io
A Japanese morphological dictionary builder for IPADIC NEologd.
15 versions - Latest release: about 2 months ago - 3 dependent packages - 3 dependent repositories - 157 thousand downloads total - 349 stars on GitHub - 4 maintainers
xmlparser 0.13.6
Top 4.8% on crates.io
Pull-based, zero-allocation XML parser.
24 versions - Latest release: 8 months ago - 35 dependent packages - 2,453 dependent repositories - 16.8 million downloads total - 128 stars on GitHub - 2 maintainers
text-splitter 0.13.1
Split text into semantic chunks, up to a desired chunk size. Supports calculating length by chara...
30 versions - Latest release: 25 days ago - 5 dependent packages - 1 dependent repository - 37.3 thousand downloads total - 135 stars on GitHub - 1 maintainer
svgtypes 0.15.1
Top 6.5% on crates.io
SVG types parser.
23 versions - Latest release: 25 days ago - 26 dependent packages - 532 dependent repositories - 1.99 million downloads total - 66 stars on GitHub - 1 maintainer
logos-derive2 💰
Create ridiculously fast Lexers
6 versions - Latest release: 29 days ago - 1 dependent package - 1.54 thousand downloads total - 2,632 stars on GitHub - 1 maintainer
logos-codegen2 💰
Create ridiculously fast Lexers
6 versions - Latest release: 29 days ago - 2 dependent packages - 1.55 thousand downloads total - 2,632 stars on GitHub - 1 maintainer
logos 0.14.0 💰
Top 2.8% on crates.io
Create ridiculously fast Lexers
50 versions - Latest release: 4 months ago - 235 dependent packages - 606 dependent repositories - 5.75 million downloads total - 2,632 stars on GitHub - 2 maintainers
libsql-sqlite3-parser 0.11.1
SQL parser (as understood by SQLite) (libsql fork)
2 versions - Latest release: 3 months ago - 1 dependent package - 15.5 thousand downloads total - 1 maintainer
lexers 0.1.4
Tools for tokenizing and scanning
11 versions - Latest release: about 2 years ago - 8 dependent packages - 4 dependent repositories - 14.4 thousand downloads total - 64 stars on GitHub - 1 maintainer
htmlparser 0.1.1
Pull-based, zero-allocation HTML parser.
2 versions - Latest release: 11 months ago - 2 dependent packages - 2.62 thousand downloads total - 0 stars on GitHub - 1 maintainer
bracoxide 0.1.3
A feature-rich library for brace pattern combination, permutation generation, and error handling.
4 versions - Latest release: 9 months ago - 2 dependent packages - 6 dependent repositories - 89.4 thousand downloads total - 1 star on GitHub - 1 maintainer
svgrtypes 0.42.2
SVG types parser.
16 versions - Latest release: 19 days ago - 2 dependent packages - 5.34 thousand downloads total - 1 maintainer
rustfst-ffi 1.0.1
Library for constructing, combining, optimizing, and searching weighted finite-state transducers ...
13 versions - Latest release: about 1 month ago - 3.61 thousand downloads total - 138 stars on GitHub - 1 maintainer
rustfst 1.0.1
Library for constructing, combining, optimizing, and searching weighted finite-state transducers ...
47 versions - Latest release: about 1 month ago - 3 dependent packages - 1 dependent repository - 286 thousand downloads total - 138 stars on GitHub - 1 maintainer
langbox 0.5.0
A simple framework to build compilers and interpreters
8 versions - Latest release: 8 months ago - 2.61 thousand downloads total - 0 stars on GitHub - 1 maintainer
uscan 0.1.3
A universal source code scanner
4 versions - Latest release: over 1 year ago - 1.24 thousand downloads total - 0 stars on GitHub - 1 maintainer
tiniestsegmenter 0.1.1
Compact Japanese segmenter
2 versions - Latest release: 21 days ago - 226 downloads total - 0 stars on GitHub - 1 maintainer
logos-derive 0.14.0 💰
Top 3.5% on crates.io
Create ridiculously fast Lexers
46 versions - Latest release: 4 months ago - 7 dependent packages - 539 dependent repositories - 5.75 million downloads total - 2,632 stars on GitHub - 2 maintainers
html5gum 0.5.7
Top 8.3% on crates.io
A WHATWG-compliant HTML5 tokenizer and tag soup parser.
14 versions - Latest release: 10 months ago - 2 dependent packages - 220 dependent repositories - 639 thousand downloads total - 146 stars on GitHub - 1 maintainer
logos-codegen 0.14.0 💰
Top 6.2% on crates.io
Create ridiculously fast Lexers
2 versions - Latest release: 4 months ago - 2 dependent packages - 28 dependent repositories - 1.42 million downloads total - 2,632 stars on GitHub - 2 maintainers
nipah_tokenizer 0.1.0
A powerful yet simple text tokenizer for your everyday needs!
1 version - Latest release: over 1 year ago - 409 downloads total - 0 stars on GitHub - 1 maintainer
sqlite3_tokenizer 0.1.0
Tokenizes SQL strings as SQLite would
1 version - Latest release: almost 9 years ago - 1.84 thousand downloads total - 0 stars on GitHub - 1 maintainer
regex-lexer 0.2.0
A regex-based lexer (tokenizer)
3 versions - Latest release: almost 2 years ago - 3 dependent packages - 4 dependent repositories - 9.53 thousand downloads total - 6 stars on GitHub - 1 maintainer
rust-forth-tokenizer 0.2.0
A Forth tokenizer written in Rust.
9 versions - Latest release: over 4 years ago - 1 dependent package - 5.13 thousand downloads total - 1 star on GitHub - 1 maintainer
sql-script-parser 0.1.2 💰
sql-script-parser iterates over SQL statements in SQL script.
3 versions - Latest release: about 3 years ago - 1.42 thousand downloads total - 2 stars on GitHub - 1 maintainer
tokenizer 0.1.2
Thai text tokenizer
2 versions - Latest release: about 4 years ago - 1.18 thousand downloads total - 3 stars on GitHub - 1 maintainer
regex-lexer-lalrpop 0.3.0
A regex-based lexer (tokenizer)
4 versions - Latest release: over 2 years ago - 1.35 thousand downloads total - 0 stars on GitHub - 1 maintainer
nlpo3 1.3.2
Thai natural language processing library, with Python and Node bindings
7 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 4.6 thousand downloads total - 30 stars on GitHub - 2 maintainers
nlpo3-cli 0.2.0
Command line interface for nlpO3, a Thai natural language processing library
3 versions - Latest release: almost 3 years ago - 1.12 thousand downloads total - 30 stars on GitHub - 2 maintainers
fuzzy-pickles 0.1.1
A low-level parser of Rust source code with high-level visitor implementations
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 1.59 thousand downloads total - 7 stars on GitHub - 1 maintainer
erl_tokenize 0.6.1 💰
Erlang source code tokenizer
28 versions - Latest release: 3 months ago - 5 dependent packages - 3 dependent repositories - 22.3 thousand downloads total - 8 stars on GitHub - 1 maintainer
Related Keywords
lexer (37)
parser (37)
rust (33)
analyzer (23)
morphological (22)
nlp (20)
library (20)
multilingual (19)
parsing (17)
japanese (12)
dictionary (10)
scanner (10)
lexical (10)
no_std (9)
lexer-generator (9)
analysis (9)
tokenization (8)
token (7)
tantivy (6)
text (6)
python (6)
generator (6)
builder (5)
machine-learning (5)
openai (4)
ai (4)
morphological-analysis (4)
segmentation (4)
natural-language-processing (4)
bpe (4)
deep-learning (4)
rust-lang (4)
ipadic (4)
sql (4)
html (3)
cli (3)
lex (3)
language (3)
text-processing (3)
thai (3)
chinese (3)
rust-crate (3)
regex (3)
sentence (3)
stemmer (3)
svg (3)
gpt (3)
alpino (3)
dutch (3)
blingfire (2)
rust-wrapper (2)
transducers (2)
wfst (2)
thai-language (2)
nodejs (2)
speech-recognition (2)
shortest-path (2)
chatgpt (2)
hacktoberfest (2)
split (2)
openfst (2)
word-segmentation (2)
sqlite (2)
kaldi-asr (2)
kaldi (2)
splitter (2)
finite-state-transducers (2)
fsts (2)
string (2)
transformer (2)
html5 (2)
whatwg (2)
xml (2)
unidic (2)
neologd (2)
css (2)
dfa (2)
language-model (2)
graph (2)
fst (2)
acceptor (2)
korean (2)
finite-state-acceptors (2)
composition (2)
parser-generator (2)
javascript (2)
c (2)
indentation (2)
processing (2)
automata (2)
asr (2)
transducer (2)
cc-cedict (2)
ko-dic (2)
brace-expansion (1)
braces (1)
combination (1)
ffi (1)
permutation (1)
brace_expansion (1)