crates.io "text-processing" keyword
View the packages on the crates.io package registry that are tagged with the "text-processing" keyword.
whichlang 0.1.1
A blazingly fast and lightweight language detection library for Rust.
2 versions - Latest release: about 1 year ago - 2 dependent packages - 1 dependent repository - 194 thousand downloads total - 422 stars on GitHub - 3 maintainers
texfmt 0.1.0
(La)TeX formatter.
1 version - Latest release: over 3 years ago - 1.55 thousand downloads total - 0 stars on GitHub - 1 maintainer
docfmt 0.1.1
A document formatter using Handlebars templates
2 versions - Latest release: almost 2 years ago - 2.78 thousand downloads total - 0 stars on GitHub - 1 maintainer
drug-extraction-core 0.1.2
A core library for extracting drugs from text records
3 versions - Latest release: over 3 years ago - 1 dependent package - 4.49 thousand downloads total - 3 stars on GitHub - 1 maintainer
aho-corasick 1.1.4 💰
Fast multiple substring searching.
62 versions - Latest release: 3 months ago - 144 dependent packages - 66,037 dependent repositories - 645 million downloads total - 1,170 stars on GitHub - 1 maintainer - Top 2.2% on crates.io
aho-corasick-unsafe 0.0.4 💰
Fast multiple substring searching.
4 versions - Latest release: over 1 year ago - 6.12 thousand downloads total - 1,170 stars on GitHub - 1 maintainer
sklears-feature-extraction 0.1.0-beta.1
Feature extraction from raw data (text, images)
3 versions - Latest release: about 1 month ago - 354 downloads total - 1 star on GitHub - 1 maintainer
ssort 0.3.0
CLI tool for suffix (inverse lexicographic) sorting
3 versions - Latest release: 5 months ago - 1.02 thousand downloads total - 0 stars on GitHub - 1 maintainer
repvar 0.14.3
A tiny CLI tool that replaces variables of the style `${KEY}` in text with their respecti...
17 versions - Latest release: 4 months ago - 2 dependent packages - 19.3 thousand downloads total - 0 stars on GitHub - 1 maintainer
rake 0.3.6
Rust implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm
13 versions - Latest release: 12 months ago - 1 dependent repository - 62.9 thousand downloads total - 34 stars on GitHub - 1 maintainer
split-char-from-str 0.0.0
A small utility to split a string into the first or last character (type `char`) and the rest (ty...
1 version - Latest release: over 1 year ago - 1.41 thousand downloads total - 0 stars on GitHub - 1 maintainer
ricat 0.4.5
A Rust-based implementation of the classic UNIX `cat` command
20 versions - Latest release: over 1 year ago - 24.2 thousand downloads total - 1 star on GitHub - 1 maintainer
flashtext2 0.2.0
The FlashText algorithm implemented in Rust
6 versions - Latest release: over 1 year ago - 9.04 thousand downloads total - 9 stars on GitHub - 1 maintainer
html-to-markdown-rs 2.24.1
High-performance HTML to Markdown converter using the astral-tl parser. Part of the Kreuzberg eco...
81 versions - Latest release: 3 days ago - 49.8 thousand downloads total - 409 stars on GitHub - 1 maintainer
html-to-markdown-cli 2.24.1
Command-line interface for html-to-markdown - high-performance HTML to Markdown converter
80 versions - Latest release: 3 days ago - 5.23 thousand downloads total - 409 stars on GitHub - 1 maintainer
te 💰
A really simple, stripped down & readable regular expression alternative for matching text
1 version - Latest release: 4 days ago - 1 dependent package - 1 dependent repository - 1.9 thousand downloads total - 77 stars on GitHub - 1 maintainer
rew 0.3.0
A text processing CLI tool that rewrites FS paths according to a pattern.
3 versions - Latest release: almost 5 years ago - 4.72 thousand downloads total - 43 stars on GitHub - 1 maintainer
s3-concat 1.1.0
Concatenate Amazon S3 files remotely using flexible patterns
2 versions - Latest release: over 6 years ago - 3.38 thousand downloads total - 38 stars on GitHub - 1 maintainer
cyrla 0.1.0
Library for two-way conversion between Latin and Cyrillic script
1 version - Latest release: about 3 years ago - 1.64 thousand downloads total - 0 stars on GitHub - 1 maintainer
email-address-extractor 1.0.1
A blazingly fast command line tool written in pure safe Rust to automatically extract email addre...
2 versions - Latest release: over 1 year ago - 2.6 thousand downloads total - 1 star on GitHub - 1 maintainer
gfr 1.0.1
A blazingly-fast, Rust tool for finding patterns in code, inspired by 'gf'.
2 versions - Latest release: 6 months ago - 776 downloads total - 0 stars on GitHub - 1 maintainer
qp-trie 0.8.2
An idiomatic and fast QP-trie implementation in pure Rust, written with an emphasis on safety.
21 versions - Latest release: over 2 years ago - 5 dependent packages - 25 dependent repositories - 219 thousand downloads total - 102 stars on GitHub - 1 maintainer - Top 8.9% on crates.io
whitespace_text_steganography 0.2.1
A steganography strategy that uses whitespace to hide text in other text
6 versions - Latest release: over 7 years ago - 1 dependent package - 1 dependent repository - 10.7 thousand downloads total - 0 stars on GitHub - 1 maintainer
agnostic-levenshtein 0.1.3
Levenshtein distance for ASCII or Unicode strings
4 versions - Latest release: 12 months ago - 4.47 thousand downloads total - 0 stars on GitHub - 1 maintainer
autoruby 0.5.1
Easily generate furigana for various document formats
8 versions - Latest release: over 2 years ago - 1 dependent package - 11.1 thousand downloads total - 10 stars on GitHub - 1 maintainer
noodler 0.1.0
A port of the python-ngram project that provides fuzzy search using N-gram.
1 version - Latest release: almost 3 years ago - 7.05 thousand downloads total - 0 stars on GitHub - 1 maintainer
ssam 0.2.0
Ssam, short for split sampler, splits one or more text-based input files into multiple sets using...
4 versions - Latest release: almost 5 years ago - 5.62 thousand downloads total - 2 stars on GitHub - 1 maintainer
tui_document 0.9.25
A Ratatui widget wrapping the Ropey crate.
9 versions - Latest release: 8 months ago - 3.85 thousand downloads total - 1 star on GitHub - 1 maintainer
sd 1.0.0
An intuitive find & replace CLI
23 versions - Latest release: about 2 years ago - 1 dependent package - 2 dependent repositories - 484 thousand downloads total - 6,762 stars on GitHub - 2 maintainers - Top 9.1% on crates.io
mime-rs 0.3.0
A text processing framework, inspired by Emacs lisp and keyboard macros.
1 version - Latest release: almost 3 years ago - 1.84 thousand downloads total - 7 stars on GitHub - 1 maintainer
codetypo 0.10.34
Source Code Spelling Correction
1 version - Latest release: 11 months ago - 1.26 thousand downloads total - 0 stars on GitHub - 1 maintainer
ed_join 1.1.1 💰
A Rust implementation of the Ed-Join algorithm for string similarity join
5 versions - Latest release: over 6 years ago - 7.01 thousand downloads total - 1 star on GitHub - 1 maintainer
fuzzy-string-distance 1.0.0
Fuzzy string distance comparisons
1 version - Latest release: over 1 year ago - 1.21 thousand downloads total - 1 star on GitHub - 1 maintainer
repa 0.1.5
Peak Performance Pattern Seeker
6 versions - Latest release: 12 months ago - 4.11 thousand downloads total - 0 stars on GitHub - 1 maintainer
s3-utils 1.1.0
Various tools and extensions around Amazon S3
2 versions - Latest release: almost 6 years ago - 3.28 thousand downloads total - 56 stars on GitHub - 1 maintainer
aneubeck-daachorse 1.1.1
Daachorse: Double-Array Aho-Corasick
2 versions - Latest release: over 1 year ago - 63.9 thousand downloads total - 227 stars on GitHub - 3 maintainers
daachorse 1.0.0
Daachorse: Double-Array Aho-Corasick
11 versions - Latest release: over 3 years ago - 12 dependent packages - 4 dependent repositories - 630 thousand downloads total - 232 stars on GitHub - 2 maintainers - Top 9.6% on crates.io
mpatch 1.3.5
A smart, context-aware patch tool that applies diffs using fuzzy matching, ideal for AI-generated...
14 versions - Latest release: about 1 month ago - 93.8 thousand downloads total - 2 stars on GitHub - 1 maintainer
textsurf 0.5.2
Webservice for efficiently serving multiple plain text documents or excerpts thereof (by unicode ...
10 versions - Latest release: 12 days ago - 2.04 thousand downloads total - 1 star on GitHub - 1 maintainer
autoruby-cli 0.5.1
CLI to easily generate furigana for various document formats
8 versions - Latest release: over 2 years ago - 10.7 thousand downloads total - 6 stars on GitHub - 1 maintainer
markovish 0.2.2
Simple Markov chain implementation for text generation
8 versions - Latest release: 11 months ago - 1 dependent package - 10.9 thousand downloads total - 7 stars on GitHub - 1 maintainer
typope 0.4.0
Pedantic source code checker for orthotypography mistakes and other typographical errors
6 versions - Latest release: 11 months ago - 5.41 thousand downloads total - 1 star on GitHub - 1 maintainer
spongebob 2.0.1
A utility to convert text to spongebob case a.k.a tHe MoCkInG sPoNgEbOb MeMe.
8 versions - Latest release: about 1 year ago - 8.7 thousand downloads total - 1 star on GitHub - 1 maintainer
matcher_rs 0.5.9
A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS problems in word matchin...
42 versions - Latest release: about 1 month ago - 48.5 thousand downloads total - 15 stars on GitHub - 1 maintainer
drug-extraction-cli 1.3.0
A CLI for extracting drugs from text records
6 versions - Latest release: almost 2 years ago - 9.34 thousand downloads total - 3 stars on GitHub - 1 maintainer
pray 1.5.0
A TUI tool for preparing a prompt for LLMs.
11 versions - Latest release: about 1 year ago - 8.24 thousand downloads total - 1 star on GitHub - 1 maintainer
wakuchin 0.3.0 💰
A next generation wakuchin researcher software written in Rust
3 versions - Latest release: over 3 years ago - 1 dependent package - 4.42 thousand downloads total - 1 star on GitHub - 1 maintainer
fastchr 0.3.0
Faster memchr using SIMD intrinsics
3 versions - Latest release: over 7 years ago - 1 dependent package - 6.28 thousand downloads total - 15 stars on GitHub - 1 maintainer
rawk-core 0.0.5
Core library for the AWK interpreter
5 versions - Latest release: 11 days ago - 109 downloads total - 0 stars on GitHub - 1 maintainer
deepfrog 0.2.1
A deep learning NLP suite (PoS,lemmatiser,NER) with FoLiA XML support
2 versions - Latest release: almost 5 years ago - 3.03 thousand downloads total - 19 stars on GitHub - 1 maintainer
strip-codeblocks 0.1.0
A Rust library to strip markdown code blocks from text, preserving only the inner content
1 version - Latest release: 2 months ago - 28 downloads total - 1 maintainer
red-sed 1.0.2
An experimental drop-in replacement for GNU sed, written in Rust
2 versions - Latest release: 16 days ago - 28 downloads total
unaccent 0.1.1
A Rust crate to remove accents from strings, inspired by PostgreSQL's unaccent extension.
2 versions - Latest release: 12 months ago - 22.8 thousand downloads total - 5 stars on GitHub - 1 maintainer
nlpo3 1.4.0
Thai natural language processing library, with Python and Node bindings
8 versions - Latest release: about 1 year ago - 1 dependent package - 1 dependent repository - 26.4 thousand downloads total - 38 stars on GitHub - 2 maintainers
smoltok-core 0.1.1
Byte-Pair Encoding tokenizer implementation in Rust
2 versions - Latest release: 28 days ago - 24 downloads total - 1 maintainer
uroman 0.6.3
A blazingly fast, self-contained Rust reimplementation of the uroman universal romanizer.
14 versions - Latest release: 3 months ago - 4.71 thousand downloads total - 35 stars on GitHub - 1 maintainer
flowmark 0.1.3
Fast, modern Markdown formatter with smart typography and paragraph wrapping
2 versions - Latest release: 3 months ago - 44 downloads total - 1 maintainer
r4d 3.1.0
Text oriented macro processor
70 versions - Latest release: over 3 years ago - 1 dependent package - 93.4 thousand downloads total - 16 stars on GitHub - 1 maintainer
kda-tools 1.3.1 💰
Tools for doing data management on a match journal, specifically for Hunt Showdown, but it'll work...
3 versions - Latest release: over 2 years ago - 4.79 thousand downloads total - 3 stars on GitHub - 1 maintainer
skan 0.1.0
Skan is a Rust-native, Java Scanner-inspired library that provides type-safe, convenient methods ...
1 version - Latest release: 5 months ago - 546 downloads total - 0 stars on GitHub - 1 maintainer
prunist 0.15.0
Experimental library for pruning tree structures based on priority rules; API may change
5 versions - Latest release: 14 days ago - 188 downloads total - 49 stars on GitHub - 1 maintainer
headson 0.15.0
Budget‑constrained JSON preview renderer
68 versions - Latest release: 14 days ago - 4.52 thousand downloads total - 49 stars on GitHub - 1 maintainer
jackdauer 0.1.2
Use this Rust crate to easily parse various time formats to durations
3 versions - Latest release: almost 3 years ago - 1 dependent package - 1 dependent repository - 8.62 thousand downloads total - 8 stars on GitHub - 1 maintainer
dcsv 0.3.3
Dynamic CSV reader, writer, editor
13 versions - Latest release: about 2 years ago - 3 dependent packages - 4 dependent repositories - 22 thousand downloads total - 2 stars on GitHub - 1 maintainer
ised 0.3.2
An interactive tool for find-and-replace across many files
6 versions - Latest release: 8 months ago - 3.18 thousand downloads total - 6 stars on GitHub - 1 maintainer
nlpo3-cli 0.2.0
Command line interface for nlpO3, a Thai natural language processing library
3 versions - Latest release: over 4 years ago - 4.08 thousand downloads total - 36 stars on GitHub - 2 maintainers
moguls 0.1.1
Let the words of financial moguls inspire and guide you in your quest for financial excellence ...
2 versions - Latest release: over 2 years ago - 2.66 thousand downloads total - 1 star on GitHub - 1 maintainer
buup 0.25.3
Core transformation library with zero dependencies
29 versions - Latest release: 4 months ago - 13.9 thousand downloads total - 7 stars on GitHub - 1 maintainer
dictutils 0.1.2
Dictionary utilities for Mdict and other formats
3 versions - Latest release: about 2 months ago - 127 downloads total - 1 star on GitHub - 1 maintainer
detex 0.2.1
Strip TeX/LaTeX commands from input files
3 versions - Latest release: about 1 month ago - 49 downloads total - 1 maintainer
sesdiff 0.3.1
Generates a shortest edit script (Myers' diff algorithm) to indicate how to get from the strings ...
8 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 11.8 thousand downloads total - 4 stars on GitHub - 1 maintainer
text-tags 0.1.0
A lightweight, text-tag markup parser
1 version - Latest release: 7 months ago - 446 downloads total - 5 stars on GitHub - 1 maintainer
sakurs-cli 0.1.1 💰
Command-line interface for Sakurs sentence boundary detection
2 versions - Latest release: 6 months ago - 786 downloads total - 3 stars on GitHub - 1 maintainer
textcon 0.2.1
Template text files with file/directory references for AI/LLM consumption
4 versions - Latest release: 23 days ago - 435 downloads total - 1 star on GitHub - 1 maintainer
bidi 0.1.1
Implementation of the Unicode Bidirectional Algorithm (UBA).
2 versions - Latest release: over 2 years ago - 2.91 thousand downloads total - 0 stars on GitHub - 1 maintainer
whetstone 1.0.0
Parses and evaluates string representations of mathematical expressions in various syntaxes
1 version - Latest release: 3 months ago - 24 downloads total - 1 maintainer
lazy-transform-str 0.0.6 💰
Lazy-copying lazy-allocated scanning `str` transformations. This is good e.g. for (un)escaping te...
6 versions - Latest release: over 5 years ago - 1 dependent package - 5 dependent repositories - 30.3 thousand downloads total - 1 star on GitHub - 1 maintainer
extract 0.1.1
A tool for extracting text from text.
2 versions - Latest release: almost 9 years ago - 4.17 thousand downloads total - 2 stars on GitHub - 1 maintainer
rjc 0.2.3
rjc converts the output of many commands, file-types, and strings to JSON, YAML, or TOML
6 versions - Latest release: over 2 years ago - 7.79 thousand downloads total - 1 star on GitHub - 1 maintainer
tdoc 0.9.2
Library and assorted CLI tools for working with FTML (Formatted Text Markup Language) documents
17 versions - Latest release: 22 days ago - 723 downloads total - 0 stars on GitHub - 1 maintainer
lngcnv 1.10.2 💰
linguistics: display pronunciation, translate between dialects, convert between orthographies; su...
57 versions - Latest release: 8 months ago - 67.3 thousand downloads total - 22 stars on GitHub - 1 maintainer
voirs-g2p 0.1.0-alpha.2
Grapheme-to-Phoneme conversion for VoiRS speech synthesis
2 versions - Latest release: 4 months ago - 1.21 thousand downloads total - 3 stars on GitHub - 1 maintainer
vtext 0.2.0
NLP with Rust
4 versions - Latest release: over 5 years ago - 3 dependent repositories - 14.7 thousand downloads total - 153 stars on GitHub - 1 maintainer
codetypo-vars 0.9.1
Source Code Spelling Correction
1 version - Latest release: 11 months ago - 1.08 thousand downloads total - 0 stars on GitHub - 1 maintainer
text_analysis 0.4.9
A robust multilingual text analysis CLI with context, N-grams, named entities, and CSV/JSON export.
26 versions - Latest release: 4 months ago - 23 thousand downloads total - 7 stars on GitHub - 1 maintainer
vi 0.8.0
An input method library for Vietnamese IME
21 versions - Latest release: 7 months ago - 27.6 thousand downloads total - 153 stars on GitHub - 1 maintainer
unic-ucd-utils
UNIC - Utilities for working with Unicode Code Points
4 versions - Latest release: 22 days ago - 1 dependent package - 5.79 thousand downloads total - 243 stars on GitHub - 1 maintainer
unic-ucd-core 0.6.0
UNIC - Unicode Character Database - Version
6 versions - Latest release: over 8 years ago - 11 dependent packages - 1 dependent repository - 20.7 thousand downloads total - 242 stars on GitHub - 1 maintainer
unic-utils 0.6.0
UNIC - Utilities
2 versions - Latest release: over 8 years ago - 9 dependent packages - 1 dependent repository - 8.8 thousand downloads total - 242 stars on GitHub - 1 maintainer
merge-whitespace-utils 1.1.0
Procedural macros for merging whitespace in const contexts
1 version - Latest release: about 1 year ago - 1.16 thousand downloads total - 2 stars on GitHub - 1 maintainer
fnew 1.0.1
A Unicode-aware line-oriented drop-in replacement for coreutils' fold.
2 versions - Latest release: almost 6 years ago - 3.26 thousand downloads total - 3 stars on GitHub - 1 maintainer
sttx 0.1.0
Utility belt for transforming speech-to-text data
1 version - Latest release: almost 2 years ago - 1.47 thousand downloads total - 0 stars on GitHub - 1 maintainer
codetypo-dict 0.12.7
Source Code Spelling Correction
2 versions - Latest release: 11 months ago - 1.92 thousand downloads total - 0 stars on GitHub - 1 maintainer
uniaxe 0.1.1
A Rust crate to replace Unicode letters with ASCII equivalents
2 versions - Latest release: about 5 years ago - 3.23 thousand downloads total - 6 stars on GitHub - 1 maintainer
nucleo-matcher 0.3.1
plug and play high performance fuzzy matcher
5 versions - Latest release: almost 2 years ago - 4 dependent packages - 11 dependent repositories - 1.21 million downloads total - 1,218 stars on GitHub - 3 maintainers
nucleo 0.5.0
plug and play high performance fuzzy matcher
9 versions - Latest release: almost 2 years ago - 7 dependent packages - 10 dependent repositories - 358 thousand downloads total - 1,218 stars on GitHub - 3 maintainers
pretok 0.1.0
A string pre-tokenizer for C-like syntaxes.
1 version - Latest release: over 5 years ago - 1 dependent repository - 1.69 thousand downloads total - 0 stars on GitHub - 1 maintainer
hck 0.11.5
A sharp cut(1) clone.
53 versions - Latest release: 2 months ago - 67.7 thousand downloads total - 720 stars on GitHub - 1 maintainer
ter
A cli to run text expressions and perform basic text operations such as filtering, ignoring and r...
2 versions - Latest release: 25 days ago - 2.76 thousand downloads total - 77 stars on GitHub - 1 maintainer
sakurs-core 0.1.1 💰
High-performance sentence boundary detection using Delta-Stack Monoid algorithm
2 versions - Latest release: 6 months ago - 885 downloads total - 3 stars on GitHub - 2 maintainers
Related Keywords
rust (90)
text (56)
unicode (42)
nlp (33)
internationalization (33)
crates (33)
locale-data (32)
cldr (32)
unic (32)
unicode-algorithms (32)
unicode-characters (32)
cli (31)
character-property (14)
linguistics (14)
search (14)
string (13)
command-line-tool (10)
parser (8)
aho-corasick (8)
markdown (8)
python (8)
rust-lang (8)
parsing (8)
regex (7)
multi (7)
language (7)
annotation (7)
spelling (7)
developer-tools (7)
tool (6)
standoff (6)
utilities (6)
pattern (6)
normalization (5)
rust-library (5)
performance (5)
command-line (5)
html (5)
utility (5)
tokenizer (5)
automation (5)
finite-state-machine (5)
substring-matching (5)
development (5)
pattern-matching (5)
spell-checker (4)
spell-checking (4)
llm (4)
linux (4)
string-matching (4)
no-std (4)
text-mining (4)
natural-language-processing (4)
chmod (4)
word (4)
code-typos (4)
common-misspellings (4)
extraction (4)
fuzzy-search (4)
common-mistakes (4)
ai (4)
malloc (4)
misspelling (4)
fuzzy-matching (4)
grep (4)
misspellings (4)
open-source (4)
levenshtein (4)
algorithm (4)
matcher (3)
java (3)
csv (3)
grep-like (3)
matching-engine (3)
sensitive-word (3)
text-analysis (3)
filesystem (3)
text-classification (3)
scripting (3)
json (3)
bidi (3)
yaml (3)
japanese (3)
sorting (3)
macros (3)
replace (3)
macro (3)
encoding (3)
hacktoberfest (3)
version (3)
rust-crate (3)
split (3)
file (3)
idna (3)
strings (3)
content-moderation (3)
english (2)
nodejs (2)
high-performance (2)
parse-combinator (2)