An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

crates.io "text-processing" keyword

Top 9.1% on crates.io
sd 1.0.0
An intuitive find & replace CLI
23 versions - Latest release: over 2 years ago - 1 dependent package - 2 dependent repositories - 494 thousand downloads total - 6,953 stars on GitHub - 2 maintainers
potential_utf 0.1.5 💰
Unvalidated string and character types
5 versions - Latest release: 15 days ago - 154 million downloads total - 1,777 stars on GitHub - 2 maintainers
sliceslice 0.4.3
A fast implementation of single-pattern substring search using SIMD acceleration
9 versions - Latest release: almost 2 years ago - 2 dependent repositories - 4.92 million downloads total - 99 stars on GitHub - 1 maintainer
tashkil 0.1.0 💰
A lightweight library for removing Arabic diacritics
1 version - Latest release: over 3 years ago - 1.58 thousand downloads total - 19 stars on GitHub - 1 maintainer
vi 0.8.0
An input method library for Vietnamese IME
21 versions - Latest release: 10 months ago - 28.2 thousand downloads total - 154 stars on GitHub - 1 maintainer
html-to-markdown-rs 3.2.2
High-performance HTML to Markdown converter using the astral-tl parser. Part of the Kreuzberg eco...
107 versions - Latest release: about 13 hours ago - 200 thousand downloads total - 569 stars on GitHub - 1 maintainer
Top 2.2% on crates.io
aho-corasick 1.1.4 💰
Fast multiple substring searching.
62 versions - Latest release: 6 months ago - 144 dependent packages - 66,037 dependent repositories - 760 million downloads total - 1,229 stars on GitHub - 1 maintainer
seams 0.1.1
High-throughput sentence extractor for Project Gutenberg texts with dialog-aware detection
2 versions - Latest release: 9 months ago - 1.08 thousand downloads total - 3 stars on GitHub - 1 maintainer
Top 8.4% on crates.io
bytelines 2.5.0
Read input lines as byte slices for high efficiency
10 versions - Latest release: over 2 years ago - 13 dependent packages - 30 dependent repositories - 317 thousand downloads total - 66 stars on GitHub - 1 maintainer
aho-corasick-unsafe 💰
Fast multiple substring searching.
4 versions - Latest release: about 17 hours ago - 6.31 thousand downloads total - 1,227 stars on GitHub - 1 maintainer
regexy 0.2.0
A simple and lightweight Rust library for working with regular expressions. The regexy crate prov...
2 versions - Latest release: over 1 year ago - 2.09 thousand downloads total - 1 star on GitHub - 1 maintainer
textcon 0.2.1
Template text files with file/directory references for AI/LLM consumption
4 versions - Latest release: 3 months ago - 451 downloads total - 1 star on GitHub - 1 maintainer
txt_processor 0.1.4
A little library for text processing
4 versions - Latest release: over 2 years ago - 4.86 thousand downloads total - 2 stars on GitHub - 1 maintainer
phonics 0.1.0
Phonetic spelling algorithms in Rust
2 versions - Latest release: almost 6 years ago - 2.87 thousand downloads total - 2 stars on GitHub - 1 maintainer
chunk_norris 0.2.1
A Rust library for splitting large text into smaller batches for LLM input.
3 versions - Latest release: about 1 year ago - 2.38 thousand downloads total - 1 star on GitHub - 1 maintainer
cutters 0.1.4
Rule based sentence segmentation library.
5 versions - Latest release: over 2 years ago - 6.84 thousand downloads total - 14 stars on GitHub - 1 maintainer
xxxxx_rust_sts 0.1.0
A collection of useful string and file utilities for Rust
1 version - Latest release: 10 months ago - 484 downloads total - 1 maintainer
sttx 0.1.0
Utility belt for transforming speech-to-text data
1 version - Latest release: about 2 years ago - 1.48 thousand downloads total - 0 stars on GitHub - 1 maintainer
aneubeck-daachorse 1.1.1
Daachorse: Double-Array Aho-Corasick
2 versions - Latest release: over 1 year ago - 88.1 thousand downloads total - 246 stars on GitHub - 3 maintainers
Top 9.6% on crates.io
daachorse 2.0.0
Daachorse: Double-Array Aho-Corasick
13 versions - Latest release: 8 days ago - 12 dependent packages - 4 dependent repositories - 841 thousand downloads total - 246 stars on GitHub - 2 maintainers
html-to-markdown-cli 3.2.0
Command-line interface for html-to-markdown - high-performance HTML to Markdown converter
104 versions - Latest release: 2 days ago - 6.12 thousand downloads total - 569 stars on GitHub - 1 maintainer
synoptic 2.2.9
A simple, low-level, syntax highlighting library with unicode support
33 versions - Latest release: over 1 year ago - 3 dependent packages - 2 dependent repositories - 561 thousand downloads total - 31 stars on GitHub - 1 maintainer
codetypo-cli 1.30.2
Source Code Spelling Correction
1 version - Latest release: about 1 year ago - 1.01 thousand downloads total - 0 stars on GitHub - 1 maintainer
pptx-to-md 0.4.0
Parse Microsoft PowerPoint files (.pptx) into Markdown (.md)
6 versions - Latest release: 10 months ago - 22.5 thousand downloads total - 2 stars on GitHub - 1 maintainer
srch 0.0.1 💰
Text Search For Humans
1 version - Latest release: over 2 years ago - 1.76 thousand downloads total - 78 stars on GitHub - 1 maintainer
waken_snowball 0.1.0
Rust implementation of Snowball stemming algorithms for 33 languages
1 version - Latest release: 9 months ago - 1.47 thousand downloads total - 849 stars on GitHub - 1 maintainer
bible-io 1.0.2
A Rust library for working with Bible text data structures
3 versions - Latest release: 4 months ago - 677 downloads total - 0 stars on GitHub - 1 maintainer
stam-python 0.12.1
STAM is a library for dealing with standoff annotations on text; this is the Python binding.
12 versions - Latest release: 3 months ago - 10.5 thousand downloads total - 1 star on GitHub - 1 maintainer
abbreviation_extractor 0.1.4
A library for extracting abbreviations from text.
5 versions - Latest release: over 1 year ago - 6.29 thousand downloads total - 1 star on GitHub - 1 maintainer
zahirscan 0.2.17
Token-efficient content compression for AI analysis using probabilistic template mining
13 versions - Latest release: 4 days ago - 475 downloads total - 0 stars on GitHub - 1 maintainer
korrektor 0.3.1 💰
Library to work with Uzbek language text processing
7 versions - Latest release: about 3 years ago - 8.86 thousand downloads total - 5 stars on GitHub - 2 maintainers
taginfo 0.1.0
Provides classification, validation of tag names from HTML, SVG, and MathML
1 version - Latest release: almost 6 years ago - 1.65 thousand downloads total - 1 star on GitHub - 1 maintainer
tgo 0.1.0
Heterogeneous data type transition; it's safe, lightweight and fast
1 version - Latest release: about 3 years ago - 1.63 thousand downloads total - 0 stars on GitHub - 1 maintainer
awkrs 0.1.38
Awk implementation in Rust with broad CLI compatibility, parallel records, and experimental Crane...
14 versions - Latest release: 4 days ago - 153 downloads total - 0 stars on GitHub - 1 maintainer
pukram2html 0.3.3
A Rust library for converting Pukram-formatted text to HTML.
6 versions - Latest release: 8 months ago - 3.63 thousand downloads total - 0 stars on gitlab.com - 1 maintainer
taggie 0.1.0
Edit audio tags in your favorite text editor
1 version - Latest release: over 5 years ago - 1.76 thousand downloads total - 24 stars on GitHub - 1 maintainer
Top 4.2% on crates.io
icu_normalizer 2.2.0 💰
API for normalizing text into Unicode Normalization Forms
19 versions - Latest release: 15 days ago - 10 dependent packages - 122 dependent repositories - 267 million downloads total - 1,777 stars on GitHub - 1 maintainer
Top 6.2% on crates.io
unic-ucd-segment 0.9.0
UNIC — Unicode Character Database — Segmentation Properties
3 versions - Latest release: about 7 years ago - 2 dependent packages - 1,688 dependent repositories - 23.2 million downloads total - 245 stars on GitHub - 1 maintainer
Top 6.8% on crates.io
unic-ucd 0.9.0
UNIC — Unicode Character Database
11 versions - Latest release: about 7 years ago - 9 dependent packages - 28 dependent repositories - 139 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 5.0% on crates.io
unic-segment 0.9.0
UNIC — Unicode Text Segmentation Algorithms
3 versions - Latest release: about 7 years ago - 8 dependent packages - 1,697 dependent repositories - 23.2 million downloads total - 245 stars on GitHub - 1 maintainer
Top 8.1% on crates.io
unic 0.9.0
UNIC: Unicode and Internationalization Crates
10 versions - Latest release: about 7 years ago - 4 dependent packages - 11 dependent repositories - 72.6 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 8.2% on crates.io
unic-ucd-case 0.9.0
UNIC — Unicode Character Database — Case Properties
4 versions - Latest release: about 7 years ago - 2 dependent packages - 27 dependent repositories - 135 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 9.4% on crates.io
unic-idna 0.9.0
UNIC — Unicode IDNA Compatibility Processing
8 versions - Latest release: about 7 years ago - 2 dependent packages - 24 dependent repositories - 86.8 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 9.9% on crates.io
unic-char 0.9.0
UNIC — Unicode Character Tools
4 versions - Latest release: about 7 years ago - 1 dependent package - 11 dependent repositories - 62.8 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 8.9% on crates.io
unic-char-basics 0.9.0
UNIC — Unicode Character Tools — Basic Stable Character Properties
2 versions - Latest release: about 7 years ago - 2 dependent packages - 11 dependent repositories - 60.1 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 7.1% on crates.io
unic-ucd-age 0.9.0
UNIC — Unicode Character Database — Age
7 versions - Latest release: about 7 years ago - 3 dependent packages - 79 dependent repositories - 452 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 5.9% on crates.io
unic-ucd-bidi 0.9.0
UNIC — Unicode Character Database — Bidi Properties
9 versions - Latest release: about 7 years ago - 6 dependent packages - 458 dependent repositories - 1.46 million downloads total - 245 stars on GitHub - 1 maintainer
unic-cli 0.9.0
UNIC Command-Line Tools
3 versions - Latest release: about 7 years ago - 5.05 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 9.3% on crates.io
unic-ucd-name_aliases 0.9.0
UNIC — Unicode Character Database — Name Aliases
1 version - Latest release: about 7 years ago - 1 dependent package - 27 dependent repositories - 122 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 8.0% on crates.io
unic-idna-punycode 0.9.0
UNIC — Implementation of Punycode (RFC 3492) algorithm
9 versions - Latest release: about 7 years ago - 3 dependent packages - 19 dependent repositories - 123 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 7.1% on crates.io
unic-ucd-normal 0.9.0
UNIC — Unicode Character Database — Normalization Properties
10 versions - Latest release: about 7 years ago - 3 dependent packages - 90 dependent repositories - 495 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 4.2% on crates.io
unic-char-property 0.9.0
UNIC — Unicode Character Tools — Character Property taxonomy, contracts and build macros
4 versions - Latest release: about 7 years ago - 19 dependent packages - 2,618 dependent repositories - 41.4 million downloads total - 245 stars on GitHub - 1 maintainer
Top 9.9% on crates.io
unic-emoji 0.9.0
UNIC — Unicode Emoji
3 versions - Latest release: about 7 years ago - 1 dependent package - 11 dependent repositories - 68.1 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 5.9% on crates.io
unic-common 0.9.0
UNIC — Common Utilities
3 versions - Latest release: about 7 years ago - 2 dependent packages - 2,619 dependent repositories - 41.4 million downloads total - 245 stars on GitHub - 1 maintainer
Top 7.1% on crates.io
unic-ucd-hangul 0.9.0
UNIC — Unicode Character Database — Hangul Syllable Composition & Decomposition
2 versions - Latest release: about 7 years ago - 3 dependent packages - 89 dependent repositories - 487 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 5.3% on crates.io
unic-ucd-ident 0.9.0
UNIC — Unicode Character Database — Identifier Properties
3 versions - Latest release: about 7 years ago - 8 dependent packages - 304 dependent repositories - 16.8 million downloads total - 245 stars on GitHub - 1 maintainer
Top 4.2% on crates.io
unic-char-range 0.9.0
UNIC — Unicode Character Tools — Character Range and Iteration
4 versions - Latest release: about 7 years ago - 20 dependent packages - 2,619 dependent repositories - 41.4 million downloads total - 245 stars on GitHub - 1 maintainer
Top 4.2% on crates.io
unic-ucd-version 0.9.0
UNIC — Unicode Character Database — Version
3 versions - Latest release: about 7 years ago - 18 dependent packages - 2,619 dependent repositories - 41.4 million downloads total - 245 stars on GitHub - 1 maintainer
Top 7.4% on crates.io
unic-ucd-name 0.9.0
UNIC — Unicode Character Database — Name
4 versions - Latest release: about 7 years ago - 4 dependent packages - 30 dependent repositories - 143 thousand downloads total - 245 stars on GitHub - 1 maintainer
Top 8.2% on crates.io
unic-ucd-block 0.9.0
UNIC — Unicode Character Database — Unicode Blocks
2 versions - Latest release: about 7 years ago - 3 dependent packages - 34 dependent repositories - 126 thousand downloads total - 234 stars on GitHub - 1 maintainer
Top 6.7% on crates.io
unic-ucd-common 0.9.0
UNIC — Unicode Character Database — Common Properties
3 versions - Latest release: about 7 years ago - 6 dependent packages - 31 dependent repositories - 370 thousand downloads total - 234 stars on GitHub - 1 maintainer
Top 9.5% on crates.io
unic-idna-mapping 0.9.0
UNIC — IDNA — IDNA Mapping Table
6 versions - Latest release: about 7 years ago - 1 dependent package - 19 dependent repositories - 84.4 thousand downloads total - 234 stars on GitHub - 1 maintainer
Top 6.1% on crates.io
unic-bidi 0.9.0
UNIC — Unicode Bidirectional Algorithm
8 versions - Latest release: about 7 years ago - 6 dependent packages - 379 dependent repositories - 953 thousand downloads total - 234 stars on GitHub - 1 maintainer
Top 6.3% on crates.io
unic-normal 0.9.0
UNIC — Unicode Normalization Forms
9 versions - Latest release: about 7 years ago - 8 dependent packages - 79 dependent repositories - 439 thousand downloads total - 230 stars on GitHub - 1 maintainer
Top 5.1% on crates.io
unic-ucd-category 0.9.0
UNIC — Unicode Character Database — General Category
5 versions - Latest release: about 7 years ago - 14 dependent packages - 384 dependent repositories - 2.17 million downloads total - 234 stars on GitHub - 1 maintainer
Top 4.7% on crates.io
unic-emoji-char 0.9.0
UNIC — Unicode Emoji — Emoji Character Properties
3 versions - Latest release: about 7 years ago - 16 dependent packages - 438 dependent repositories - 7 million downloads total - 234 stars on GitHub - 1 maintainer
opencc-jieba-rs 0.7.4
High-performance Chinese text conversion and segmentation using Jieba and OpenCC-style dictionaries.
6 versions - Latest release: 25 days ago - 32.3 thousand downloads total - 0 stars on GitHub - 1 maintainer
wakuchin_cli 0.3.0 💰
A next generation wakuchin researcher software written in Rust
3 versions - Latest release: over 3 years ago - 3.71 thousand downloads total - 1 star on GitHub - 1 maintainer
splitt 0.1.0
Split text in your terminal
1 version - Latest release: about 1 year ago - 693 downloads total - 0 stars on GitHub - 1 maintainer
matcher_rs 0.15.2
A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS problems in word matchin...
66 versions - Latest release: 6 days ago - 48.9 thousand downloads total - 15 stars on GitHub - 1 maintainer
textframe 0.4.1
Library to query plain text documents by unicode offset without loading them all into memory
6 versions - Latest release: about 1 month ago - 2.06 thousand downloads total - 2 stars on GitHub - 1 maintainer
gdengine 0.4.0
Game design document creation tool
4 versions - Latest release: over 4 years ago - 5.47 thousand downloads total - 4 stars on GitHub - 1 maintainer
bigstr 0.1.1 💰
A command-line tool to make string BIG
2 versions - Latest release: almost 2 years ago - 2.48 thousand downloads total - 1 star on GitHub - 1 maintainer
langdetect-rs 0.2.3
Language detection in Rust. Port of Mimino666's langdetect.
5 versions - Latest release: 5 months ago - 230 downloads total - 1 maintainer
s3-concat 1.1.0
Concatenate Amazon S3 files remotely using flexible patterns
2 versions - Latest release: over 6 years ago - 3.38 thousand downloads total - 38 stars on GitHub - 1 maintainer
text-sanitizer 1.6.0
Convert text to plain ASCII text
11 versions - Latest release: about 3 years ago - 12.8 thousand downloads total - 2 stars on GitHub - 1 maintainer
cindex 0.5.2
CSV indexing library
15 versions - Latest release: over 2 years ago - 2 dependent packages - 2 dependent repositories - 21 thousand downloads total - 0 stars on GitHub - 1 maintainer
pick-cli 0.1.26 💰
Extract, filter, and transform values from JSON, YAML, TOML, .env, HTTP headers, logfmt, CSV, and...
11 versions - Latest release: about 1 month ago - 159 downloads total - 11 stars on GitHub - 1 maintainer
sqlitepipe 0.2.3
A simple tool for piping the output of a command into sqlite databases.
7 versions - Latest release: 13 days ago - 99 downloads total - 1 maintainer
opencc-fmmseg 0.9.1
High-performance Chinese conversion library (Simplified ↔ Traditional) using OpenCC lexicons and ...
9 versions - Latest release: 27 days ago - 2.88 thousand downloads total - 0 stars on GitHub - 1 maintainer
stamd 0.1.0
Webservice for working with stand-off annotations on text (STAM)
1 version - Latest release: over 1 year ago - 953 downloads total - 0 stars on GitHub - 1 maintainer
triplets 0.17.4-alpha
Composable data sampling primitives for deterministic multi-source ML/AI training-data orchestrat...
28 versions - Latest release: 10 days ago - 333 downloads total - 0 stars on GitHub - 1 maintainer
typope 0.4.1
Pedantic source code checker for orthotypography mistakes and other typographical errors
7 versions - Latest release: 9 days ago - 5.42 thousand downloads total - 1 star on GitHub - 1 maintainer
folia 0.0.6
High-performance library for handling the FoLiA XML format (Format for Linguistic Annotation)
6 versions - Latest release: over 5 years ago - 1 dependent package - 1 dependent repository - 9.43 thousand downloads total - 4 stars on GitHub - 1 maintainer
typed-dialogflow 0.1.0
An easy-to-use typed Google Dialogflow client
1 version - Latest release: about 4 years ago - 1.7 thousand downloads total - 0 stars on GitHub - 1 maintainer
texfmt 0.1.0
(La)TeX formatter.
1 version - Latest release: over 3 years ago - 1.55 thousand downloads total - 0 stars on GitHub - 1 maintainer
economic_indicator_finder 0.1.1
A finder for extracting economic indicators from paragraphs
2 versions - Latest release: over 2 years ago - 2.66 thousand downloads total - 1 star on GitHub - 1 maintainer
shiva 1.4.9
Shiva library: Implementation in Rust of a parser and generator for documents of any type
37 versions - Latest release: over 1 year ago - 1 dependent package - 47.5 thousand downloads total - 414 stars on GitHub - 1 maintainer
textprep 0.1.5
Text preprocessing primitives: normalization, tokenization, and fast keyword matching.
6 versions - Latest release: 11 days ago - 5.29 thousand downloads total - 1 maintainer
cyrla 0.1.0
Library for two-way conversion between Latin and Cyrillic scripts
1 version - Latest release: about 3 years ago - 1.65 thousand downloads total - 0 stars on GitHub - 1 maintainer
whichlang 0.1.1
A blazingly fast and lightweight language detection library for Rust.
2 versions - Latest release: about 1 year ago - 2 dependent packages - 1 dependent repository - 225 thousand downloads total - 442 stars on GitHub - 3 maintainers
rawk-cli 0.1.2
The rawk cli, which is an AWK interpreter clone. The goal is to be POSIX compatible.
3 versions - Latest release: 11 days ago - 48 downloads total - 0 stars on GitHub - 1 maintainer
rawk-core 0.6.0
Core library for an AWK interpreter with the goal to be POSIX compatible.
17 versions - Latest release: 11 days ago - 394 downloads total - 0 stars on GitHub - 1 maintainer
autoruby 0.5.1
Easily generate furigana for various document formats
8 versions - Latest release: over 2 years ago - 1 dependent package - 11.2 thousand downloads total - 10 stars on GitHub - 1 maintainer
hangul 0.1.3
Utilities to manipulate Hangul Syllables
4 versions - Latest release: over 6 years ago - 2 dependent packages - 2 dependent repositories - 28.2 thousand downloads total - 10 stars on GitHub - 1 maintainer
strim 0.6.0
Macro to trim whitespace from string literals
6 versions - Latest release: over 1 year ago - 6.62 thousand downloads total - 0 stars on codeberg.org - 1 maintainer
rew 0.3.0
A text processing CLI tool that rewrites FS paths according to a pattern.
3 versions - Latest release: about 5 years ago - 4.73 thousand downloads total - 43 stars on GitHub - 1 maintainer
red-sed 1.0.2
An experimental drop-in replacement for GNU sed, written in Rust
2 versions - Latest release: 3 months ago - 37 downloads total - 12 stars on GitHub - 1 maintainer
booky 0.8.0
A tool to analyze English text
8 versions - Latest release: 7 months ago - 3.74 thousand downloads total - 1 star on GitHub - 1 maintainer
vn-nlp-core 0.1.3
Core types, traits, and errors for vn-nlp
1 version - Latest release: about 1 month ago - 58 downloads total - 1 maintainer