Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "text-processing" keyword

Top 9.8% on pypi.org
tfkit 0.8.20
Transformers kit - Multi-task QA/Tagging/Multi-label Multi-Class Classification/Generation with B...
387 versions - Latest release: about 1 year ago - 1 dependent package - 2 dependent repositories - 1.58 thousand downloads last month - 54 stars on GitHub - 1 maintainer
finglish3 1.4.8
Finglish-to-Persian converter.
1 version - Latest release: almost 6 years ago - 1 dependent repository - 12 downloads last month - 82 stars on GitHub - 1 maintainer
daachorse 0.1.7
A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure
8 versions - Latest release: over 1 year ago - 1 dependent repository - 776 downloads last month - 13 stars on GitHub - 1 maintainer
simager 0.1.2
Simple tools for auto classification and text preprocessing
4 versions - Latest release: 8 months ago - 1 dependent repository - 32 downloads last month - 0 stars on GitHub - 1 maintainer
nlp-preprocessing 0.2.0
A Package for text preprocessing
14 versions - Latest release: almost 4 years ago - 1 dependent repository - 58 downloads last month - 16 stars on GitHub - 1 maintainer
hnlp 0.0.1
Humanly Deeplearning NLP.
2 versions - Latest release: about 4 years ago - 1 dependent repository - 10 downloads last month - 28 stars on GitHub - 1 maintainer
pnlp 0.4.10
A pre/post-processing tool for NLP.
24 versions - Latest release: 5 months ago - 1 dependent package - 2 dependent repositories - 123 downloads last month - 28 stars on GitHub - 1 maintainer
textcheck 0.2.2
Check text files for issues.
2 versions - Latest release: almost 3 years ago - 1 dependent repository - 8 downloads last month - 2 stars on GitHub - 1 maintainer
rb-tocase 1.3.2
RB toCase is a Case converter.
2 versions - Latest release: over 2 years ago - 1 dependent repository - 72 downloads last month - 3 stars on GitHub - 1 maintainer
primetext 0.2.2
Package for indexing text datasets using prime number factorisation for fast word frequency analysis
4 versions - Latest release: over 7 years ago - 1 dependent repository - 11 downloads last month - 4 stars on GitHub - 1 maintainer
finglish 1.5.1
Finglish-to-Persian converter.
22 versions - Latest release: about 4 years ago - 1 dependent repository - 144 downloads last month - 82 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
wordcloud-fa 0.1.10 💰
A wrapper for the wordcloud module for creating Persian (and other RTL language) word clouds.
10 versions - Latest release: over 1 year ago - 6 dependent repositories - 464 downloads last month - 139 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
soyspacing 1.0.17
Spacing Error Correction Tools
15 versions - Latest release: over 4 years ago - 1 dependent package - 5 dependent repositories - 273 downloads last month - 141 stars on GitHub - 1 maintainer
Top 7.8% on pypi.org
humanreadable 0.4.0 💰
humanreadable is a Python library to convert human-readable values to other units.
13 versions - Latest release: 11 months ago - 3 dependent packages - 34 dependent repositories - 38.1 thousand downloads last month - 15 stars on GitHub - 1 maintainer
html5lib-truncation 0.1.0
Truncating HTML with html5lib filter
1 version - Latest release: over 9 years ago - 5 dependent repositories - 267 downloads last month - 11 stars on GitHub - 1 maintainer
python-ucto 0.6.7
This is a Python binding to the tokenizer Ucto. Tokenisation is one of the first steps in almost a...
22 versions - Latest release: 7 months ago - 1 dependent package - 4 dependent repositories - 278 downloads last month - 29 stars on GitHub - 1 maintainer
text-dedup 0.4.0
All-in-one text de-duplication
24 versions - Latest release: about 2 months ago - 2 dependent repositories - 1.35 thousand downloads last month - 521 stars on GitHub - 1 maintainer
wetextprocessing 0.1.12
WeTextProcessing, including TN & ITN
25 versions - Latest release: 3 months ago - 2 dependent packages - 8.7 thousand downloads last month - 384 stars on GitHub - 2 maintainers
Top 1.8% on pypi.org
pymupdfb 1.24.3
MuPDF shared libraries for PyMuPDF.
14 versions - Latest release: about 1 month ago - 4 dependent packages - 133 dependent repositories - 2.13 million downloads last month - 4,025 stars on GitHub - 1 maintainer
Top 1.1% on pypi.org
pyparsing 3.1.2 💰
pyparsing module - Classes and methods to define and execute parsing grammars
71 versions - Latest release: 3 months ago - 1,663 dependent packages - 264,180 dependent repositories - 116 million downloads last month - 2,083 stars on GitHub - 1 maintainer
voxera 0.0.1
An Open-Source Persian Language Techs Toolkit with Python
1 version - Latest release: over 1 year ago - 6 downloads last month - 5 stars on GitHub - 1 maintainer
textform 0.11.0
A text shaping package.
7 versions - Latest release: almost 3 years ago - 1 dependent repository - 25 downloads last month - 7 stars on GitHub - 1 maintainer
ret 0.1.4
A pure-python command-line regular expression tool for stream filtering, extracting, and parsing.
5 versions - Latest release: almost 2 years ago - 7 dependent repositories - 152 downloads last month - 2 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
pymupdf 1.24.4
A high performance Python library for data extraction, analysis, conversion & manipulation of PDF...
116 versions - Latest release: 25 days ago - 206 dependent packages - 1,798 dependent repositories - 3.05 million downloads last month - 4,025 stars on GitHub - 1 maintainer
konfuzio-sdk 0.3.5
Konfuzio Software Development Kit
409 versions - Latest release: 26 days ago - 1 dependent repository - 5.29 thousand downloads last month - 54 stars on GitHub - 1 maintainer
pdf2textbox 0.4.3
A PDF-to-text converter based on pdfminer2
14 versions - Latest release: over 5 years ago - 4 dependent repositories - 65 downloads last month - 5 stars on GitHub - 1 maintainer
padatious-phoenix 0.4.9
A neural network intent parser
1 version - Latest release: about 1 month ago - 206 downloads last month - 1 maintainer
dict-fr-au-dela 2021.9.9
EDITABLE French dictionaries from Laboratoire d'Automatique Documentaire et Linguistique (LADL)
2 versions - Latest release: over 2 years ago - 1 dependent repository - 40 downloads last month - 3 stars on GitHub - 1 maintainer
stringtools 3.0.1
stringtools provides string operations, such as analysing, converting, generating, and validating.
22 versions - Latest release: over 1 year ago - 1 dependent repository - 158 downloads last month - 5 stars on GitHub - 1 maintainer
paxter 0.6.11
Paxter is a document-first, text pre-processing mini-language toolchain, loosely inspired by @-ex...
20 versions - Latest release: almost 4 years ago - 1 dependent repository - 62 downloads last month - 5 stars on GitHub - 1 maintainer
markover 0.7
Natural Language Generator with Markov
21 versions - Latest release: 10 months ago - 1 dependent repository - 37 downloads last month - 27 stars on GitHub - 1 maintainer
trunajod 0.1.1
A python lib for readability analyses.
3 versions - Latest release: about 3 years ago - 1 dependent repository - 65 downloads last month - 29 stars on GitHub - 1 maintainer
dict-fr-wordscapes 2021.9.10
French dictionary of Wordscapes solutions
1 version - Latest release: over 2 years ago - 1 dependent repository - 19 downloads last month - 0 stars on GitHub - 1 maintainer
odin-ai 1.2.5
Deep learning for research and production
6 versions - Latest release: about 4 years ago - 1 dependent repository - 56 downloads last month - 21 stars on GitHub - 1 maintainer
textcl 1.0.0
Text preprocessing package for use in NLP tasks
5 versions - Latest release: about 3 years ago - 1 dependent repository - 18 downloads last month - 10 stars on GitHub - 1 maintainer
block-spinning 1.0.4
A Python module for block spinning
3 versions - Latest release: about 4 years ago - 11 downloads last month - 1 star on GitHub - 1 maintainer
retexto 1.6.1
Fast text processing
30 versions - Latest release: almost 4 years ago - 1 dependent repository - 41 downloads last month - 0 stars on GitHub - 1 maintainer
Top 3.7% on pypi.org
padatious 0.4.8
A neural network intent parser
25 versions - Latest release: about 4 years ago - 3 dependent packages - 47 dependent repositories - 3.77 thousand downloads last month - 158 stars on GitHub - 1 maintainer
clean-text-rhoni 0.1.14
package to clean and normalize text
2 versions - Latest release: 6 months ago - 40 downloads last month - 0 stars on GitHub - 1 maintainer
tregex-tobiasli 1.0.3
Wrapper for more functionality out of regex parse results.
4 versions - Latest release: over 4 years ago - 2 dependent repositories - 25 downloads last month - 0 stars on GitHub - 1 maintainer
Top 5.2% on pypi.org
lingua-franca 0.4.3
Mycroft's multilingual text parsing and formatting library
9 versions - Latest release: almost 2 years ago - 1 dependent package - 24 dependent repositories - 515 downloads last month - 73 stars on GitHub - 1 maintainer
maleo 0.0.5
Wrapper library for text cleansing, preprocessing in NLP
7 versions - Latest release: over 3 years ago - 1 dependent repository - 18 downloads last month - 17 stars on GitHub - 1 maintainer
dhelp 0.0.5
DH Python tools for scraping web pages, pre-processing data, and performing NLP analysis quickly.
4 versions - Latest release: about 6 years ago - 1 dependent repository - 49 downloads last month - 5 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
pyarabic 0.6.15 💰
Arabic text tools for Python
18 versions - Latest release: almost 2 years ago - 7 dependent packages - 34 dependent repositories - 120 thousand downloads last month - 422 stars on GitHub - 1 maintainer
greek-normalisation 0.5.1 💰
Python 3 utilities for validating and normalising Ancient Greek text
6 versions - Latest release: almost 4 years ago - 4 dependent repositories - 60 downloads last month - 21 stars on GitHub - 1 maintainer
Top 9.0% on pypi.org
fuzzychinese 0.1.5
A small package for fuzzy matching of Chinese words (中文模糊匹配, "Chinese fuzzy matching")
3 versions - Latest release: about 5 years ago - 2 dependent repositories - 537 downloads last month - 69 stars on GitHub - 1 maintainer
python-mecab 1.0.1
A repository to bind mecab for Python 3.5+. Not using swig nor pybind. (Not Maintained Now)
4 versions - Latest release: over 4 years ago - 1 dependent repository - 54 downloads last month - 28 stars on GitHub - 1 maintainer
kkltk 1.0
kkltk is a toolkit designed for processing the Kinyarwanda and Kirundi languages
1 version - Latest release: over 3 years ago - 1 dependent repository - 6 downloads last month - 1 star on GitHub - 1 maintainer
cutters 0.1.4
A rule-based sentence segmentation library.
5 versions - Latest release: 11 months ago - 1 dependent repository - 82 downloads last month - 13 stars on GitHub - 1 maintainer
russian-names 0.1.2
Russian names generator
1 version - Latest release: about 5 years ago - 2 dependent repositories - 1.33 thousand downloads last month - 24 stars on GitHub - 1 maintainer
pycrfsuite-spacing 1.0.2
A word-spacing corrector using Pycrfsuite
3 versions - Latest release: about 6 years ago - 3 dependent packages - 2 dependent repositories - 141 downloads last month - 13 stars on GitHub - 1 maintainer
repx 0.0.1
Search and replace in files with regular expressions
1 version - Latest release: about 2 years ago - 1 dependent repository - 5 downloads last month - 0 stars on GitHub - 1 maintainer
contexto 0.2.0
Library for text processing and analysis with Python
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 55 downloads last month - 49 stars on GitHub - 1 maintainer
strkernel 0.2
Collection of string kernels
1 version - Latest release: almost 6 years ago - 1 dependent repository - 146 downloads last month - 17 stars on GitHub - 1 maintainer
blabla 0.2.2
Novoic linguistics feature extraction package.
4 versions - Latest release: almost 4 years ago - 44 downloads last month - 32 stars on GitHub - 1 maintainer
hama 1.0.0
Korean natural language toolkit
1 version - Latest release: about 4 years ago - 1 dependent repository - 71 downloads last month - 19 stars on GitHub - 1 maintainer
softener 0.0.0b0
A CLI tool for generating GitHub workflow annotations
1 version - Latest release: over 2 years ago - 1 dependent repository - 13 downloads last month - 0 stars on GitHub - 1 maintainer
hashedindex 0.10.0
InvertedIndex implementation using hash lists (dictionaries)
18 versions - Latest release: over 3 years ago - 1 dependent package - 7 dependent repositories - 59 downloads last month - 34 stars on GitHub - 1 maintainer
text-validator 0.3 💰
pluggable command-line tool for validating the formatting and orthography of text files
3 versions - Latest release: over 4 years ago - 5 dependent repositories - 34 downloads last month - 5 stars on GitHub - 1 maintainer
gatenlp 1.0.8
GATE NLP implementation in Python.
29 versions - Latest release: over 1 year ago - 2 dependent repositories - 337 downloads last month - 57 stars on GitHub - 3 maintainers
textvec 1.0.1
Supervised text features extraction
3 versions - Latest release: about 6 years ago - 1 dependent repository - 44 downloads last month - 190 stars on GitHub - 1 maintainer
chiecthuyenngoaixa 0.2.0
A utility library for processing Vietnamese text
5 versions - Latest release: 9 months ago - 1 dependent repository - 27 downloads last month - 3 stars on GitHub - 1 maintainer
Top 1.7% on pypi.org
konoha 5.5.6 💰
Add your description here
28 versions - Latest release: 26 days ago - 3 dependent packages - 134 dependent repositories - 84.6 thousand downloads last month - 214 stars on GitHub - 1 maintainer
texturizer 0.1.9
Python command line application to add text features to a CSV or TSV dataset.
8 versions - Latest release: over 2 years ago - 1 dependent repository - 61 downloads last month - 4 stars on GitHub - 1 maintainer
split-markdown4gpt 1.0.9
A Python tool for splitting large Markdown files into smaller sections based on a specified token...
7 versions - Latest release: 12 months ago - 1 dependent repository - 117 downloads last month - 16 stars on GitHub - 1 maintainer
natsulang 1.0.0b11
A text-processing language based on Python 3.
10 versions - Latest release: over 3 years ago - 1 dependent repository - 36 downloads last month - 8 stars on GitHub - 1 maintainer
cat-win 1.7.8
Simple OS Independent 'cat' Command-line Tool made in Python.
67 versions - Latest release: about 1 month ago - 293 downloads last month - 7 stars on GitHub - 1 maintainer
textdatasetcleaner 0.0.6
Pipeline for cleaning (preprocessing/normalizing) text datasets
4 versions - Latest release: over 3 years ago - 1 dependent repository - 31 downloads last month - 38 stars on GitHub - 1 maintainer
textmining3 1.1.0
Text Mining Utilities for Python 3
2 versions - Latest release: over 5 years ago - 1 dependent repository - 70 downloads last month - 1 star on GitHub - 1 maintainer
mime-py 0.3.0
A text processing framework, inspired by Emacs lisp and keyboard macros.
1 version - Latest release: over 1 year ago - 4 downloads last month - 7 stars on GitHub - 1 maintainer
wikiwho 1.0.3
An algorithm to identify authorship and editor interactions in Wiki revisioned content.
1 version - Latest release: about 5 years ago - 1 dependent repository - 58 downloads last month - 29 stars on GitHub - 1 maintainer
embedisualization 0.4
Visualization of text embeddings/vectorization with clustering
4 versions - Latest release: about 6 years ago - 1 dependent repository - 20 downloads last month - 2 stars on GitHub - 1 maintainer
pykotokenizer 0.0.3
Model-based Korean Text Tokenizer in Python
3 versions - Latest release: over 2 years ago - 1 dependent repository - 120 downloads last month - 0 stars on GitHub - 1 maintainer
perke 0.4.4
A keyphrase extractor for Persian
13 versions - Latest release: 12 months ago - 1 dependent repository - 77 downloads last month - 68 stars on GitHub - 1 maintainer
xia-diff-match-patch 0.0.3
Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
3 versions - Latest release: 8 months ago - 1 dependent package - 23 downloads last month - 7,185 stars on GitHub - 1 maintainer
l3wtransformer 0.3.0
A word hashing method based on vectors of letter n-grams. Currently transforms text into sequence...
2 versions - Latest release: over 5 years ago - 1 dependent repository - 11 downloads last month - 10 stars on GitHub - 1 maintainer
dict-fr-dela 2021.8.27
French dictionaries from Laboratoire d'Automatique Documentaire et Linguistique (LADL)
1 version - Latest release: almost 3 years ago - 1 dependent repository - 21 downloads last month - 0 stars on GitHub - 1 maintainer
Top 2.0% on pypi.org
nameparser 1.1.3
A simple Python module for parsing human names into their individual components.
51 versions - Latest release: 9 months ago - 34 dependent packages - 581 dependent repositories - 874 thousand downloads last month - 636 stars on GitHub - 1 maintainer
nostril-detector 1.2.2
Nonsense String Evaluator
4 versions - Latest release: 4 months ago - 8.32 thousand downloads last month - 178 stars on GitHub - 1 maintainer
proces 0.1.7
Text preprocessing.
8 versions - Latest release: 9 months ago - 2 dependent packages - 50 dependent repositories - 88.7 thousand downloads last month - 3 stars on GitHub - 1 maintainer
pytextrust 0.7.5
Library designed as a python wrapper to unleash Rust text processing power combined with Python
26 versions - Latest release: 8 months ago - 1.13 thousand downloads last month - 1 maintainer
Top 2.6% on pypi.org
jaconv 0.3.4 💰
Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, Zenkaku and more
11 versions - Latest release: over 1 year ago - 25 dependent packages - 198 dependent repositories - 1.77 million downloads last month - 289 stars on GitHub - 1 maintainer
turkish-twitter-preprocess 0.0.7
A lightweight Python package to pre-process Turkish Twitter statuses (tweets).
7 versions - Latest release: over 3 years ago - 1 dependent repository - 21 downloads last month - 6 stars on GitHub - 1 maintainer
iambic 3.0.0
Data extraction and rendering library for Shakespearean text.
30 versions - Latest release: 12 months ago - 1 dependent repository - 122 downloads last month - 1 star on GitHub - 1 maintainer
doc2term 0.1
A fast NLP tokenizer that detects tokens and removes duplicates and punctuation
1 version - Latest release: about 3 years ago - 1 dependent repository - 14 downloads last month - 1 star on GitHub - 1 maintainer
nlpiper 0.3.1
NLPiper, a lightweight package integrated with a universe of frameworks to pre-process documents.
5 versions - Latest release: about 2 years ago - 2 dependent repositories - 38 downloads last month - 17 stars on GitHub - 3 maintainers
stam 0.8.0
STAM is a library for dealing with standoff annotations on text
15 versions - Latest release: 27 days ago - 1 dependent package - 1 dependent repository - 1.55 thousand downloads last month - 13 stars on GitHub - 1 maintainer
cinje 1.1.2
A Pythonic and ultra fast template engine DSL.
4 versions - Latest release: over 5 years ago - 4 dependent repositories - 71 downloads last month - 31 stars on GitHub - 1 maintainer
nlpre 2.1.1
Natural Language Preprocessing (NLPre) utilities.
20 versions - Latest release: about 5 years ago - 4 dependent repositories - 70 downloads last month - 186 stars on GitHub - 1 maintainer
pgpc 0.0.4
A Python Generator based Parser Combinator library
4 versions - Latest release: 11 months ago - 29 downloads last month - 5 stars on GitHub - 1 maintainer
easytoken 2.0.2 💰
easytoken is an independent Open Source, Natural Language Processing python library which impleme...
11 versions - Latest release: over 4 years ago - 1 dependent repository - 35 downloads last month - 1 star on GitHub - 1 maintainer
de-workflow 0.2.2
A toolbox for fuzzily extracting drug mentions from text.
12 versions - Latest release: almost 2 years ago - 1 dependent repository - 63 downloads last month - 3 stars on GitHub - 1 maintainer
diff-match-patch-cython 20121119
The Diff Match and Patch libraries offer robust algorithms to perform the operations required for...
1 version - Latest release: over 8 years ago - 2 dependent repositories - 11 downloads last month - 6,733 stars on GitHub - 2 maintainers
pdfautonup 1.9.0
Convert PDF files to 'n-up' PDF files, guessing the output layout.
21 versions - Latest release: 4 months ago - 1 dependent package - 1 dependent repository - 178 downloads last month - 4,025 stars on GitHub - 1 maintainer
aqpymupdf 1.23.7
A high performance Python library for data extraction, analysis, conversion & manipulation of PDF...
1 version - Latest release: about 2 months ago - 39 downloads last month - 4,025 stars on GitHub - 1 maintainer
extract-drugs 1.3.0
A CLI for extracting drugs from text records
6 versions - Latest release: about 2 months ago - 138 downloads last month - 3 stars on GitHub - 1 maintainer
gaspra 0.1.0a3
A fast Python tool for searching, diffing, and merging text
2 versions - Latest release: 5 months ago - 17 downloads last month - 1 star on GitHub - 1 maintainer
docdeid 1.0.0
Create your own document de-identifier using docdeid, a simple framework independent of language ...
25 versions - Latest release: 6 months ago - 1 dependent package - 1 dependent repository - 567 downloads last month - 2 stars on GitHub - 1 maintainer
stramp 0.3.2
Blockchain-backed timestamp proof for structured document sections
3 versions - Latest release: about 4 years ago - 1 dependent repository - 29 downloads last month - 2 stars on GitHub - 1 maintainer
oneai-stage 0.0.1
NLP as a Service
1 version - Latest release: about 2 years ago - 1 dependent repository - 14 downloads last month - 34 stars on GitHub - 1 maintainer