An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "text processing" keyword

View the packages on the pypi.org package registry that are tagged with the "text processing" keyword.

tainlp 0.0.1.dev0
Tai Natural Language Processing library
1 version - Latest release: about 2 years ago - 43 downloads last month - 1 maintainer
Top 3.0% on pypi.org
addheader 0.3.2
A command to manage a header section for a source code tree
9 versions - Latest release: over 2 years ago - 3 dependent packages - 19 dependent repositories - 16.9 thousand downloads last month - 1 star on GitHub - 1 maintainer
texterra 1.0.1
API for natural language processing.
2 versions - Latest release: over 7 years ago - 2 dependent repositories - 125 downloads last month - 1 maintainer
slaviclean 0.0.6
Text filter designed to cleanse text of profanity and offensive language, specifically tailored f...
6 versions - Latest release: 2 months ago - 265 downloads last month - 1 maintainer
flexi-nlp-tools 0.5.5
NLP toolkit based on the flexi-dict data structure, designed for efficient fuzzy search, with a f...
20 versions - Latest release: 2 months ago - 882 downloads last month - 1 maintainer
Top 1.9% on pypi.org
pythainlp 5.1.1
Thai Natural Language Processing library
113 versions - Latest release: 19 days ago - 37 dependent packages - 183 dependent repositories - 357 thousand downloads last month - 1,026 stars on GitHub - 2 maintainers
hanpud 0.1.dev0
Han Pud (ห่าน พูด): Thai super large generative model
1 version - Latest release: almost 2 years ago - 54 downloads last month - 1 star on GitHub - 1 maintainer
ai-data-preprocessing-queue 1.6.0
Can be used to preprocess data before AI processing
5 versions - Latest release: 2 months ago - 598 downloads last month - 1 maintainer
Top 1.0% on pypi.org
textacy 0.13.0
NLP, before and after spaCy
32 versions - Latest release: about 2 years ago - 18 dependent packages - 436 dependent repositories - 27.8 thousand downloads last month - 2,214 stars on GitHub - 1 maintainer
pyrxg 1.0.0
Regular Expression Generator and Generaliser
1 version - Latest release: 7 months ago - 29 downloads last month - 0 stars on gitlab.com - 1 maintainer
num2geotext 0.0.1
A Python package for converting numbers and floats (up to 15 digits) into Georgian text
1 version - Latest release: 6 months ago - 54 downloads last month - 1 maintainer
freq-frame 1.0.0
1 version - Latest release: almost 4 years ago - 1 dependent repository - 34 downloads last month - 1 maintainer
auto-mapper 0.1.2
An auto mapper that accepts a list of strings and a list of objects of the format {'code', 'name'}...
2 versions - Latest release: almost 5 years ago - 1 dependent repository - 95 downloads last month - 1 maintainer
artless-template 0.5.1
Artless and small template library for server-side rendering.
8 versions - Latest release: 4 months ago - 365 downloads last month - 6,242 stars on GitHub - 1 maintainer
hassans-frame 0.0.0
1 version - Latest release: almost 4 years ago - 1 dependent repository - 31 downloads last month - 1 maintainer
sculpt 0.1.33
Sculpt: Structuring unstructured data with LLMs
1 version - Latest release: 4 days ago - 30 stars on GitHub - 2 maintainers
sculptor 0.1.32
Sculptor: Structuring unstructured data with LLMs
6 versions - Latest release: 3 months ago - 357 downloads last month - 30 stars on GitHub - 1 maintainer
intertext 0.0.1
tools for relational discourse analysis
1 version - Latest release: over 7 years ago - 1 dependent repository - 53 downloads last month - 1 maintainer
qante 0.0.5
qante - Query ANnotated TExt
5 versions - Latest release: over 1 year ago - 116 downloads last month - 5 stars on GitHub - 1 maintainer
pug 0.1.22
Meta package to install the PDX Python User Group utilities.
11 versions - Latest release: about 10 years ago - 8 dependent repositories - 251 downloads last month - 12 stars on GitHub - 1 maintainer
breame 0.1.2
Breame is a lightweight Python package with a number of tools to aid in the detection of words th...
3 versions - Latest release: over 3 years ago - 1 dependent repository - 1.69 thousand downloads last month - 16 stars on GitHub - 1 maintainer
processtext 0.1.7
An open-source Python package to process text data
10 versions - Latest release: about 1 year ago - 239 downloads last month - 4 stars on GitHub - 1 maintainer
rebnf 0.9
ReBNF: Regexes for Extended Backus-Naur Form (EBNF)
7 versions - Latest release: almost 2 years ago - 281 downloads last month - 0 stars on gitlab.com - 1 maintainer
sotastream 1.0.1
Sotastream is a command line tool that augments a batch of text and produces infinite stream of r...
2 versions - Latest release: over 1 year ago - 104 downloads last month - 20 stars on GitHub - 2 maintainers
pewanalytics 1.1.1
Utilities for text processing and statistical analysis from Pew Research Center
5 versions - Latest release: about 3 years ago - 1 dependent repository - 198 downloads last month - 84 stars on GitHub - 1 maintainer
thaixtransformers 0.1.0
ThaiXtransformers: Use Pretraining RoBERTa based Thai language models from VISTEC-depa AI Researc...
1 version - Latest release: almost 2 years ago - 419 downloads last month - 7 stars on GitHub - 1 maintainer
lekcut 0.1
LEKCut (เล็ก คัด) is a Thai tokenization library that ports the deep learning model to the onnx m...
1 version - Latest release: over 2 years ago - 170 downloads last month - 7 stars on GitHub - 1 maintainer
pythaitts 0.3.0
Open Source Thai Text-to-speech library in Python
5 versions - Latest release: about 1 year ago - 1 dependent repository - 366 downloads last month - 27 stars on GitHub - 1 maintainer
thaibraille 0.1.dev2
Thai Braille for Natural Language Processing.
3 versions - Latest release: about 2 years ago - 152 downloads last month - 3 stars on GitHub - 1 maintainer
khamyo 0.3.0 💰
Thai abbreviation to full text library
4 versions - Latest release: 8 months ago - 1 dependent package - 1 dependent repository - 275 downloads last month - 6 stars on GitHub - 1 maintainer
Top 7.9% on pypi.org
nemo-text-processing 1.1.0
NeMo text processing for ASR and TTS
14 versions - Latest release: 8 months ago - 2 dependent packages - 1 dependent repository - 54.5 thousand downloads last month - 274 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
quantulum3 0.9.2 💰
Extract quantities from unstructured text.
41 versions - Latest release: 10 months ago - 8 dependent packages - 44 dependent repositories - 118 thousand downloads last month - 137 stars on GitHub - 1 maintainer
fast-dedupe 0.1.1
Fast, Minimalist Text Deduplication Library for Python
2 versions - Latest release: about 1 month ago - 319 downloads last month - 1 maintainer
multiel 0.5
Multilingual Entity Linking model by BELA model
5 versions - Latest release: almost 2 years ago - 1 dependent package - 294 downloads last month - 11 stars on GitHub - 1 maintainer
spacy-pythainlp 0.1
PyThaiNLP For spaCy
9 versions - Latest release: over 2 years ago - 1 dependent repository - 783 downloads last month - 13 stars on GitHub - 1 maintainer
wakong 1.1.1 💰
Wakong: An appropriate and robust masking algorithm for generating the training objective of text...
3 versions - Latest release: over 2 years ago - 1 dependent repository - 152 downloads last month - 3 stars on GitHub - 1 maintainer
charboundary 0.5.0
Fast character-based boundary detection for sentences and paragraphs
13 versions - Latest release: 13 days ago - 1.12 thousand downloads last month - 1 star on GitHub - 1 maintainer
delb 0.5.1
A library that provides an ergonomic model for XML encoded text documents (e.g. with TEI-XML).
30 versions - Latest release: 3 months ago - 2 dependent packages - 4 dependent repositories - 1.32 thousand downloads last month - 16 stars on GitHub - 1 maintainer
pylda2vec 1.0.0
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec
2 versions - Latest release: about 6 years ago - 1 dependent repository - 84 downloads last month - 29 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
laonlp 1.2.0 💰
Lao Natural Language Processing library
18 versions - Latest release: 11 months ago - 1 dependent package - 4 dependent repositories - 1.45 thousand downloads last month - 31 stars on GitHub - 1 maintainer
jange 0.1.7
Easy NLP library for Python
8 versions - Latest release: over 3 years ago - 1 dependent repository - 340 downloads last month - 17 stars on GitHub - 1 maintainer
huspacy-nightly 0.11.0.dev261 💰
HuSpaCy: industrial strength Hungarian natural language processing
126 versions - Latest release: over 1 year ago - 1 dependent repository - 2.25 thousand downloads last month - 155 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
huspacy 0.12.1 💰
HuSpaCy: industrial strength Hungarian natural language processing
23 versions - Latest release: 6 months ago - 1 dependent package - 6 dependent repositories - 2.19 thousand downloads last month - 142 stars on GitHub - 1 maintainer
pyosis 0.1.1
Unofficial Python client for parsing OSIS (Open Scriptural Information Standard) files
2 versions - Latest release: 3 months ago - 86 downloads last month - 0 stars on GitHub - 1 maintainer
gatenlp 1.0.8
GATE NLP implementation in Python.
29 versions - Latest release: over 2 years ago - 2 dependent repositories - 942 downloads last month - 63 stars on GitHub - 3 maintainers
punjabi-stemmer 1.0.1
A Python library for stemming Punjabi language words, including preprocessing for noise removal.
2 versions - Latest release: about 1 year ago - 90 downloads last month - 2 stars on GitHub - 1 maintainer
linesieve 1.0
An unholy blend of grep, sed, awk, and Python.
13 versions - Latest release: about 2 years ago - 1 dependent repository - 410 downloads last month - 8 stars on GitHub - 1 maintainer
pythaisa 0.2.1
Python Thai Sentiment Analysis
2 versions - Latest release: over 5 years ago - 1 dependent repository - 43 downloads last month - 14 stars on GitHub - 1 maintainer
ingredient-slicer 1.2.2
Parses unstructured recipe ingredient text into standardized quantities, units, and foods
46 versions - Latest release: 13 days ago - 958 downloads last month - 1 maintainer
fixthaipdf 0.2.1 💰
Fix Thai PDF Text
3 versions - Latest release: over 1 year ago - 28 thousand downloads last month - 33 stars on GitHub - 1 maintainer
nupunkt 0.5.1
Next-generation Punkt sentence and paragraph boundary detection with zero dependencies
5 versions - Latest release: 13 days ago - 318 downloads last month - 0 stars on GitHub - 1 maintainer
easynertag 0.2 💰
Easy tagging for annotating an NER corpus
2 versions - Latest release: over 2 years ago - 98 downloads last month - 2 stars on GitHub - 1 maintainer
wordtonumber 1.0.2
A Python library for converting words to numbers.
3 versions - Latest release: 14 days ago - 0 stars on GitHub - 1 maintainer
hadal 0.0.3
Tool for mining and aligning parallel texts
3 versions - Latest release: over 1 year ago - 116 downloads last month - 5 stars on GitHub - 1 maintainer
docdump 1.0.4
A package to extract text from common document types.
5 versions - Latest release: over 4 years ago - 1 dependent repository - 119 downloads last month - 0 stars on GitHub - 1 maintainer
markdowncleaner 0.2.0
A tool for cleaning and formatting markdown documents
2 versions - Latest release: about 2 months ago - 371 downloads last month - 0 stars on GitHub - 1 maintainer
lttl 2.1.0
LangTech Text Library (LTTL) for text processing and analysis
24 versions - Latest release: 3 months ago - 1 dependent repository - 4.93 thousand downloads last month - 3 stars on GitHub - 1 maintainer
thaitextaug 0.0.4 💰
Thai Text Augmentation
15 versions - Latest release: almost 4 years ago - 1 dependent repository - 651 downloads last month - 5 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
thai-nner 0.3
Thai Nested Named Entity Recognition
3 versions - Latest release: almost 3 years ago - 1 dependent package - 3 dependent repositories - 183 downloads last month - 45 stars on GitHub - 2 maintainers
gatenlp-ml-tner 0.1.0a1
Train and use transformer token classification models using tner
1 version - Latest release: almost 3 years ago - 56 downloads last month - 0 stars on GitHub - 1 maintainer
regexa 0.1.1
A modern, full-featured regex library for Python
2 versions - Latest release: 5 months ago - 62 downloads last month - 0 stars on GitHub - 1 maintainer
anpe 0.1.0
Another Noun Phrase Extractor using the Berkeley Neural Parser
1 version - Latest release: 17 days ago - 1 maintainer
minification-station 0.1.4
Designed to process and combine multiple files within a specified directory into a single output ...
5 versions - Latest release: 7 months ago - 177 downloads last month - 0 stars on GitHub - 1 maintainer
bibleparser 0.0.2
Parse a mistranscribed dictated bible reference into a standard format
2 versions - Latest release: 6 months ago - 75 downloads last month - 0 stars on GitHub - 1 maintainer
geomentions 0.0.1
A mini Python package for geotagging text and retrieving location info.
1 version - Latest release: about 2 months ago - 209 downloads last month - 0 stars on GitHub - 1 maintainer
textdatasetcleaner 0.0.6
Pipeline for cleaning (preprocessing/normalizing) text datasets
4 versions - Latest release: about 4 years ago - 1 dependent repository - 207 downloads last month - 39 stars on GitHub - 1 maintainer
freqframe 1.0.0
1 version - Latest release: almost 4 years ago - 1 dependent repository - 36 downloads last month - 1 maintainer
nlup 0.8
Core libraries for natural language processing
4 versions - Latest release: over 6 years ago - 11 dependent repositories - 2.7 thousand downloads last month - 10 stars on GitHub - 3 maintainers
printb 1.0.2 💰
printb is a wrapper for print/input built-ins, that swaps string directions for BIDI languages.
1 version - Latest release: over 3 years ago - 1 dependent repository - 72 downloads last month - 0 stars on GitHub - 1 maintainer
cleantextkit 0.1.1
A preprocessor which performs operations of lowering text, removing special characters and removi...
2 versions - Latest release: over 1 year ago - 83 downloads last month - 1 maintainer
zalgolib 0.2.2
A Python library for a _FULL_ Zalgo experience
4 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 12.9 thousand downloads last month - 5 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
wordcloud-fa 0.1.10 💰
A wrapper for the wordcloud module for creating Persian (and other RTL language) word clouds.
10 versions - Latest release: over 2 years ago - 6 dependent repositories - 524 downloads last month - 144 stars on GitHub - 1 maintainer
Top 6.1% on pypi.org
wordaxe 1.0.1
Provide hyphenation for Python programs and ReportLab paragraphs.
5 versions - Latest release: over 1 year ago - 1 dependent package - 4 dependent repositories - 1 maintainer
rdatools 0.1.7
tools for relational discourse analysis
2 versions - Latest release: over 7 years ago - 1 dependent repository - 53 downloads last month - 1 maintainer
token-distance 0.2.5
Python library designed to perform fuzzy token matching within text documents. Utilizing advanced...
6 versions - Latest release: about 1 month ago - 208 downloads last month - 0 stars on gitlab.com - 1 maintainer
pypage 2.1.0
Light-weight Python Templating Engine
8 versions - Latest release: 9 months ago - 1 dependent package - 1 dependent repository - 228 downloads last month - 31 stars on GitHub - 1 maintainer
thai2transformers 0.1.2
Pretraining transformer based Thai language models
8 versions - Latest release: about 4 years ago - 1 dependent repository - 889 downloads last month - 114 stars on GitHub - 1 maintainer
arabicscript 0.1.4
Tools for Arabic script
4 versions - Latest release: over 8 years ago - 1 dependent repository - 90 downloads last month - 8 stars on GitHub - 1 maintainer
tex-untag 1.3.0
A script for removing all of a given markup tag from a set of TeX files.
6 versions - Latest release: over 3 years ago - 1 dependent repository - 248 downloads last month - 1 star on GitHub - 1 maintainer
fkscore 2.0.1
Flesch Kincaid readability scoring algorithm
7 versions - Latest release: about 1 year ago - 1 dependent repository - 1.08 thousand downloads last month - 2 stars on GitHub - 1 maintainer
quantulum 0.1.16
Extract quantities from unstructured text.
17 versions - Latest release: over 1 year ago - 1 dependent package - 4 dependent repositories - 283 downloads last month - 120 stars on GitHub - 1 maintainer
ttg 0.1.dev3
Thai Text Generator library
3 versions - Latest release: almost 5 years ago - 1 dependent repository - 148 downloads last month - 4 stars on GitHub - 1 maintainer
ck-textprocessor 0.0.1 removed
A preprocessor which performs operations of lowering text, removing special characters and removi...
1 version - Latest release: over 1 year ago - 1 maintainer
Related Keywords
natural language processing (35), nlp (29), text analytics (21), NLP (19), localization (18), computational linguistics (17), python (17), Thai language (11), natural-language-processing (9), ThaiNLP (8), text (7), machine-learning (7), nlp-library (7), Thai NLP (7), text mining (6), text-processing (6), hacktoberfest (5), thai (5), thai-language (5), thai-nlp (5), linguistics (5), regex (5), data science (5), text-mining (5), parsing (4), python3 (4), sentence boundary detection (4), spacy (4), parser (4), information extraction (4), thai-nlp-library (4), statistics (3), tagging (3), language (3), lemmatization (3), named entity recognition (3), tokenization (3), Thai (3), information-extraction (3), search (3), data analysis (3), deep-learning (2), machine learning (2), machine translation (2), text preprocessing (2), text cleaner (2), units (2), topic-modeling (2), deep learning (2), text conversion (2), math (2), science (2), neural net (2), xml (2), cli (2), scientometrics (2), bibliometrics (2), zotero (2), textmining (2), citation analysis (2), network analysis (2), discourse analysis (2), units-of-measure (2), text to structured data (2), pipeline (2), data transformation (2), data extraction (2), structured data (2), unstructured data (2), large language model (2), llm (2), data sculpting (2), measurements (2), word vectors (2), spacy model (2), dependency-parsing (2), hungarian (2), hunlp (2), morphological-analysis (2), named-entity-recognition (2), pos-tagger (2), spacy-models (2), spacy-pipeline (2), universal-dependencies (2), paragraph detection (2), linguistic tools (2), nlp tools (2), bible (2), gatenlp (2), thainlp (2), syntax (2), tts (2), artificial intelligence (2), quantities (2), text analysis (2), python-gatenlp (2), huspacy (2), regular expressions (2), clustering (2), Hungarian (2)
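
Each package entry in the listing ends with a summary line whose fields are separated by " - ". The following is a minimal sketch, assuming that layout holds for every entry, of turning one such line into a dictionary. The function names `parse_count` and `parse_stats` and the dictionary keys are hypothetical labels chosen for this illustration; they are not part of any published schema or API.

```python
import re

# Assumption: "thousand" is the only magnitude word handled here,
# because it is the only one that appears in the listing.
_MULTIPLIERS = {"thousand": 1_000}


def parse_count(text):
    """Convert a count such as '1,026' or '16.9 thousand' to an int."""
    parts = text.replace(",", "").split()
    value = float(parts[0])
    if len(parts) > 1:
        value *= _MULTIPLIERS[parts[1]]
    return int(round(value))


def parse_stats(line):
    """Split a ' - ' separated summary line into a dict of fields.

    Numeric fields are keyed by their label exactly as written, so a
    singular label ('version') and a plural one ('versions') produce
    different keys.
    """
    stats = {}
    for field in line.split(" - "):
        field = field.strip()
        if field.startswith("Latest release:"):
            # Relative dates are kept as text, e.g. '19 days ago'.
            stats["latest_release"] = field.split(":", 1)[1].strip()
            continue
        match = re.match(r"^([\d.,]+(?: thousand)?) (.+)$", field)
        if match:
            stats[match.group(2)] = parse_count(match.group(1))
    return stats


if __name__ == "__main__":
    example = (
        "113 versions - Latest release: 19 days ago - 37 dependent packages"
        " - 183 dependent repositories - 357 thousand downloads last month"
        " - 1,026 stars on GitHub - 2 maintainers"
    )
    print(parse_stats(example))
```

Fields that do not start with a number, other than the release date, are skipped rather than guessed at.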