pypi.org "text processing" keyword
View the packages on the pypi.org package registry that are tagged with the "text processing" keyword.
pythainlp 5.1.2
Thai Natural Language Processing library
114 versions - Latest release: 3 months ago - 37 dependent packages - 183 dependent repositories - 658 thousand downloads last month - 1,053 stars on GitHub - 2 maintainers - Top 1.9% on pypi.org
lingualab 3.5.9
A multilingual text and voice processing toolkit
9 versions - Latest release: about 7 hours ago - 390 downloads last month - 1 maintainer
thai2transformers 0.1.2
Pretraining transformer based Thai language models
8 versions - Latest release: over 4 years ago - 1 dependent repository - 56 downloads last month - 114 stars on GitHub - 1 maintainer
arabicscript 0.1.4
Tools for Arabic script
4 versions - Latest release: almost 9 years ago - 1 dependent repository - 12 downloads last month - 8 stars on GitHub - 1 maintainer
wordaxe 1.0.1
Provide hyphenation for python programs and ReportLab paragraphs.
5 versions - Latest release: almost 2 years ago - 1 dependent package - 4 dependent repositories - 1 maintainer - Top 6.1% on pypi.org
auto-mapper 0.1.2
An auto mapper that accepts a list of string and a list of objects of the format {'code', 'name'}...
2 versions - Latest release: about 5 years ago - 1 dependent repository - 84 downloads last month - 1 maintainer
textacy 0.13.0
NLP, before and after spaCy
32 versions - Latest release: over 2 years ago - 18 dependent packages - 436 dependent repositories - 30.5 thousand downloads last month - 2,214 stars on GitHub - 1 maintainer - Top 1.0% on pypi.org
tainlp 0.0.1.dev0
Tai Natural Language Processing library
1 version - Latest release: over 2 years ago - 9 downloads last month - 1 maintainer
freq-frame 1.0.0
1 version - Latest release: over 4 years ago - 1 dependent repository - 3 downloads last month - 1 maintainer
mathstring 0.1.0
English:
1 version - Latest release: about 2 months ago - 17 downloads last month - 1 maintainer
data-filtering 0.1.21
A library to filter and deduplicate Q&A text datasets from CSV files.
4 versions - Latest release: about 1 month ago - 31 downloads last month - 1 maintainer
quantulum3 0.9.2 💰
Extract quantities from unstructured text.
41 versions - Latest release: about 1 year ago - 8 dependent packages - 44 dependent repositories - 134 thousand downloads last month - 142 stars on GitHub - 1 maintainer - Top 3.0% on pypi.org
tex-untag 1.3.0
A script for removing all of a given markup tag from a set of TeX files.
6 versions - Latest release: over 3 years ago - 1 dependent repository - 29 downloads last month - 1 star on GitHub - 1 maintainer
quantulum 0.1.16
Extract quantities from unstructured text.
17 versions - Latest release: almost 2 years ago - 1 dependent package - 4 dependent repositories - 38 downloads last month - 119 stars on GitHub - 1 maintainer
aiwand 0.4.24
A simple AI toolkit for text processing using OpenAI and Gemini APIs
31 versions - Latest release: 3 days ago - 2.13 thousand downloads last month - 0 stars on GitHub - 1 maintainer
thaixtransformers 0.1.0
ThaiXtransformers: Use Pretraining RoBERTa based Thai language models from VISTEC-depa AI Researc...
1 version - Latest release: about 2 years ago - 70 downloads last month - 7 stars on GitHub - 1 maintainer
nlp-helper 0.0.6
A small collection of NLP utility functions.
6 versions - Latest release: about 1 month ago - 93 downloads last month - 1 maintainer
hanpud 0.1.dev0
Han Pud (ห่าน พูด): Thai super large generative model
1 version - Latest release: about 2 years ago - 10 downloads last month - 1 star on GitHub - 1 maintainer
ingredient-slicer 1.2.21
Parses unstructured recipe ingredient text into standardized quantities, units, and foods
47 versions - Latest release: 3 months ago - 335 downloads last month - 1 maintainer
sofairfilter 1.0.0
Tool for identifying candidate documents for software mention extraction.
1 version - Latest release: 3 days ago
minification-station 0.1.4
Designed to process and combine multiple files within a specified directory into a single output ...
5 versions - Latest release: 10 months ago - 11 downloads last month - 0 stars on GitHub - 1 maintainer
huspacy-nightly 0.11.0.dev261 💰
HuSpaCy: industrial strength Hungarian natural language processing
126 versions - Latest release: over 1 year ago - 1 dependent repository - 281 downloads last month - 155 stars on GitHub - 1 maintainer
texterra 1.0.1
API for natural language processing.
2 versions - Latest release: over 7 years ago - 2 dependent repositories - 20 downloads last month - 1 maintainer
processtext 0.1.7
An open-source python package to process text data
10 versions - Latest release: over 1 year ago - 13 downloads last month - 4 stars on GitHub - 1 maintainer
flexi-nlp-tools 0.5.5
NLP toolkit based on the flexi-dict data structure, designed for efficient fuzzy search, with a f...
20 versions - Latest release: 6 months ago - 75 downloads last month - 1 maintainer
ai-data-preprocessing-queue 1.6.0
Can be used to preprocess data before AI processing
5 versions - Latest release: 6 months ago - 338 downloads last month - 1 maintainer
anpe 1.1.3
Accurately extract complete noun phrases with customisation and structural output.
13 versions - Latest release: 2 months ago - 183 downloads last month - 0 stars on GitHub - 1 maintainer
lekcut 0.1
LEKCut (เล็ก คัด) is a Thai tokenization library that ports the deep learning model to the onnx m...
1 version - Latest release: over 2 years ago - 19 downloads last month - 7 stars on GitHub - 1 maintainer
sotastream 1.0.1
Sotastream is a command line tool that augments a batch of text and produces infinite stream of r...
2 versions - Latest release: almost 2 years ago - 18 downloads last month - 20 stars on GitHub - 3 maintainers
spacy-pythainlp 0.1
PyThaiNLP For spaCy
9 versions - Latest release: over 2 years ago - 1 dependent repository - 327 downloads last month - 13 stars on GitHub - 1 maintainer
laonlp 1.2.0 💰
Lao Natural Language Processing library
18 versions - Latest release: about 1 year ago - 1 dependent package - 4 dependent repositories - 1.32 thousand downloads last month - 31 stars on GitHub - 1 maintainer - Top 9.4% on pypi.org
nemo-text-processing 1.1.0
NeMo text processing for ASR and TTS
14 versions - Latest release: 11 months ago - 2 dependent packages - 1 dependent repository - 39.6 thousand downloads last month - 274 stars on GitHub - 1 maintainer - Top 7.9% on pypi.org
sculpt 0.1.35
Sculpt: Structuring unstructured data with LLMs
3 versions - Latest release: 3 months ago - 36 downloads last month - 33 stars on GitHub - 2 maintainers
pewanalytics 1.1.1
Utilities for text processing and statistical analysis from Pew Research Center
5 versions - Latest release: over 3 years ago - 1 dependent repository - 35 downloads last month - 84 stars on GitHub - 1 maintainer
regexa 0.1.1
A modern, full-featured regex library for Python
2 versions - Latest release: 8 months ago - 11 downloads last month - 0 stars on GitHub - 1 maintainer
artless-template 0.6.2
The artless and minimalist templating for Python server-side rendering.
11 versions - Latest release: 7 days ago - 173 downloads last month - 6,288 stars on GitHub - 1 maintainer
sculptor 0.2.0
Sculptor: Structuring unstructured data with LLMs
7 versions - Latest release: 3 months ago - 272 downloads last month - 33 stars on GitHub - 1 maintainer
word2num-converter 1.0.0
A Python library to convert number words (e.g., twenty-one) to numeric digits (e.g., 21) for Engl...
2 versions - Latest release: 2 months ago - 0 stars on GitHub - 1 maintainer
hassans-frame 0.0.0
1 version - Latest release: over 4 years ago - 1 dependent repository - 3 downloads last month - 1 maintainer
multiel 0.5
Multilingual Entity Linking model by BELA model
5 versions - Latest release: about 2 years ago - 1 dependent package - 135 downloads last month - 12 stars on GitHub - 1 maintainer
intertext 0.0.1
tools for relational discourse analysis
1 version - Latest release: over 7 years ago - 1 dependent repository - 9 downloads last month - 1 maintainer
pug 0.1.22
Meta package to install the PDX Python User Group utilities.
11 versions - Latest release: over 10 years ago - 8 dependent repositories - 46 downloads last month - 12 stars on GitHub - 1 maintainer
fixthaipdf 0.2.1 💰
Fix Thai PDF Text
3 versions - Latest release: over 1 year ago - 240 downloads last month - 33 stars on GitHub - 1 maintainer
num2geotext 0.0.1
A Python package for converting numbers and floats (up to 15 digits) into Georgian text,
1 version - Latest release: 10 months ago - 15 downloads last month - 1 maintainer
qante 0.0.5
qante - Query ANnotated TExt
5 versions - Latest release: almost 2 years ago - 10 downloads last month - 5 stars on GitHub - 1 maintainer
doti18n 0.3.0
Python library for loading YAML localizations with dot access and pluralization.
3 versions - Latest release: 9 days ago - 115 downloads last month - 0 stars on GitHub - 1 maintainer
tamilkavi 0.5.0
A command-line tool for exploring Tamil Kavithaigal.
4 versions - Latest release: 3 months ago - 31 downloads last month - 0 stars on GitHub - 1 maintainer
punjabi-stemmer 1.0.1
A Python library for stemming Punjabi language words, including preprocessing for noise removal.
2 versions - Latest release: over 1 year ago - 10 downloads last month - 2 stars on GitHub - 1 maintainer
rebnf 0.9
ReBNF: Regexes for Extended Backus-Naur Form (EBNF)
7 versions - Latest release: about 2 years ago - 14 downloads last month - 0 stars on gitlab.com - 1 maintainer
fast-dedupe 0.1.1
Fast, Minimalist Text Deduplication Library for Python
2 versions - Latest release: 5 months ago - 18 downloads last month - 1 maintainer
jange 0.1.7
Easy NLP library for Python
8 versions - Latest release: almost 4 years ago - 1 dependent repository - 30 downloads last month - 17 stars on GitHub - 1 maintainer
breame 0.1.2
Breame is a lightweight Python package with a number of tools to aid in the detection of words th...
3 versions - Latest release: almost 4 years ago - 1 dependent repository - 2.78 thousand downloads last month - 16 stars on GitHub - 1 maintainer
delb 0.5.1
A library that provides an ergonomic model for XML encoded text documents (e.g. with TEI-XML).
31 versions - Latest release: 7 months ago - 2 dependent packages - 4 dependent repositories - 522 downloads last month - 17 stars on GitHub - 1 maintainer
nupunkt 0.5.1
Next-generation Punkt sentence and paragraph boundary detection with zero dependencies
5 versions - Latest release: 4 months ago - 224 downloads last month - 17 stars on GitHub - 1 maintainer
linesieve 1.0
An unholy blend of grep, sed, awk, and Python.
13 versions - Latest release: over 2 years ago - 1 dependent repository - 108 downloads last month - 9 stars on GitHub - 1 maintainer
pylda2vec 1.0.0
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec
2 versions - Latest release: over 6 years ago - 1 dependent repository - 22 downloads last month - 30 stars on GitHub - 1 maintainer
wakong 1.1.1 💰
Wakong: An appropriate and robust masking algorithm for generating the training objective of text...
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 22 downloads last month - 3 stars on GitHub - 1 maintainer
pyosis 0.2.1
Unofficial Python client for parsing OSIS (Open Scriptural Information Standard) files
4 versions - Latest release: 15 days ago - 40 downloads last month - 0 stars on GitHub - 1 maintainer
docdump 1.0.4
A package to extract text from common document types.
5 versions - Latest release: over 4 years ago - 1 dependent repository - 11 downloads last month - 0 stars on GitHub - 1 maintainer
fastopic 1.0.1
FASTopic
8 versions - Latest release: about 2 months ago - 1.25 thousand downloads last month - 110 stars on GitHub - 1 maintainer
pypage 2.2.1
Light-weight Python Templating Engine
10 versions - Latest release: 3 months ago - 1 dependent package - 1 dependent repository - 421 downloads last month - 30 stars on GitHub - 1 maintainer
charboundary 0.5.0
Fast character-based boundary detection for sentence and paragraphs
13 versions - Latest release: 4 months ago - 90 downloads last month - 3 stars on GitHub - 1 maintainer
gatenlp 1.0.8
GATE NLP implementation in Python.
29 versions - Latest release: over 2 years ago - 2 dependent repositories - 3.43 thousand downloads last month - 66 stars on GitHub - 3 maintainers
wordtonumber 1.1.0
A Python library for converting words to numbers.
4 versions - Latest release: 4 months ago - 212 downloads last month - 1 star on GitHub - 1 maintainer
codextextpipe 0.1.3
All-in-one tool for text processing
2 versions - Latest release: 3 months ago - 21 downloads last month - 0 stars on GitHub - 1 maintainer
pythaisa 0.2.1
Python Thai Sentiment Analysis
2 versions - Latest release: almost 6 years ago - 1 dependent repository - 18 downloads last month - 14 stars on GitHub - 1 maintainer
hadal 0.0.3
Tool for mining/alignment parallel texts
3 versions - Latest release: over 1 year ago - 21 downloads last month - 6 stars on GitHub - 1 maintainer
thaitextaug 0.0.4 💰
Thai Text Augmentation
15 versions - Latest release: about 4 years ago - 1 dependent repository - 52 downloads last month - 5 stars on GitHub - 1 maintainer
khamyo 0.3.0 💰
Thai abbreviation to full text library
4 versions - Latest release: 11 months ago - 1 dependent package - 1 dependent repository - 352 downloads last month - 6 stars on GitHub - 1 maintainer
lttl 2.1.0
LangTech Text Library (LTTL) for text processing and analysis
24 versions - Latest release: 6 months ago - 1 dependent repository - 7.31 thousand downloads last month - 3 stars on GitHub - 1 maintainer
cleantextkit 0.1.1
A preprocessor which performs operations of lowering text, removing special characters and removi...
2 versions - Latest release: almost 2 years ago - 22 downloads last month - 1 maintainer
printb 1.0.2 💰
printb is a wrapper for print/input built-ins, that swaps string directions for BIDI languages.
1 version - Latest release: almost 4 years ago - 1 dependent repository - 11 downloads last month - 0 stars on GitHub - 1 maintainer
geomentions 0.0.1
A mini Python package for geotagging text and retrieving location info.
1 version - Latest release: 5 months ago - 10 downloads last month - 1 star on GitHub - 1 maintainer
ttg 0.1.dev3
Thai Text Generator library
3 versions - Latest release: about 5 years ago - 1 dependent repository - 39 downloads last month - 4 stars on GitHub - 1 maintainer
bibleparser 0.0.2
Parse a mistranscribed dictated bible reference into a standard format
2 versions - Latest release: 9 months ago - 28 downloads last month - 0 stars on GitHub - 1 maintainer
wordcloud-fa 0.1.10 💰
A wrapper for wordcloud module for creating persian (and other rtl languages) word cloud.
10 versions - Latest release: almost 3 years ago - 6 dependent repositories - 144 downloads last month - 145 stars on GitHub - 1 maintainer - Top 9.4% on pypi.org
zalgolib 0.2.2
A Python library for a _FULL_ Zalgo experience
4 versions - Latest release: over 1 year ago - 1 dependent package - 1 dependent repository - 15.1 thousand downloads last month - 5 stars on GitHub - 1 maintainer
gatenlp-ml-tner 0.1.0a1
Train and use transformer token classification models using tner
1 version - Latest release: about 3 years ago - 11 downloads last month - 0 stars on GitHub - 1 maintainer
markdowncleaner 0.2.0
A tool for cleaning and formatting markdown documents
2 versions - Latest release: 5 months ago - 34 downloads last month - 0 stars on GitHub - 1 maintainer
smartscalc 1.0.1
A library to calculate mathematical equations from text input. مكتبة تتيح لك حساب المعادلات الريا...
2 versions - Latest release: about 2 months ago - 228 downloads last month - 0 stars on GitHub - 1 maintainer
nlup 0.8
Core libraries for natural language processing
4 versions - Latest release: over 6 years ago - 11 dependent repositories - 6.03 thousand downloads last month - 10 stars on GitHub - 3 maintainers
thai-nner 0.3
Thai Nested Named Entity Recognition
3 versions - Latest release: about 3 years ago - 1 dependent package - 3 dependent repositories - 333 downloads last month - 46 stars on GitHub - 2 maintainers - Top 9.7% on pypi.org
rdatools 0.1.7
tools for relational discourse analysis
2 versions - Latest release: almost 8 years ago - 1 dependent repository - 14 downloads last month - 1 maintainer
textdatasetcleaner 0.0.6
Pipeline for cleaning (preprocessing/normalizing) text datasets
4 versions - Latest release: over 4 years ago - 1 dependent repository - 17 downloads last month - 40 stars on GitHub - 1 maintainer
easynertag 0.2 💰
Easy tagging for annotate NER corpus
2 versions - Latest release: almost 3 years ago - 24 downloads last month - 2 stars on GitHub - 1 maintainer
freqframe 1.0.0
1 version - Latest release: over 4 years ago - 1 dependent repository - 11 downloads last month - 1 maintainer
thaibraille 0.1.dev2 💰
Thai Braille for Natural Language Processing.
3 versions - Latest release: over 2 years ago - 22 downloads last month - 3 stars on GitHub - 1 maintainer
slaviclean 0.0.6
Text filter designed to cleanse text of profanity and offensive language, specifically tailored f...
6 versions - Latest release: 6 months ago - 19 downloads last month - 1 maintainer
corpuskit 0.1.1
Corpus analysis and processing toolkit
2 versions - Latest release: about 2 months ago - 66 downloads last month - 1 maintainer
addheader 0.3.2
A command to manage a header section for a source code tree
9 versions - Latest release: over 2 years ago - 3 dependent packages - 19 dependent repositories - 12.1 thousand downloads last month - 1 star on GitHub - 1 maintainer - Top 3.0% on pypi.org
token-distance 0.2.5
Python library designed to perform fuzzy token matching within text documents. Utilizing advanced...
6 versions - Latest release: 4 months ago - 37 downloads last month - 0 stars on gitlab.com - 1 maintainer
pythaitts 0.3.0
Open Source Thai Text-to-speech library in Python
5 versions - Latest release: over 1 year ago - 1 dependent repository - 218 downloads last month - 27 stars on GitHub - 1 maintainer
fkscore 2.0.1
Flesch Kincaid readability scoring algorithm
7 versions - Latest release: over 1 year ago - 1 dependent repository - 214 downloads last month - 2 stars on GitHub - 1 maintainer
huspacy 0.12.1 💰
HuSpaCy: industrial strength Hungarian natural language processing
23 versions - Latest release: 9 months ago - 1 dependent package - 6 dependent repositories - 1.35 thousand downloads last month - 142 stars on GitHub - 1 maintainer - Top 6.6% on pypi.org
pyrxg 1.0.0
Regular Expression Generator and Generaliser
1 version - Latest release: 10 months ago - 9 downloads last month - 0 stars on gitlab.com - 1 maintainer
ck-textprocessor 0.0.1 removed
A preprocessor which performs operations of lowering text, removing special characters and removi...
1 version - Latest release: almost 2 years ago - 1 maintainer
Related Keywords
natural language processing (37)
nlp (33)
NLP (21)
text analytics (21)
localization (19)
python (19)
computational linguistics (17)
Thai language (11)
natural-language-processing (9)
ThaiNLP (8)
machine-learning (7)
nlp-library (7)
text (7)
Thai NLP (7)
text-processing (6)
text mining (6)
text-mining (5)
data science (5)
linguistics (5)
text analysis (5)
thai-nlp (5)
thai-language (5)
thai (5)
hacktoberfest (5)
regex (5)
thai-nlp-library (4)
math (4)
sentence boundary detection (4)
information extraction (4)
python3 (4)
spacy (4)
parsing (4)
parser (4)
data analysis (3)
Thai (3)
statistics (3)
llm (3)
named entity recognition (3)
topic-modeling (3)
lemmatization (3)
tagging (3)
information-extraction (3)
language (3)
search (3)
tokenization (3)
language processing (3)
ai (3)
pos-tagger (2)
spacy-models (2)
spacy-pipeline (2)
text preprocessing (2)
universal-dependencies (2)
syntax (2)
text cleaner (2)
named-entity-recognition (2)
morphological-analysis (2)
hunlp (2)
hungarian (2)
dependency-parsing (2)
spacy model (2)
word vectors (2)
word embeddings (2)
ner (2)
pos tagging (2)
sentence splitting (2)
sbd (2)
Hungarian (2)
huspacy (2)
pythainlp (2)
nlp tools (2)
text to structured data (2)
regular expressions (2)
discourse analysis (2)
network analysis (2)
citation analysis (2)
textmining (2)
zotero (2)
bibliometrics (2)
scientometrics (2)
neural net (2)
science (2)
bible (2)
word-embeddings (2)
deep-learning (2)
paragraph detection (2)
xml (2)
clustering (2)
text conversion (2)
text normalization (2)
linguistic tools (2)
python-gatenlp (2)
gatenlp (2)
machine translation (2)
machine learning (2)
deep learning (2)
artificial intelligence (2)
tts (2)
data sculpting (2)
large language model (2)
unstructured data (2)