An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "text mining" keyword

View the packages on the pypi.org package registry that are tagged with the "text mining" keyword.

grobid-quantities-client 0.4.0
A minimal client for grobid-quantities service.
5 versions - Latest release: over 2 years ago - 2 dependent packages - 2 dependent repositories - 182 downloads last month - 1 maintainer
lingualytics 0.1.3
A multilingual text analytics package.
4 versions - Latest release: over 4 years ago - 3 dependent repositories - 174 downloads last month - 37 stars on GitHub - 1 maintainer
orange-textable-prototypes 0.1.6
Extra widgets for the Textable text analysis package.
6 versions - Latest release: over 8 years ago - 1 dependent repository - 355 downloads last month - 1 star on GitHub - 1 maintainer
Top 0.5% on pypi.org
pdfminer.six 20250416
PDF parser and analyzer
30 versions - Latest release: 3 days ago - 162 dependent packages - 2,496 dependent repositories - 7.75 million downloads last month - 5,784 stars on GitHub - 3 maintainers
bent 0.0.80
BENT: Biomedical Entity Annotator
60 versions - Latest release: 7 months ago - 1 dependent repository - 1.53 thousand downloads last month - 9 stars on GitHub - 1 maintainer
e.pdfminer.six 0.0.1
PDF parser and analyzer
1 version - Latest release: over 5 years ago - 87 downloads last month - 6,370 stars on GitHub - 1 maintainer
20220429-pdfminer-jameslp310 0.0.2
PDF parser and analyzer
1 version - Latest release: almost 3 years ago - 1 dependent repository - 125 downloads last month - 6,330 stars on GitHub - 1 maintainer
pdfminer.six.reducto 0.0.1
PDF parser and analyzer
1 version - Latest release: 5 months ago - 256 downloads last month - 6,370 stars on GitHub - 1 maintainer
pdfminer.bc 0.0.2
PDF parser and analyzer
2 versions - Latest release: 5 months ago - 109 downloads last month - 6,370 stars on GitHub - 1 maintainer
Top 0.7% on pypi.org
pdfminer 20131022
PDF parser and analyzer
41 versions - Latest release: over 1 year ago - 46 dependent packages - 1,423 dependent repositories - 200 thousand downloads last month - 5,244 stars on GitHub - 1 maintainer
easylda 0.2.7
Easily build LDA topic models with just a list of docs (e.g. a list of Twitter posts in CSV/TXT)
48 versions - Latest release: about 7 years ago - 1 dependent repository - 2.82 thousand downloads last month - 3 stars on GitHub - 1 maintainer
bibx 0.6.3
Python bibliometric tools.
21 versions - Latest release: about 1 month ago - 1 dependent repository - 1.21 thousand downloads last month - 10 stars on GitHub - 1 maintainer
textmining-module 2.1.2
A Python Module for Comprehensive Text Mining, including Keyword Extraction and Text Analysis.
7 versions - Latest release: 4 months ago - 182 downloads last month - 0 stars on GitHub - 1 maintainer
slang 0.1.12
A structural approach to signal ML
20 versions - Latest release: over 1 year ago - 2 dependent repositories - 661 downloads last month - 5 stars on GitHub - 1 maintainer
utilss 0.1.7
Useful tools to work with text mining in Python
6 versions - Latest release: almost 5 years ago - 1 dependent repository - 231 downloads last month - 0 stars on GitHub - 1 maintainer
pdfmajor 1.3.13
PDF parser
36 versions - Latest release: over 5 years ago - 1 dependent repository - 1.28 thousand downloads last month - 22 stars on GitHub - 1 maintainer
textana4sc 0.4
Text analysis library supporting word frequency counts, dictionary expansion, sentiment analysis, and more.
4 versions - Latest release: about 2 years ago - 123 downloads last month - 1 maintainer
textpredict 0.1.3
TextPredict is a powerful Python package designed for various text analysis and prediction tasks ...
4 versions - Latest release: 9 months ago - 192 downloads last month - 2 stars on GitHub - 1 maintainer
autocorpus 1.1.0
A tool to standardise text and table data extracted from full text publications.
1 version - Latest release: 3 months ago - 60 downloads last month - 21 stars on GitHub - 1 maintainer
decat 1.0.3
De-concatenate strings that do not have white-spaces.
4 versions - Latest release: almost 2 years ago - 1 dependent repository - 174 downloads last month - 0 stars on GitHub - 1 maintainer
textdescriptives 2.8.4
A library for calculating a variety of features from text using spaCy
38 versions - Latest release: 4 months ago - 4 dependent packages - 1 dependent repository - 4.5 thousand downloads last month - 306 stars on GitHub - 2 maintainers
jgtextrank 0.1.6
Yet another Python implementation of TextRank: package for the creation, manipulation, and study ...
5 versions - Latest release: over 5 years ago - 1 dependent repository - 190 downloads last month - 13 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
shorttext 2.1.0
Short Text Mining
77 versions - Latest release: 4 months ago - 6 dependent repositories - 1.9 thousand downloads last month - 470 stars on GitHub - 1 maintainer
playa-pdf 0.4.1 💰
Parallel and LazY Analyzer for PDFs
18 versions - Latest release: 30 days ago - 3.36 thousand downloads last month - 26 stars on GitHub - 1 maintainer
yapdfminer 1.2.2
PDF parser and analyzer
8 versions - Latest release: over 5 years ago - 1 dependent repository - 329 downloads last month - 2 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
quantulum3 0.9.2 💰
Extract quantities from unstructured text.
41 versions - Latest release: 10 months ago - 8 dependent packages - 44 dependent repositories - 118 thousand downloads last month - 137 stars on GitHub - 1 maintainer
hades-nlp 0.1.2
Homologous Automated Document Exploration and Summarization - A powerful tool for comparing simil...
3 versions - Latest release: over 1 year ago - 124 downloads last month - 8 stars on GitHub - 2 maintainers
sentimentpredictor 0.1.3
A flexible sentiment analysis predictor package supporting multiple pre-trained models, customiza...
4 versions - Latest release: 10 months ago - 146 downloads last month - 2 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
pyxpdf 0.2.3
Powerful and Pythonic PDF processing library based on xpdf-4.02
6 versions - Latest release: over 4 years ago - 3 dependent repositories - 2.01 thousand downloads last month - 41 stars on GitHub - 1 maintainer
seesus 1.2.1
a social, environmental, and economic sustainability classifier based on the UN Sustainable Devel...
4 versions - Latest release: about 1 year ago - 180 downloads last month - 8 stars on GitHub - 1 maintainer
Top 9.5% on pypi.org
augmenty 1.4.4
An augmentation library based on spaCy for joint augmentation of text and labels.
33 versions - Latest release: about 1 year ago - 4 dependent packages - 1 dependent repository - 4.55 thousand downloads last month - 153 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
texthero 1.1.0
Text preprocessing, representation and visualization from zero to hero.
10 versions - Latest release: almost 4 years ago - 1 dependent package - 29 dependent repositories - 2.35 thousand downloads last month - 2,905 stars on GitHub - 1 maintainer
textherox 1.2.0
Text preprocessing, representation and visualization from zero to hero.
1 version - Latest release: over 2 years ago - 52 downloads last month - 2,905 stars on GitHub - 1 maintainer
pdfminer.rtl 1.0.1
PDF parser and analyzer
8 versions - Latest release: about 1 year ago - 254 downloads last month - 1 maintainer
seededpf 0.1.0
SeededPF is a seed guided topic model based on Poisson factorization.
3 versions - Latest release: about 1 month ago - 248 downloads last month - 0 stars on GitHub - 1 maintainer
multistop 1.3
Stopword lists for text analysis, supporting 15 languages including Chinese, English, German, and French.
1 version - Latest release: almost 3 years ago - 1 dependent repository - 30 downloads last month - 1 maintainer
storynavigator 0.1.1
Narrative analysis add-on for the Orange 3 data mining software package.
31 versions - Latest release: 3 months ago - 712 downloads last month - 3 stars on GitHub - 2 maintainers
leia-br 0.0.1
LeIA (Léxico para Inferência Adaptada) is a fork of the lexicon and tool for sentiment analysis ...
1 version - Latest release: over 2 years ago - 1 dependent repository - 616 downloads last month - 3 stars on GitHub - 1 maintainer
keypartx 0.1.20
A Graph-based Perception(Text) Representation
40 versions - Latest release: almost 2 years ago - 1.17 thousand downloads last month - 33 stars on GitHub - 1 maintainer
swinger 2.1
A sentiment classifier for Chinese
12 versions - Latest release: almost 8 years ago - 1 dependent repository - 221 downloads last month - 36 stars on GitHub - 1 maintainer
pdfminer.six-i 20190823
PDF parser and analyzer
5 versions - Latest release: over 5 years ago - 1 dependent repository - 1.23 thousand downloads last month - 2 maintainers
huspacy-nightly 0.11.0.dev261 💰
HuSpaCy: industrial strength Hungarian natural language processing
126 versions - Latest release: over 1 year ago - 1 dependent repository - 2.25 thousand downloads last month - 155 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
huspacy 0.12.1 💰
HuSpaCy: industrial strength Hungarian natural language processing
23 versions - Latest release: 6 months ago - 1 dependent package - 6 dependent repositories - 2.19 thousand downloads last month - 142 stars on GitHub - 1 maintainer
Top 5.9% on pypi.org
pdfminer2 20151206 💰
PDF parser and analyzer
1 version - Latest release: over 9 years ago - 3 dependent packages - 37 dependent repositories - 53.2 thousand downloads last month - 24 stars on GitHub - 1 maintainer
orange-text 1.2a1
Orange Text Mining add-on for Orange data mining software package.
1 version - Latest release: almost 13 years ago - 1 dependent repository - 42 downloads last month - 2 maintainers
nlpbaselines 0.0.49
Quickly establish strong baselines for NLP tasks
27 versions - Latest release: over 1 year ago - 1 dependent repository - 546 downloads last month - 0 stars on GitHub - 1 maintainer
chemdataextractor-api 0.0.1
Chemdataextractor REST API wrapper
2 versions - Latest release: about 4 years ago - 1 dependent repository - 59 downloads last month - 0 stars on GitHub - 1 maintainer
pdfdocx 1.7
Reads PDF and DOCX files and returns the text they contain.
8 versions - Latest release: over 1 year ago - 1 dependent repository - 505 downloads last month - 4 stars on GitHub - 1 maintainer
lttl 2.1.0
LangTech Text Library (LTTL) for text processing and analysis
24 versions - Latest release: 3 months ago - 1 dependent repository - 4.93 thousand downloads last month - 3 stars on GitHub - 1 maintainer
spacy-wrap 1.4.5
Wrappers for including pre-trained transformers in spaCy pipelines
21 versions - Latest release: over 1 year ago - 3 dependent packages - 1 dependent repository - 1.98 thousand downloads last month - 46 stars on GitHub - 1 maintainer
textanalyze4sc 2.0
Text analysis library supporting word frequency counts, dictionary expansion, sentiment analysis, and more.
7 versions - Latest release: about 2 years ago - 38 downloads last month - 1 maintainer
pdf-wrangler 0.0.31
PDFMiner Wrapper for extractions
9 versions - Latest release: over 3 years ago - 1 dependent repository - 325 downloads last month - 1 star on GitHub - 1 maintainer
mledu 0.0.2
Build machine learning models for educational purposes
4 versions - Latest release: about 7 years ago - 1 dependent repository - 248 downloads last month - 1 maintainer
awessome 0.0.14
awessome
6 versions - Latest release: over 4 years ago - 1 dependent repository - 192 downloads last month - 2 stars on GitHub - 1 maintainer
vader-sentiment 3.2.1
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
2 versions - Latest release: over 6 years ago - 5 dependent repositories - 606 downloads last month - 1 star on GitHub - 1 maintainer
lexis 0.1.2
Wordnet wrapper - Easy access to words and their relationships
5 versions - Latest release: about 2 years ago - 2 dependent packages - 1 dependent repository - 258 downloads last month - 1 star on GitHub - 1 maintainer
gorpy 2.0.4
Grep tool with extensions for reading files in many different ways
9 versions - Latest release: over 1 year ago - 1 dependent repository - 243 downloads last month - 0 stars on GitHub - 1 maintainer
pdfminer-cython 20200304
PDF parser and analyzer
1 version - Latest release: almost 5 years ago - 1 dependent repository - 40 downloads last month - 5,143 stars on GitHub - 1 maintainer
pdfminer-with-logger 1.0.0
PDF parser and analyzer
1 version - Latest release: over 4 years ago - 1 dependent repository - 64 downloads last month - 5,281 stars on GitHub - 1 maintainer
vader-multi 3.2.2 💰
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
2 versions - Latest release: over 5 years ago - 2 dependent repositories - 1.13 thousand downloads last month - 20 stars on GitHub - 1 maintainer
grub 0.1.3
A ridiculously simple search engine factory
17 versions - Latest release: almost 3 years ago - 4 dependent repositories - 774 downloads last month - 3 stars on GitHub - 1 maintainer
textdatasetcleaner 0.0.6
Pipeline for cleaning (preprocessing/normalizing) text datasets
4 versions - Latest release: about 4 years ago - 1 dependent repository - 207 downloads last month - 39 stars on GitHub - 1 maintainer
seedot 3.0
SEEDOT: Tool for Enhancing Sentiment Lexicon with Machine Learning
2 versions - Latest release: almost 2 years ago - 92 downloads last month - 2 stars on GitHub - 1 maintainer
vadernew 2.0
Package containing 4 dictionaries that serve as improvements to Vader for specific topics, ...
9 versions - Latest release: almost 3 years ago - 1 dependent repository - 160 downloads last month - 0 stars on GitHub - 1 maintainer
blauwal3-textable 3.1.11
Text analysis add-on for the Blue Whale (蓝鲸) data mining software package.
1 version - Latest release: 7 months ago - 76 downloads last month - 28 stars on GitHub - 1 maintainer
orange3-textable-prototypes 0.31
Additional widgets for the Textable add-on to Orange 3.
26 versions - Latest release: about 1 month ago - 1 dependent repository - 422 downloads last month - 6 stars on GitHub - 1 maintainer
orange3-textable 3.2.3
Textable add-on for Orange 3 data mining software package.
33 versions - Latest release: about 1 month ago - 1 dependent repository - 3.17 thousand downloads last month - 27 stars on GitHub - 1 maintainer
pdfminer.hitalent 20221118
PDF parser and analyzer
20 versions - Latest release: over 2 years ago - 366 downloads last month - 1 maintainer
fummytransformers 0.0.18
Fast and dummy way of using transformers to establish quick baselines
17 versions - Latest release: over 2 years ago - 400 downloads last month - 0 stars on GitHub - 1 maintainer
statcamp 0.0.2
stat funcs from datacamp stat thinking
2 versions - Latest release: about 7 years ago - 1 dependent repository - 112 downloads last month - 0 stars on GitHub - 1 maintainer
orange-textable 2.0.1
Textable add-on for Orange 2.7 data mining software package.
18 versions - Latest release: about 8 years ago - 1 dependent repository - 515 downloads last month - 1 maintainer
contextmining 0.0.6
Complementing topic models with few-shot in-context learning to generate interpretable topics
6 versions - Latest release: 3 months ago - 279 downloads last month - 0 stars on GitHub - 1 maintainer
dsutils 0.2.0
data science utils for data preprocessing for feeding various models, pipelining, time data forma...
20 versions - Latest release: about 7 years ago - 1 dependent repository - 825 downloads last month - 0 stars on GitHub - 1 maintainer
grogu1 1.1
Package containing 4 dictionaries that serve as improvements to Vader for specific topics, ...
1 version - Latest release: almost 2 years ago - 75 downloads last month - 0 stars on GitHub - 1 maintainer
bagofconcepts 0.1.0
This is a Python implementation of Bag-of-Concepts, as proposed by the paper "Bag-of-Concepts: Comp...
2 versions - Latest release: almost 3 years ago - 91 downloads last month - 20 stars on GitHub - 1 maintainer
pdfminer.aemc 20231229
PDF parser and analyzer
9 versions - Latest release: 12 months ago - 1 dependent package - 233 downloads last month - 1 maintainer
pybursts 0.1.1
A Python port of the 'burst detection' algorithm by Kleinberg, originally implemented in R
2 versions - Latest release: over 10 years ago - 4 dependent repositories - 55 downloads last month - 1 maintainer
Top 6.4% on pypi.org
metapy 0.2.13
Python bindings for MeTA
24 versions - Latest release: over 6 years ago - 30 dependent repositories - 1.31 thousand downloads last month - 50 stars on GitHub - 2 maintainers
material-parser 1.2
Grobid superconductors tools material parser
2 versions - Latest release: about 2 years ago - 54 downloads last month - 11 stars on GitHub - 1 maintainer
frenchnlp 0.2.3
State of the art toolchain for natural language processing in French
13 versions - Latest release: almost 4 years ago - 1 dependent repository - 298 downloads last month - 1 star on GitHub - 1 maintainer
simtext 1.3
Text and document similarity computation
5 versions - Latest release: over 3 years ago - 1 dependent repository - 168 downloads last month - 13 stars on GitHub - 1 maintainer
textprepro 0.0.1
Everything Everyway All At Once Text Preprocessing.
2 versions - Latest release: almost 2 years ago - 98 downloads last month - 2 stars on GitHub - 1 maintainer
vadersentiment-swedish 1.0.3
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
3 versions - Latest release: over 5 years ago - 1 dependent repository - 73 downloads last month - 6 stars on GitHub - 1 maintainer
quantulum 0.1.16
Extract quantities from unstructured text.
17 versions - Latest release: over 1 year ago - 1 dependent package - 4 dependent repositories - 283 downloads last month - 120 stars on GitHub - 1 maintainer
Related Keywords
nlp (28), text analysis (23), natural language processing (18), pdf parser (18), pdf converter (16), python (14), layout analysis (14), text-mining (13), sentiment analysis (13), pdf (10), natural-language-processing (10), machine-learning (10), NLP (10), text (9), machine learning (9), sentiment (9), analysis (9), data (8), opinion analysis (8), twitter sentiment (8), opinion mining (8), social media (8), twitter (8), social (8), media (8), mining (8), opinion (8), vader (7), text processing (6), data science (6), spacy (5), data mining (5), information-extraction (5), parsing (5), parser (5), transformers (5), textable (5), orange3 (4), corpus (4), text preprocessing (4), text representation (4), topic-modeling (4), orange (4), sentiment-analysis (4), text analytics (4), orange3 add-on (4), text similarity (3), topic modeling (3), python3 (3), text-classification (3), spacy-models (3), french (3), hacktoberfest (3), spaCy (3), tagging (3), spacy-extension (3), npl (3), spacy-pipeline (3), measurements (3), text visualization (3), bert (3), orange add-on (3), named entity recognition (3), information extraction (3), text-representation (3), text-preprocessing (3), named-entity-recognition (3), natual language processing (3), text-clustering (2), lemmatization (2), ner (2), normalization (2), clustering (2), statistics (2), word embeddings (2), word vectors (2), spacy model (2), dependency-parsing (2), hungarian (2), regular-expressions (2), hunlp (2), units-of-measure (2), morphological-analysis (2), pytorch (2), units (2), quantities (2), keywords-extraction (2), pos-tagger (2), physics (2), text-analytics (2), universal-dependencies (2), pre-trained models (2), nlp-pipeline (2), huspacy (2), Hungarian (2), language processing (2), tokenization (2), sentence boundary detection (2), text-visualization (2), text-processing (2)