An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "text mining" keyword

View the packages on the pypi.org package registry that are tagged with the "text mining" keyword.

huspacy-nightly 0.11.0.dev261 💰
HuSpaCy: industrial strength Hungarian natural language processing
126 versions - Latest release: over 1 year ago - 1 dependent repository - 165 downloads last month - 155 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
huspacy 0.12.1 💰
HuSpaCy: industrial strength Hungarian natural language processing
23 versions - Latest release: 9 months ago - 1 dependent package - 6 dependent repositories - 874 downloads last month - 142 stars on GitHub - 1 maintainer
textdescriptives 2.8.4
A library for calculating a variety of features from text using spaCy
38 versions - Latest release: 8 months ago - 4 dependent packages - 1 dependent repository - 2.31 thousand downloads last month - 306 stars on GitHub - 2 maintainers
Top 0.7% on pypi.org
pdfminer 20131022
PDF parser and analyzer
41 versions - Latest release: almost 2 years ago - 46 dependent packages - 1,423 dependent repositories - 208 thousand downloads last month - 5,244 stars on GitHub - 1 maintainer
Top 9.5% on pypi.org
augmenty 1.4.4
An augmentation library based on SpaCy for joint augmentation of text and labels.
33 versions - Latest release: over 1 year ago - 4 dependent packages - 1 dependent repository - 3.38 thousand downloads last month - 156 stars on GitHub - 1 maintainer
seesus 1.2.1
a social, environmental, and economic sustainability classifier based on the UN Sustainable Devel...
4 versions - Latest release: over 1 year ago - 33 downloads last month - 8 stars on GitHub - 1 maintainer
Top 0.5% on pypi.org
pdfminer.six 20250506
PDF parser and analyzer
31 versions - Latest release: 3 months ago - 162 dependent packages - 2,496 dependent repositories - 10.3 million downloads last month - 5,784 stars on GitHub - 3 maintainers
Top 5.9% on pypi.org
pdfminer2 20151206 💰
PDF parser and analyzer
1 version - Latest release: over 9 years ago - 3 dependent packages - 37 dependent repositories - 78.2 thousand downloads last month - 24 stars on GitHub - 1 maintainer
pmcgrab 0.2.1
Utilities for fetching and processing PubMed Central articles
2 versions - Latest release: about 21 hours ago - 0 stars on GitHub - 1 maintainer
orange-fastcoref-plugin 0.1.1
FastCoRef coreference resolution plugin for Orange3
2 versions - Latest release: about 1 month ago - 164 stars on GitHub - 1 maintainer
seedot 3.0
SEEDOT: Tool for Enhancing Sentiment Lexicon with Machine Learning
2 versions - Latest release: about 2 years ago - 19 downloads last month - 2 stars on GitHub - 1 maintainer
fummytransformers 0.0.18
Fast and dummy way of using transformers to establish quick baselines
17 versions - Latest release: almost 3 years ago - 25 downloads last month - 0 stars on GitHub - 1 maintainer
frenchnlp 0.2.3
State of the art toolchain for natural language processing in French
13 versions - Latest release: about 4 years ago - 1 dependent repository - 25 downloads last month - 1 star on GitHub - 1 maintainer
storynavigator 0.1.1
Narrative analysis add-on for the Orange 3 data mining software package.
31 versions - Latest release: 6 months ago - 77 downloads last month - 3 stars on GitHub - 2 maintainers
grobid-quantities-client 0.4.0
A minimal client for grobid-quantities service.
5 versions - Latest release: over 2 years ago - 2 dependent packages - 2 dependent repositories - 37 downloads last month - 1 maintainer
lingualytics 0.1.3
A multilingual text analytics package.
4 versions - Latest release: almost 5 years ago - 3 dependent repositories - 38 downloads last month - 37 stars on GitHub - 1 maintainer
dsutils 0.2.0
data science utils for data preprocessing for feeding various models, pipelining, time data forma...
20 versions - Latest release: over 7 years ago - 1 dependent repository - 205 downloads last month - 0 stars on GitHub - 1 maintainer
orange3-textable-prototypes 0.31
Additional widgets for the Textable add-on to Orange 3.
26 versions - Latest release: 4 months ago - 1 dependent repository - 143 downloads last month - 6 stars on GitHub - 1 maintainer
orange-textable 2.0.1
Textable add-on for Orange 2.7 data mining software package.
18 versions - Latest release: over 8 years ago - 1 dependent repository - 161 downloads last month - 1 maintainer
autocorpus 1.1.1
A tool to standardise text and table data extracted from full text publications.
2 versions - Latest release: about 2 months ago - 31 downloads last month - 21 stars on GitHub - 1 maintainer
sentimentpredictor 0.1.3
A flexible sentiment analysis predictor package supporting multiple pre-trained models, customiza...
4 versions - Latest release: about 1 year ago - 23 downloads last month - 2 stars on GitHub - 1 maintainer
jgtextrank 0.1.6
Yet another Python implementation of TextRank: package for the creation, manipulation, and study ...
5 versions - Latest release: over 5 years ago - 1 dependent repository - 45 downloads last month - 13 stars on GitHub - 1 maintainer
Top 6.4% on pypi.org
metapy 0.2.13
Python bindings for MeTA
24 versions - Latest release: almost 7 years ago - 30 dependent repositories - 1.37 thousand downloads last month - 51 stars on GitHub - 2 maintainers
bibx 0.7.1
Python bibliometric tools.
23 versions - Latest release: about 1 month ago - 1 dependent repository - 429 downloads last month - 10 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
shorttext 2.2.1
Short Text Mining
80 versions - Latest release: about 2 months ago - 6 dependent repositories - 209 downloads last month - 472 stars on GitHub - 1 maintainer
simtext 1.3
Text and document similarity computation
5 versions - Latest release: almost 4 years ago - 1 dependent repository - 52 downloads last month - 12 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
texthero 1.1.0
Text preprocessing, representation and visualization from zero to hero.
10 versions - Latest release: about 4 years ago - 1 dependent package - 29 dependent repositories - 1.06 thousand downloads last month - 2,905 stars on GitHub - 1 maintainer
vadernew 2.0
Package containing 4 dictionaries that serve as an improvement to VADER for specific topics, ...
9 versions - Latest release: about 3 years ago - 1 dependent repository - 23 downloads last month - 0 stars on GitHub - 1 maintainer
textherox 1.2.0
Text preprocessing, representation and visualization from zero to hero.
1 version - Latest release: almost 3 years ago - 10 downloads last month - 2,905 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
quantulum3 0.9.2 💰
Extract quantities from unstructured text.
41 versions - Latest release: about 1 year ago - 8 dependent packages - 44 dependent repositories - 134 thousand downloads last month - 142 stars on GitHub - 1 maintainer
orange-textable-prototypes 0.1.6
Extra widgets for the Textable text analysis package.
6 versions - Latest release: almost 9 years ago - 1 dependent repository - 45 downloads last month - 1 star on GitHub - 1 maintainer
quantulum 0.1.16
Extract quantities from unstructured text.
17 versions - Latest release: almost 2 years ago - 1 dependent package - 4 dependent repositories - 38 downloads last month - 119 stars on GitHub - 1 maintainer
playa-pdf 0.6.2 💰
Parallel and LazY Analyzer for PDFs
25 versions - Latest release: 11 days ago - 12 thousand downloads last month - 31 stars on GitHub - 1 maintainer
pybursts 0.1.1
A Python port of the 'burst detection' algorithm by Kleinberg, originally implemented in R
2 versions - Latest release: over 10 years ago - 4 dependent repositories - 17 downloads last month - 1 maintainer
contextmining 0.0.6
Complementing topic models with few-shot in-context learning to generate interpretable topics
6 versions - Latest release: 7 months ago - 59 downloads last month - 0 stars on GitHub - 1 maintainer
slang 0.1.12
A structural approach to signal ML
20 versions - Latest release: almost 2 years ago - 2 dependent repositories - 274 downloads last month - 5 stars on GitHub - 1 maintainer
architxt 0.3.1
ArchiTXT is a tool for structuring textual data into a valid database model. It is guided by a me...
6 versions - Latest release: 17 days ago - 169 downloads last month - 3 stars on GitHub - 1 maintainer
textmining-module 2.1.2
A Python Module for Comprehensive Text Mining, including Keyword Extraction and Text Analysis.
7 versions - Latest release: 8 months ago - 16 downloads last month - 0 stars on GitHub - 1 maintainer
vadersentiment-swedish 1.0.3
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
3 versions - Latest release: almost 6 years ago - 1 dependent repository - 12 downloads last month - 6 stars on GitHub - 1 maintainer
bent 0.0.80
BENT: Biomedical Entity Annotator
60 versions - Latest release: 10 months ago - 1 dependent repository - 425 downloads last month - 9 stars on GitHub - 1 maintainer
hades-nlp 0.1.2
Homologous Automated Document Exploration and Summarization - A powerful tool for comparing simil...
3 versions - Latest release: almost 2 years ago - 19 downloads last month - 8 stars on GitHub - 2 maintainers
decat 1.0.3
De-concatenate strings that do not have white-spaces.
4 versions - Latest release: about 2 years ago - 1 dependent repository - 28 downloads last month - 0 stars on GitHub - 1 maintainer
yapdfminer 1.2.2
PDF parser and analyzer
8 versions - Latest release: almost 6 years ago - 1 dependent repository - 70 downloads last month - 2 stars on GitHub - 1 maintainer
analytics_tasks 0.1.0
Automation including file search and slide deck preparation.
1 version - Latest release: 30 days ago - 118 downloads last month - 0 stars on GitHub - 1 maintainer
keypartx 0.1.20
A Graph-based Perception(Text) Representation
40 versions - Latest release: over 2 years ago - 240 downloads last month - 33 stars on GitHub - 1 maintainer
textminer-prozzzzxcv 0.1.0
A Python package for advanced text analysis
1 version - Latest release: about 2 months ago - 117 downloads last month - 1 maintainer
pdfmajor 1.3.13
PDF parser
36 versions - Latest release: over 5 years ago - 1 dependent repository - 332 downloads last month - 22 stars on GitHub - 1 maintainer
text2term 4.5.0
A tool for mapping free-text descriptions of entities to ontology terms
29 versions - Latest release: 2 months ago - 1 dependent repository - 647 downloads last month - 11 stars on GitHub - 2 maintainers
textana4sc 0.4
Text analysis library supporting word frequency statistics, dictionary expansion, sentiment analysis, and more
4 versions - Latest release: over 2 years ago - 11 downloads last month - 1 maintainer
textpredict 0.1.3
TextPredict is a powerful Python package designed for various text analysis and prediction tasks ...
4 versions - Latest release: about 1 year ago - 16 downloads last month - 2 stars on GitHub - 1 maintainer
textprepro 0.0.1
Everything Everyway All At Once Text Preprocessing.
2 versions - Latest release: about 2 years ago - 12 downloads last month - 2 stars on GitHub - 1 maintainer
leia-br 0.0.1
LeIA (Léxico para Inferência Adaptada) is a fork of the lexicon and sentiment analysis tool...
1 version - Latest release: over 2 years ago - 1 dependent repository - 1.2 thousand downloads last month - 3 stars on GitHub - 1 maintainer
pdfminer.rtl 1.0.1
PDF parser and analyzer
8 versions - Latest release: over 1 year ago - 76 downloads last month - 1 maintainer
pdf-wrangler 0.0.31
PDFMiner Wrapper for extractions
9 versions - Latest release: over 3 years ago - 1 dependent repository - 57 downloads last month - 1 star on GitHub - 1 maintainer
multistop 1.3
Stopword lists for text analysis, supporting 15 languages including Chinese, English, German, and French.
1 version - Latest release: about 3 years ago - 1 dependent repository - 20 downloads last month - 1 maintainer
seededpf 0.1.0
SeededPF is a seed guided topic model based on Poisson factorization.
3 versions - Latest release: 5 months ago - 22 downloads last month - 0 stars on GitHub - 1 maintainer
spacy-wrap 1.4.5
Wrappers for including pre-trained transformers in spaCy pipelines
21 versions - Latest release: almost 2 years ago - 3 dependent packages - 1 dependent repository - 633 downloads last month - 46 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
pyxpdf 0.2.3
Powerful and Pythonic PDF processing library based on xpdf-4.02
6 versions - Latest release: almost 5 years ago - 3 dependent repositories - 775 downloads last month - 42 stars on GitHub - 1 maintainer
lexis 0.1.2
Wordnet wrapper - Easy access to words and their relationships
5 versions - Latest release: over 2 years ago - 2 dependent packages - 1 dependent repository - 101 downloads last month - 1 star on GitHub - 1 maintainer
pdfminer-with-logger 1.0.0
PDF parser and analyzer
1 version - Latest release: over 4 years ago - 1 dependent repository - 18 downloads last month - 5,289 stars on GitHub - 1 maintainer
chemdataextractor-api 0.0.1
Chemdataextractor REST API wrapper
2 versions - Latest release: over 4 years ago - 1 dependent repository - 6 downloads last month - 0 stars on GitHub - 1 maintainer
nlpbaselines 0.0.49
Quickly establish strong baselines for NLP tasks
27 versions - Latest release: over 1 year ago - 1 dependent repository - 98 downloads last month - 0 stars on GitHub - 1 maintainer
mledu 0.0.2
Build machine learning models for educational purposes
4 versions - Latest release: over 7 years ago - 1 dependent repository - 52 downloads last month - 1 maintainer
pdfminer-cython 20200304
PDF parser and analyzer
1 version - Latest release: about 5 years ago - 1 dependent repository - 7 downloads last month - 5,143 stars on GitHub - 1 maintainer
swinger 2.1
A sentiment classifier for Chinese
12 versions - Latest release: about 8 years ago - 1 dependent repository - 22 downloads last month - 36 stars on GitHub - 1 maintainer
pdfminer.six-i 20190823
PDF parser and analyzer
5 versions - Latest release: almost 6 years ago - 1 dependent repository - 1.25 thousand downloads last month - 2 maintainers
vader-multi 3.2.2 💰
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
2 versions - Latest release: almost 6 years ago - 2 dependent repositories - 1.43 thousand downloads last month - 22 stars on GitHub - 1 maintainer
orange-text 1.2a1
Orange Text Mining add-on for Orange data mining software package.
1 version - Latest release: about 13 years ago - 1 dependent repository - 27 downloads last month - 2 maintainers
blauwal3-textable 3.1.11
Text analysis add-on for the Blue Whale (蓝鲸) data mining software package.
1 version - Latest release: 10 months ago - 22 downloads last month - 29 stars on GitHub - 1 maintainer
lttl 2.1.0
LangTech Text Library (LTTL) for text processing and analysis
24 versions - Latest release: 6 months ago - 1 dependent repository - 7.31 thousand downloads last month - 3 stars on GitHub - 1 maintainer
pdfdocx 1.7
Reads PDF and DOCX files and returns the text data they contain.
8 versions - Latest release: almost 2 years ago - 1 dependent repository - 237 downloads last month - 5 stars on GitHub - 1 maintainer
utilss 0.1.7
Useful tools to work with text mining in Python
6 versions - Latest release: about 5 years ago - 1 dependent repository - 32 downloads last month - 0 stars on GitHub - 1 maintainer
textanalyze4sc 2.0
Text analysis library supporting word frequency statistics, dictionary expansion, sentiment analysis, and more
7 versions - Latest release: over 2 years ago - 7 downloads last month - 1 maintainer
orange3-textable 3.2.4
Textable add-on for Orange 3 data mining software package.
34 versions - Latest release: about 2 months ago - 1 dependent repository - 7.72 thousand downloads last month - 29 stars on GitHub - 1 maintainer
vader-sentiment 3.2.1
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
2 versions - Latest release: over 6 years ago - 5 dependent repositories - 458 downloads last month - 1 star on GitHub - 1 maintainer
gorpy 2.0.4
Grep tool with extensions for reading files in many different ways
9 versions - Latest release: almost 2 years ago - 1 dependent repository - 32 downloads last month - 0 stars on GitHub - 1 maintainer
grogu1 1.1
Package containing 4 dictionaries that serve as an improvement to VADER for specific topics, ...
1 version - Latest release: about 2 years ago - 14 downloads last month - 0 stars on GitHub - 1 maintainer
statcamp 0.0.2
stat funcs from datacamp stat thinking
2 versions - Latest release: over 7 years ago - 1 dependent repository - 28 downloads last month - 0 stars on GitHub - 1 maintainer
awessome 0.0.14
awessome
6 versions - Latest release: almost 5 years ago - 1 dependent repository - 58 downloads last month - 2 stars on GitHub - 1 maintainer
20220429-pdfminer-jameslp310 0.0.2
PDF parser and analyzer
1 version - Latest release: over 3 years ago - 1 dependent repository - 121 downloads last month - 6,503 stars on GitHub - 1 maintainer
pdfminer.bc 0.0.2
PDF parser and analyzer
2 versions - Latest release: 9 months ago - 12 downloads last month - 6,503 stars on GitHub - 1 maintainer
pdfminer.six.reducto 0.0.1
PDF parser and analyzer
1 version - Latest release: 9 months ago - 577 downloads last month - 6,503 stars on GitHub - 1 maintainer
e.pdfminer.six 0.0.1
PDF parser and analyzer
1 version - Latest release: over 5 years ago - 73 downloads last month - 6,503 stars on GitHub - 1 maintainer
textdatasetcleaner 0.0.6
Pipeline for cleaning (preprocessing/normalizing) text datasets
4 versions - Latest release: over 4 years ago - 1 dependent repository - 17 downloads last month - 40 stars on GitHub - 1 maintainer
pdfminer.aemc 20231229
PDF parser and analyzer
9 versions - Latest release: about 1 year ago - 1 dependent package - 63 downloads last month - 1 maintainer
grub 0.1.4
A ridiculously simple search engine factory
18 versions - Latest release: 3 months ago - 4 dependent repositories - 406 downloads last month - 3 stars on GitHub - 1 maintainer
pdfminer.hitalent 20221118
PDF parser and analyzer
20 versions - Latest release: over 2 years ago - 132 downloads last month - 1 maintainer
easylda 0.2.7
Easily build LDA topic models with just a list of docs (e.g. a list of Twitter posts in CSV/TXT
48 versions - Latest release: over 7 years ago - 1 dependent repository - 535 downloads last month - 3 stars on GitHub - 1 maintainer
bagofconcepts 0.1.0
This is python implementation of Bag-of-Concepts, as proposed by the paper "Bag-of-Concepts: Comp...
2 versions - Latest release: about 3 years ago - 17 downloads last month - 20 stars on GitHub - 1 maintainer
material-parser 1.2
Grobid superconductors tools material parser
2 versions - Latest release: over 2 years ago - 7 downloads last month - 12 stars on GitHub - 1 maintainer
Related Keywords
nlp (30), text analysis (25), natural language processing (19), pdf parser (18), python (17), pdf converter (16), text-mining (15), layout analysis (14), sentiment analysis (13), pdf (11), analysis (10), NLP (10), natural-language-processing (10), text (10), machine-learning (10), machine learning (9), sentiment (9), media (8), social (8), twitter (8), social media (8), opinion mining (8), twitter sentiment (8), opinion analysis (8), data (8), mining (8), opinion (8), vader (7), text processing (6), data science (6), orange3 add-on (5), parser (5), orange (5), parsing (5), transformers (5), textable (5), spacy (5), information-extraction (5), data mining (5), sentiment-analysis (4), topic-modeling (4), text representation (4), text preprocessing (4), text analytics (4), orange3 (4), corpus (4), python3 (3), text-preprocessing (3), text similarity (3), text-representation (3), information extraction (3), topic modeling (3), text-analysis (3), hacktoberfest (3), orange add-on (3), natual language processing (3), text visualization (3), measurements (3), bert (3), npl (3), french (3), tagging (3), named-entity-recognition (3), spacy-extension (3), text-classification (3), named entity recognition (3), spacy-models (3), spacy-pipeline (3), spaCy (3), sentence boundary detection (2), nlp-pipeline (2), text-clustering (2), pos-tagger (2), data-analysis (2), universal-dependencies (2), units-of-measure (2), morphological-analysis (2), units (2), quantities (2), hunlp (2), pre-trained models (2), text-processing (2), lemmatization (2), tokenization (2), language processing (2), text-visualization (2), texthero (2), ner (2), word embeddings (2), word vectors (2), spacy model (2), word-embeddings (2), pytorch (2), physics (2), Hungarian (2), dependency-parsing (2), hungarian (2), Natural Language Processing (2), normalization (2), text-analytics (2)