pypi.org "text mining" keyword
View the packages on the pypi.org package registry that are tagged with the "text mining" keyword.
grobid-quantities-client 0.4.0
A minimal client for grobid-quantities service.
5 versions - Latest release: over 2 years ago - 2 dependent packages - 2 dependent repositories - 182 downloads last month - 1 maintainer
lingualytics 0.1.3
A multilingual text analytics package.
4 versions - Latest release: over 4 years ago - 3 dependent repositories - 174 downloads last month - 37 stars on GitHub - 1 maintainer
orange-textable-prototypes 0.1.6
Extra widgets for the Textable text analysis package.
6 versions - Latest release: over 8 years ago - 1 dependent repository - 355 downloads last month - 1 star on GitHub - 1 maintainer
Top 0.5% on pypi.org
pdfminer.six 20250416
PDF parser and analyzer
30 versions - Latest release: 3 days ago - 162 dependent packages - 2,496 dependent repositories - 7.75 million downloads last month - 5,784 stars on GitHub - 3 maintainers
bent 0.0.80
BENT: Biomedical Entity Annotator
60 versions - Latest release: 7 months ago - 1 dependent repository - 1.53 thousand downloads last month - 9 stars on GitHub - 1 maintainer
e.pdfminer.six 0.0.1
PDF parser and analyzer
1 version - Latest release: over 5 years ago - 87 downloads last month - 6,370 stars on GitHub - 1 maintainer
20220429-pdfminer-jameslp310 0.0.2
PDF parser and analyzer
1 version - Latest release: almost 3 years ago - 1 dependent repository - 125 downloads last month - 6,330 stars on GitHub - 1 maintainer
pdfminer.six.reducto 0.0.1
PDF parser and analyzer
1 version - Latest release: 5 months ago - 256 downloads last month - 6,370 stars on GitHub - 1 maintainer
pdfminer.bc 0.0.2
PDF parser and analyzer
2 versions - Latest release: 5 months ago - 109 downloads last month - 6,370 stars on GitHub - 1 maintainer
Top 0.7% on pypi.org
pdfminer 20131022
PDF parser and analyzer
41 versions - Latest release: over 1 year ago - 46 dependent packages - 1,423 dependent repositories - 200 thousand downloads last month - 5,244 stars on GitHub - 1 maintainer
easylda 0.2.7
Easily built LDA Topic Models with just a list of docs (e.g. a list of twitter posts in CSV/TXT
48 versions - Latest release: about 7 years ago - 1 dependent repository - 2.82 thousand downloads last month - 3 stars on GitHub - 1 maintainer
bibx 0.6.3
Python bibliometric tools.
21 versions - Latest release: about 1 month ago - 1 dependent repository - 1.21 thousand downloads last month - 10 stars on GitHub - 1 maintainer
textmining-module 2.1.2
A Python Module for Comprehensive Text Mining, including Keyword Extraction and Text Analysis.
7 versions - Latest release: 4 months ago - 182 downloads last month - 0 stars on GitHub - 1 maintainer
slang 0.1.12
A structural approach to signal ML
20 versions - Latest release: over 1 year ago - 2 dependent repositories - 661 downloads last month - 5 stars on GitHub - 1 maintainer
utilss 0.1.7
Useful tools to work with text mining in Python
6 versions - Latest release: almost 5 years ago - 1 dependent repository - 231 downloads last month - 0 stars on GitHub - 1 maintainer
pdfmajor 1.3.13
PDF parser
36 versions - Latest release: over 5 years ago - 1 dependent repository - 1.28 thousand downloads last month - 22 stars on GitHub - 1 maintainer
textana4sc 0.4
Text analysis library supporting word frequency counts, dictionary expansion, sentiment analysis, and more.
4 versions - Latest release: about 2 years ago - 123 downloads last month - 1 maintainer
textpredict 0.1.3
TextPredict is a powerful Python package designed for various text analysis and prediction tasks ...
4 versions - Latest release: 9 months ago - 192 downloads last month - 2 stars on GitHub - 1 maintainer
autocorpus 1.1.0
A tool to standardise text and table data extracted from full text publications.
1 version - Latest release: 3 months ago - 60 downloads last month - 21 stars on GitHub - 1 maintainer
decat 1.0.3
De-concatenate strings that do not have white-spaces.
4 versions - Latest release: almost 2 years ago - 1 dependent repository - 174 downloads last month - 0 stars on GitHub - 1 maintainer
textdescriptives 2.8.4
A library for calculating a variety of features from text using spaCy
38 versions - Latest release: 4 months ago - 4 dependent packages - 1 dependent repository - 4.5 thousand downloads last month - 306 stars on GitHub - 2 maintainers
jgtextrank 0.1.6
Yet another Python implementation of TextRank: package for the creation, manipulation, and study ...
5 versions - Latest release: over 5 years ago - 1 dependent repository - 190 downloads last month - 13 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
shorttext 2.1.0
Short Text Mining
77 versions - Latest release: 4 months ago - 6 dependent repositories - 1.9 thousand downloads last month - 470 stars on GitHub - 1 maintainer
playa-pdf 0.4.1 💰
Parallel and LazY Analyzer for PDFs
18 versions - Latest release: 30 days ago - 3.36 thousand downloads last month - 26 stars on GitHub - 1 maintainer
yapdfminer 1.2.2
PDF parser and analyzer
8 versions - Latest release: over 5 years ago - 1 dependent repository - 329 downloads last month - 2 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
quantulum3 0.9.2 💰
Extract quantities from unstructured text.
41 versions - Latest release: 10 months ago - 8 dependent packages - 44 dependent repositories - 118 thousand downloads last month - 137 stars on GitHub - 1 maintainer
hades-nlp 0.1.2
Homologous Automated Document Exploration and Summarization - A powerful tool for comparing simil...
3 versions - Latest release: over 1 year ago - 124 downloads last month - 8 stars on GitHub - 2 maintainers
sentimentpredictor 0.1.3
A flexible sentiment analysis predictor package supporting multiple pre-trained models, customiza...
4 versions - Latest release: 10 months ago - 146 downloads last month - 2 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
pyxpdf 0.2.3
Powerful and Pythonic PDF processing library based on xpdf-4.02
6 versions - Latest release: over 4 years ago - 3 dependent repositories - 2.01 thousand downloads last month - 41 stars on GitHub - 1 maintainer
seesus 1.2.1
a social, environmental, and economic sustainability classifier based on the UN Sustainable Devel...
4 versions - Latest release: about 1 year ago - 180 downloads last month - 8 stars on GitHub - 1 maintainer
Top 9.5% on pypi.org
augmenty 1.4.4
An augmentation library based on SpaCy for joint augmentation of text and labels.
33 versions - Latest release: about 1 year ago - 4 dependent packages - 1 dependent repository - 4.55 thousand downloads last month - 153 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
texthero 1.1.0
Text preprocessing, representation and visualization from zero to hero.
10 versions - Latest release: almost 4 years ago - 1 dependent package - 29 dependent repositories - 2.35 thousand downloads last month - 2,905 stars on GitHub - 1 maintainer
textherox 1.2.0
Text preprocessing, representation and visualization from zero to hero.
1 version - Latest release: over 2 years ago - 52 downloads last month - 2,905 stars on GitHub - 1 maintainer
pdfminer.rtl 1.0.1
PDF parser and analyzer
8 versions - Latest release: about 1 year ago - 254 downloads last month - 1 maintainer
seededpf 0.1.0
SeededPF is a seed guided topic model based on Poisson factorization.
3 versions - Latest release: about 1 month ago - 248 downloads last month - 0 stars on GitHub - 1 maintainer
multistop 1.3
Stopword lists for text analysis, supporting 15 languages including Chinese, English, German, and French.
1 version - Latest release: almost 3 years ago - 1 dependent repository - 30 downloads last month - 1 maintainer
storynavigator 0.1.1
Narrative analysis add-on for the Orange 3 data mining software package.
31 versions - Latest release: 3 months ago - 712 downloads last month - 3 stars on GitHub - 2 maintainers
leia-br 0.0.1
LeIA (Léxico para Inferência Adaptada) is a fork of the lexicon and tool for analysis of sentiment...
1 version - Latest release: over 2 years ago - 1 dependent repository - 616 downloads last month - 3 stars on GitHub - 1 maintainer
keypartx 0.1.20
A Graph-based Perception(Text) Representation
40 versions - Latest release: almost 2 years ago - 1.17 thousand downloads last month - 33 stars on GitHub - 1 maintainer
swinger 2.1
A sentiment classifier for Chinese
12 versions - Latest release: almost 8 years ago - 1 dependent repository - 221 downloads last month - 36 stars on GitHub - 1 maintainer
pdfminer.six-i 20190823
PDF parser and analyzer
5 versions - Latest release: over 5 years ago - 1 dependent repository - 1.23 thousand downloads last month - 2 maintainers
huspacy-nightly 0.11.0.dev261 💰
HuSpaCy: industrial strength Hungarian natural language processing
126 versions - Latest release: over 1 year ago - 1 dependent repository - 2.25 thousand downloads last month - 155 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
huspacy 0.12.1 💰
HuSpaCy: industrial strength Hungarian natural language processing
23 versions - Latest release: 6 months ago - 1 dependent package - 6 dependent repositories - 2.19 thousand downloads last month - 142 stars on GitHub - 1 maintainer
Top 5.9% on pypi.org
pdfminer2 20151206 💰
PDF parser and analyzer
1 version - Latest release: over 9 years ago - 3 dependent packages - 37 dependent repositories - 53.2 thousand downloads last month - 24 stars on GitHub - 1 maintainer
orange-text 1.2a1
Orange Text Mining add-on for Orange data mining software package.
1 version - Latest release: almost 13 years ago - 1 dependent repository - 42 downloads last month - 2 maintainers
nlpbaselines 0.0.49
Quickly establish strong baselines for NLP tasks
27 versions - Latest release: over 1 year ago - 1 dependent repository - 546 downloads last month - 0 stars on GitHub - 1 maintainer
chemdataextractor-api 0.0.1
Chemdataextractor REST API wrapper
2 versions - Latest release: about 4 years ago - 1 dependent repository - 59 downloads last month - 0 stars on GitHub - 1 maintainer
pdfdocx 1.7
Reads PDF and DOCX files and returns the text they contain.
8 versions - Latest release: over 1 year ago - 1 dependent repository - 505 downloads last month - 4 stars on GitHub - 1 maintainer
lttl 2.1.0
LangTech Text Library (LTTL) for text processing and analysis
24 versions - Latest release: 3 months ago - 1 dependent repository - 4.93 thousand downloads last month - 3 stars on GitHub - 1 maintainer
spacy-wrap 1.4.5
Wrappers for including pre-trained transformers in spaCy pipelines
21 versions - Latest release: over 1 year ago - 3 dependent packages - 1 dependent repository - 1.98 thousand downloads last month - 46 stars on GitHub - 1 maintainer
textanalyze4sc 2.0
Text analysis library supporting word frequency counts, dictionary expansion, sentiment analysis, and more.
7 versions - Latest release: about 2 years ago - 38 downloads last month - 1 maintainer
pdf-wrangler 0.0.31
PDFMiner Wrapper for extractions
9 versions - Latest release: over 3 years ago - 1 dependent repository - 325 downloads last month - 1 star on GitHub - 1 maintainer
mledu 0.0.2
Build machine learning models for educational purposes
4 versions - Latest release: about 7 years ago - 1 dependent repository - 248 downloads last month - 1 maintainer
awessome 0.0.14
awessome
6 versions - Latest release: over 4 years ago - 1 dependent repository - 192 downloads last month - 2 stars on GitHub - 1 maintainer
vader-sentiment 3.2.1
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
2 versions - Latest release: over 6 years ago - 5 dependent repositories - 606 downloads last month - 1 star on GitHub - 1 maintainer
lexis 0.1.2
Wordnet wrapper - Easy access to words and their relationships
5 versions - Latest release: about 2 years ago - 2 dependent packages - 1 dependent repository - 258 downloads last month - 1 star on GitHub - 1 maintainer
gorpy 2.0.4
Grep tool with extensions for reading files in many different ways
9 versions - Latest release: over 1 year ago - 1 dependent repository - 243 downloads last month - 0 stars on GitHub - 1 maintainer
pdfminer-cython 20200304
PDF parser and analyzer
1 version - Latest release: almost 5 years ago - 1 dependent repository - 40 downloads last month - 5,143 stars on GitHub - 1 maintainer
pdfminer-with-logger 1.0.0
PDF parser and analyzer
1 version - Latest release: over 4 years ago - 1 dependent repository - 64 downloads last month - 5,281 stars on GitHub - 1 maintainer
vader-multi 3.2.2 💰
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
2 versions - Latest release: over 5 years ago - 2 dependent repositories - 1.13 thousand downloads last month - 20 stars on GitHub - 1 maintainer
grub 0.1.3
A ridiculously simple search engine factory
17 versions - Latest release: almost 3 years ago - 4 dependent repositories - 774 downloads last month - 3 stars on GitHub - 1 maintainer
textdatasetcleaner 0.0.6
Pipeline for cleaning (preprocessing/normalizing) text datasets
4 versions - Latest release: about 4 years ago - 1 dependent repository - 207 downloads last month - 39 stars on GitHub - 1 maintainer
seedot 3.0
SEEDOT: Tool for Enhancing Sentiment Lexicon with Machine Learning
2 versions - Latest release: almost 2 years ago - 92 downloads last month - 2 stars on GitHub - 1 maintainer
vadernew 2.0
Package containing 4 dictionaries that serve as improvements to Vader for specific topics, ...
9 versions - Latest release: almost 3 years ago - 1 dependent repository - 160 downloads last month - 0 stars on GitHub - 1 maintainer
blauwal3-textable 3.1.11
Text analysis add-on for the 蓝鲸 (Blue Whale) data mining software package.
1 version - Latest release: 7 months ago - 76 downloads last month - 28 stars on GitHub - 1 maintainer
orange3-textable-prototypes 0.31
Additional widgets for the Textable add-on to Orange 3.
26 versions - Latest release: about 1 month ago - 1 dependent repository - 422 downloads last month - 6 stars on GitHub - 1 maintainer
orange3-textable 3.2.3
Textable add-on for Orange 3 data mining software package.
33 versions - Latest release: about 1 month ago - 1 dependent repository - 3.17 thousand downloads last month - 27 stars on GitHub - 1 maintainer
pdfminer.hitalent 20221118
PDF parser and analyzer
20 versions - Latest release: over 2 years ago - 366 downloads last month - 1 maintainer
fummytransformers 0.0.18
Fast and dummy way of using transformers to establish quick baselines
17 versions - Latest release: over 2 years ago - 400 downloads last month - 0 stars on GitHub - 1 maintainer
statcamp 0.0.2
stat funcs from datacamp stat thinking
2 versions - Latest release: about 7 years ago - 1 dependent repository - 112 downloads last month - 0 stars on GitHub - 1 maintainer
orange-textable 2.0.1
Textable add-on for Orange 2.7 data mining software package.
18 versions - Latest release: about 8 years ago - 1 dependent repository - 515 downloads last month - 1 maintainer
contextmining 0.0.6
Complementing topic models with few-shot in-context learning to generate interpretable topics
6 versions - Latest release: 3 months ago - 279 downloads last month - 0 stars on GitHub - 1 maintainer
dsutils 0.2.0
data science utils for data preprocessing for feeding various models, pipelining, time data forma...
20 versions - Latest release: about 7 years ago - 1 dependent repository - 825 downloads last month - 0 stars on GitHub - 1 maintainer
grogu1 1.1
Package containing 4 dictionaries that serve as improvements to Vader for specific topics, ...
1 version - Latest release: almost 2 years ago - 75 downloads last month - 0 stars on GitHub - 1 maintainer
bagofconcepts 0.1.0
This is a Python implementation of Bag-of-Concepts, as proposed by the paper "Bag-of-Concepts: Comp...
2 versions - Latest release: almost 3 years ago - 91 downloads last month - 20 stars on GitHub - 1 maintainer
pdfminer.aemc 20231229
PDF parser and analyzer
9 versions - Latest release: 12 months ago - 1 dependent package - 233 downloads last month - 1 maintainer
pybursts 0.1.1
A Python port of the 'burst detection' algorithm by Kleinberg, originally implemented in R
2 versions - Latest release: over 10 years ago - 4 dependent repositories - 55 downloads last month - 1 maintainer
Top 6.4% on pypi.org
metapy 0.2.13
Python bindings for MeTA
24 versions - Latest release: over 6 years ago - 30 dependent repositories - 1.31 thousand downloads last month - 50 stars on GitHub - 2 maintainers
material-parser 1.2
Grobid superconductors tools material parser
2 versions - Latest release: about 2 years ago - 54 downloads last month - 11 stars on GitHub - 1 maintainer
frenchnlp 0.2.3
State of the art toolchain for natural language processing in French
13 versions - Latest release: almost 4 years ago - 1 dependent repository - 298 downloads last month - 1 star on GitHub - 1 maintainer
simtext 1.3
Text and document similarity computation
5 versions - Latest release: over 3 years ago - 1 dependent repository - 168 downloads last month - 13 stars on GitHub - 1 maintainer
textprepro 0.0.1
Everything Everyway All At Once Text Preprocessing.
2 versions - Latest release: almost 2 years ago - 98 downloads last month - 2 stars on GitHub - 1 maintainer
vadersentiment-swedish 1.0.3
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
3 versions - Latest release: over 5 years ago - 1 dependent repository - 73 downloads last month - 6 stars on GitHub - 1 maintainer
quantulum 0.1.16
Extract quantities from unstructured text.
17 versions - Latest release: over 1 year ago - 1 dependent package - 4 dependent repositories - 283 downloads last month - 120 stars on GitHub - 1 maintainer
Related Keywords
nlp (28)
text analysis (23)
natural language processing (18)
pdf parser (18)
pdf converter (16)
python (14)
layout analysis (14)
text-mining (13)
sentiment analysis (13)
pdf (10)
natural-language-processing (10)
machine-learning (10)
NLP (10)
text (9)
machine learning (9)
sentiment (9)
analysis (9)
data (8)
opinion analysis (8)
twitter sentiment (8)
opinion mining (8)
social media (8)
twitter (8)
social (8)
media (8)
mining (8)
opinion (8)
vader (7)
text processing (6)
data science (6)
spacy (5)
data mining (5)
information-extraction (5)
parsing (5)
parser (5)
transformers (5)
textable (5)
orange3 (4)
corpus (4)
text preprocessing (4)
text representation (4)
topic-modeling (4)
orange (4)
sentiment-analysis (4)
text analytics (4)
orange3 add-on (4)
text similarity (3)
topic modeling (3)
python3 (3)
text-classification (3)
spacy-models (3)
french (3)
hacktoberfest (3)
spaCy (3)
tagging (3)
spacy-extension (3)
npl (3)
spacy-pipeline (3)
measurements (3)
text visualization (3)
bert (3)
orange add-on (3)
named entity recognition (3)
information extraction (3)
text-representation (3)
text-preprocessing (3)
named-entity-recognition (3)
natual language processing (3)
text-clustering (2)
lemmatization (2)
ner (2)
normalization (2)
clustering (2)
statistics (2)
word embeddings (2)
word vectors (2)
spacy model (2)
dependency-parsing (2)
hungarian (2)
regular-expressions (2)
hunlp (2)
units-of-measure (2)
morphological-analysis (2)
pytorch (2)
units (2)
quantities (2)
keywords-extraction (2)
pos-tagger (2)
physics (2)
text-analytics (2)
universal-dependencies (2)
pre-trained models (2)
nlp-pipeline (2)
huspacy (2)
Hungarian (2)
language processing (2)
tokenization (2)
sentence boundary detection (2)
text-visualization (2)
text-processing (2)