pypi.org "text mining" keyword
View the packages on the pypi.org package registry that are tagged with the "text mining" keyword.
huspacy-nightly 0.11.0.dev261 💰
HuSpaCy: industrial strength Hungarian natural language processing
126 versions - Latest release: over 1 year ago - 1 dependent repository - 165 downloads last month - 155 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
huspacy 0.12.1 💰
HuSpaCy: industrial strength Hungarian natural language processing
23 versions - Latest release: 9 months ago - 1 dependent package - 6 dependent repositories - 874 downloads last month - 142 stars on GitHub - 1 maintainer
textdescriptives 2.8.4
A library for calculating a variety of features from text using spaCy
38 versions - Latest release: 8 months ago - 4 dependent packages - 1 dependent repository - 2.31 thousand downloads last month - 306 stars on GitHub - 2 maintainers
Top 0.7% on pypi.org
pdfminer 20131022
PDF parser and analyzer
41 versions - Latest release: almost 2 years ago - 46 dependent packages - 1,423 dependent repositories - 208 thousand downloads last month - 5,244 stars on GitHub - 1 maintainer
Top 9.5% on pypi.org
augmenty 1.4.4
An augmentation library based on SpaCy for joint augmentation of text and labels.
33 versions - Latest release: over 1 year ago - 4 dependent packages - 1 dependent repository - 3.38 thousand downloads last month - 156 stars on GitHub - 1 maintainer
seesus 1.2.1
a social, environmental, and economic sustainability classifier based on the UN Sustainable Devel...
4 versions - Latest release: over 1 year ago - 33 downloads last month - 8 stars on GitHub - 1 maintainer
Top 0.5% on pypi.org
pdfminer.six 20250506
PDF parser and analyzer
31 versions - Latest release: 3 months ago - 162 dependent packages - 2,496 dependent repositories - 10.3 million downloads last month - 5,784 stars on GitHub - 3 maintainers
Top 5.9% on pypi.org
pdfminer2 20151206 💰
PDF parser and analyzer
1 version - Latest release: over 9 years ago - 3 dependent packages - 37 dependent repositories - 78.2 thousand downloads last month - 24 stars on GitHub - 1 maintainer
pmcgrab 0.2.1
Utilities for fetching and processing PubMed Central articles
2 versions - Latest release: about 21 hours ago - 0 stars on GitHub - 1 maintainer
orange-fastcoref-plugin 0.1.1
FastCoRef coreference resolution plugin for Orange32 versions - Latest release: about 1 month ago - 164 stars on GitHub - 1 maintainer
seedot 3.0
SEEDOT: Tool for Enhancing Sentiment Lexicon with Machine Learning
2 versions - Latest release: about 2 years ago - 19 downloads last month - 2 stars on GitHub - 1 maintainer
fummytransformers 0.0.18
Fast and dummy way of using transformers to establish quick baselines
17 versions - Latest release: almost 3 years ago - 25 downloads last month - 0 stars on GitHub - 1 maintainer
frenchnlp 0.2.3
State of the art toolchain for natural language processing in French
13 versions - Latest release: about 4 years ago - 1 dependent repository - 25 downloads last month - 1 star on GitHub - 1 maintainer
storynavigator 0.1.1
Narrative analysis add-on for the Orange 3 data mining software package.
31 versions - Latest release: 6 months ago - 77 downloads last month - 3 stars on GitHub - 2 maintainers
grobid-quantities-client 0.4.0
A minimal client for grobid-quantities service.
5 versions - Latest release: over 2 years ago - 2 dependent packages - 2 dependent repositories - 37 downloads last month - 1 maintainer
lingualytics 0.1.3
A multilingual text analytics package.
4 versions - Latest release: almost 5 years ago - 3 dependent repositories - 38 downloads last month - 37 stars on GitHub - 1 maintainer
dsutils 0.2.0
data science utils for data preprocessing for feeding various models, pipelining, time data forma...
20 versions - Latest release: over 7 years ago - 1 dependent repository - 205 downloads last month - 0 stars on GitHub - 1 maintainer
orange3-textable-prototypes 0.31
Additional widgets for the Textable add-on to Orange 3.
26 versions - Latest release: 4 months ago - 1 dependent repository - 143 downloads last month - 6 stars on GitHub - 1 maintainer
orange-textable 2.0.1
Textable add-on for Orange 2.7 data mining software package.
18 versions - Latest release: over 8 years ago - 1 dependent repository - 161 downloads last month - 1 maintainer
autocorpus 1.1.1
A tool to standardise text and table data extracted from full text publications.
2 versions - Latest release: about 2 months ago - 31 downloads last month - 21 stars on GitHub - 1 maintainer
sentimentpredictor 0.1.3
A flexible sentiment analysis predictor package supporting multiple pre-trained models, customiza...
4 versions - Latest release: about 1 year ago - 23 downloads last month - 2 stars on GitHub - 1 maintainer
jgtextrank 0.1.6
Yet another Python implementation of TextRank: package for the creation, manipulation, and study ...
5 versions - Latest release: over 5 years ago - 1 dependent repository - 45 downloads last month - 13 stars on GitHub - 1 maintainer
Top 6.4% on pypi.org
metapy 0.2.13
Python bindings for MeTA
24 versions - Latest release: almost 7 years ago - 30 dependent repositories - 1.37 thousand downloads last month - 51 stars on GitHub - 2 maintainers
bibx 0.7.1
Python bibliometric tools.
23 versions - Latest release: about 1 month ago - 1 dependent repository - 429 downloads last month - 10 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
shorttext 2.2.1
Short Text Mining
80 versions - Latest release: about 2 months ago - 6 dependent repositories - 209 downloads last month - 472 stars on GitHub - 1 maintainer
simtext 1.3
Text and document similarity computation
5 versions - Latest release: almost 4 years ago - 1 dependent repository - 52 downloads last month - 12 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
texthero 1.1.0
Text preprocessing, representation and visualization from zero to hero.
10 versions - Latest release: about 4 years ago - 1 dependent package - 29 dependent repositories - 1.06 thousand downloads last month - 2,905 stars on GitHub - 1 maintainer
vadernew 2.0
Package containing 4 dictionaries that serve as improvements to VADER for specific topics, ...
9 versions - Latest release: about 3 years ago - 1 dependent repository - 23 downloads last month - 0 stars on GitHub - 1 maintainer
textherox 1.2.0
Text preprocessing, representation and visualization from zero to hero.
1 version - Latest release: almost 3 years ago - 10 downloads last month - 2,905 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
quantulum3 0.9.2 💰
Extract quantities from unstructured text.
41 versions - Latest release: about 1 year ago - 8 dependent packages - 44 dependent repositories - 134 thousand downloads last month - 142 stars on GitHub - 1 maintainer
orange-textable-prototypes 0.1.6
Extra widgets for the Textable text analysis package.
6 versions - Latest release: almost 9 years ago - 1 dependent repository - 45 downloads last month - 1 star on GitHub - 1 maintainer
quantulum 0.1.16
Extract quantities from unstructured text.
17 versions - Latest release: almost 2 years ago - 1 dependent package - 4 dependent repositories - 38 downloads last month - 119 stars on GitHub - 1 maintainer
playa-pdf 0.6.2 💰
Parallel and LazY Analyzer for PDFs
25 versions - Latest release: 11 days ago - 12 thousand downloads last month - 31 stars on GitHub - 1 maintainer
pybursts 0.1.1
A Python port of the 'burst detection' algorithm by Kleinberg, originally implemented in R
2 versions - Latest release: over 10 years ago - 4 dependent repositories - 17 downloads last month - 1 maintainer
contextmining 0.0.6
Complementing topic models with few-shot in-context learning to generate interpretable topics
6 versions - Latest release: 7 months ago - 59 downloads last month - 0 stars on GitHub - 1 maintainer
slang 0.1.12
A structural approach to signal ML
20 versions - Latest release: almost 2 years ago - 2 dependent repositories - 274 downloads last month - 5 stars on GitHub - 1 maintainer
architxt 0.3.1
ArchiTXT is a tool for structuring textual data into a valid database model. It is guided by a me...
6 versions - Latest release: 17 days ago - 169 downloads last month - 3 stars on GitHub - 1 maintainer
textmining-module 2.1.2
A Python Module for Comprehensive Text Mining, including Keyword Extraction and Text Analysis.
7 versions - Latest release: 8 months ago - 16 downloads last month - 0 stars on GitHub - 1 maintainer
vadersentiment-swedish 1.0.3
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
3 versions - Latest release: almost 6 years ago - 1 dependent repository - 12 downloads last month - 6 stars on GitHub - 1 maintainer
bent 0.0.80
BENT: Biomedical Entity Annotator
60 versions - Latest release: 10 months ago - 1 dependent repository - 425 downloads last month - 9 stars on GitHub - 1 maintainer
hades-nlp 0.1.2
Homologous Automated Document Exploration and Summarization - A powerful tool for comparing simil...
3 versions - Latest release: almost 2 years ago - 19 downloads last month - 8 stars on GitHub - 2 maintainers
decat 1.0.3
De-concatenate strings that do not have white-spaces.
4 versions - Latest release: about 2 years ago - 1 dependent repository - 28 downloads last month - 0 stars on GitHub - 1 maintainer
yapdfminer 1.2.2
PDF parser and analyzer
8 versions - Latest release: almost 6 years ago - 1 dependent repository - 70 downloads last month - 2 stars on GitHub - 1 maintainer
analytics_tasks 0.1.0
Automation including file search and slide deck preparation.
1 version - Latest release: 30 days ago - 118 downloads last month - 0 stars on GitHub - 1 maintainer
keypartx 0.1.20
A Graph-based Perception(Text) Representation
40 versions - Latest release: over 2 years ago - 240 downloads last month - 33 stars on GitHub - 1 maintainer
textminer-prozzzzxcv 0.1.0
Python package for advanced text analysis
1 version - Latest release: about 2 months ago - 117 downloads last month - 1 maintainer
pdfmajor 1.3.13
PDF parser
36 versions - Latest release: over 5 years ago - 1 dependent repository - 332 downloads last month - 22 stars on GitHub - 1 maintainer
text2term 4.5.0
A tool for mapping free-text descriptions of entities to ontology terms
29 versions - Latest release: 2 months ago - 1 dependent repository - 647 downloads last month - 11 stars on GitHub - 2 maintainers
textana4sc 0.4
Text analysis library supporting word-frequency counts, dictionary expansion, sentiment analysis, and more
4 versions - Latest release: over 2 years ago - 11 downloads last month - 1 maintainer
textpredict 0.1.3
TextPredict is a powerful Python package designed for various text analysis and prediction tasks ...
4 versions - Latest release: about 1 year ago - 16 downloads last month - 2 stars on GitHub - 1 maintainer
textprepro 0.0.1
Everything Everyway All At Once Text Preprocessing.
2 versions - Latest release: about 2 years ago - 12 downloads last month - 2 stars on GitHub - 1 maintainer
leia-br 0.0.1
LeIA (Léxico para Inferência Adaptada) is a fork of the lexicon and tool for sentiment analysis...
1 version - Latest release: over 2 years ago - 1 dependent repository - 1.2 thousand downloads last month - 3 stars on GitHub - 1 maintainer
pdfminer.rtl 1.0.1
PDF parser and analyzer
8 versions - Latest release: over 1 year ago - 76 downloads last month - 1 maintainer
pdf-wrangler 0.0.31
PDFMiner Wrapper for extractions
9 versions - Latest release: over 3 years ago - 1 dependent repository - 57 downloads last month - 1 star on GitHub - 1 maintainer
multistop 1.3
Stop-word lists for text analysis, supporting 15 languages including Chinese, English, German, and French.
1 version - Latest release: about 3 years ago - 1 dependent repository - 20 downloads last month - 1 maintainer
seededpf 0.1.0
SeededPF is a seed guided topic model based on Poisson factorization.
3 versions - Latest release: 5 months ago - 22 downloads last month - 0 stars on GitHub - 1 maintainer
spacy-wrap 1.4.5
Wrappers for including pre-trained transformers in spaCy pipelines
21 versions - Latest release: almost 2 years ago - 3 dependent packages - 1 dependent repository - 633 downloads last month - 46 stars on GitHub - 1 maintainer
Top 9.9% on pypi.org
pyxpdf 0.2.3
Powerful and Pythonic PDF processing library based on xpdf-4.02
6 versions - Latest release: almost 5 years ago - 3 dependent repositories - 775 downloads last month - 42 stars on GitHub - 1 maintainer
lexis 0.1.2
Wordnet wrapper - Easy access to words and their relationships
5 versions - Latest release: over 2 years ago - 2 dependent packages - 1 dependent repository - 101 downloads last month - 1 star on GitHub - 1 maintainer
pdfminer-with-logger 1.0.0
PDF parser and analyzer
1 version - Latest release: over 4 years ago - 1 dependent repository - 18 downloads last month - 5,289 stars on GitHub - 1 maintainer
chemdataextractor-api 0.0.1
Chemdataextractor REST API wrapper
2 versions - Latest release: over 4 years ago - 1 dependent repository - 6 downloads last month - 0 stars on GitHub - 1 maintainer
nlpbaselines 0.0.49
Quickly establish strong baselines for NLP tasks
27 versions - Latest release: over 1 year ago - 1 dependent repository - 98 downloads last month - 0 stars on GitHub - 1 maintainer
mledu 0.0.2
build machine learning models for education purpose
4 versions - Latest release: over 7 years ago - 1 dependent repository - 52 downloads last month - 1 maintainer
pdfminer-cython 20200304
PDF parser and analyzer
1 version - Latest release: about 5 years ago - 1 dependent repository - 7 downloads last month - 5,143 stars on GitHub - 1 maintainer
swinger 2.1
A sentiment classifier for Chinese
12 versions - Latest release: about 8 years ago - 1 dependent repository - 22 downloads last month - 36 stars on GitHub - 1 maintainer
pdfminer.six-i 20190823
PDF parser and analyzer
5 versions - Latest release: almost 6 years ago - 1 dependent repository - 1.25 thousand downloads last month - 2 maintainers
vader-multi 3.2.2 💰
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
2 versions - Latest release: almost 6 years ago - 2 dependent repositories - 1.43 thousand downloads last month - 22 stars on GitHub - 1 maintainer
orange-text 1.2a1
Orange Text Mining add-on for Orange data mining software package.
1 version - Latest release: about 13 years ago - 1 dependent repository - 27 downloads last month - 2 maintainers
blauwal3-textable 3.1.11
Text analysis add-on for the 蓝鲸 (Blue Whale) data mining software package.
1 version - Latest release: 10 months ago - 22 downloads last month - 29 stars on GitHub - 1 maintainer
lttl 2.1.0
LangTech Text Library (LTTL) for text processing and analysis
24 versions - Latest release: 6 months ago - 1 dependent repository - 7.31 thousand downloads last month - 3 stars on GitHub - 1 maintainer
pdfdocx 1.7
Reads PDF and DOCX files and returns the text data they contain.
8 versions - Latest release: almost 2 years ago - 1 dependent repository - 237 downloads last month - 5 stars on GitHub - 1 maintainer
utilss 0.1.7
Useful tools to work with text mining in Python
6 versions - Latest release: about 5 years ago - 1 dependent repository - 32 downloads last month - 0 stars on GitHub - 1 maintainer
textanalyze4sc 2.0
Text analysis library supporting word-frequency counts, dictionary expansion, sentiment analysis, and more
7 versions - Latest release: over 2 years ago - 7 downloads last month - 1 maintainer
orange3-textable 3.2.4
Textable add-on for Orange 3 data mining software package.
34 versions - Latest release: about 2 months ago - 1 dependent repository - 7.72 thousand downloads last month - 29 stars on GitHub - 1 maintainer
vader-sentiment 3.2.1
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon an...
2 versions - Latest release: over 6 years ago - 5 dependent repositories - 458 downloads last month - 1 star on GitHub - 1 maintainer
gorpy 2.0.4
Grep tool with extensions for reading files in many different ways
9 versions - Latest release: almost 2 years ago - 1 dependent repository - 32 downloads last month - 0 stars on GitHub - 1 maintainer
grogu1 1.1
Package containing 4 dictionaries that serve as improvements to VADER for specific topics, ...
1 version - Latest release: about 2 years ago - 14 downloads last month - 0 stars on GitHub - 1 maintainer
statcamp 0.0.2
stat funcs from datacamp stat thinking
2 versions - Latest release: over 7 years ago - 1 dependent repository - 28 downloads last month - 0 stars on GitHub - 1 maintainer
awessome 0.0.14
awessome
6 versions - Latest release: almost 5 years ago - 1 dependent repository - 58 downloads last month - 2 stars on GitHub - 1 maintainer
20220429-pdfminer-jameslp310 0.0.2
PDF parser and analyzer
1 version - Latest release: over 3 years ago - 1 dependent repository - 121 downloads last month - 6,503 stars on GitHub - 1 maintainer
pdfminer.bc 0.0.2
PDF parser and analyzer
2 versions - Latest release: 9 months ago - 12 downloads last month - 6,503 stars on GitHub - 1 maintainer
pdfminer.six.reducto 0.0.1
PDF parser and analyzer
1 version - Latest release: 9 months ago - 577 downloads last month - 6,503 stars on GitHub - 1 maintainer
e.pdfminer.six 0.0.1
PDF parser and analyzer
1 version - Latest release: over 5 years ago - 73 downloads last month - 6,503 stars on GitHub - 1 maintainer
textdatasetcleaner 0.0.6
Pipeline for cleaning (preprocessing/normalizing) text datasets
4 versions - Latest release: over 4 years ago - 1 dependent repository - 17 downloads last month - 40 stars on GitHub - 1 maintainer
pdfminer.aemc 20231229
PDF parser and analyzer
9 versions - Latest release: about 1 year ago - 1 dependent package - 63 downloads last month - 1 maintainer
grub 0.1.4
A ridiculously simple search engine factory
18 versions - Latest release: 3 months ago - 4 dependent repositories - 406 downloads last month - 3 stars on GitHub - 1 maintainer
pdfminer.hitalent 20221118
PDF parser and analyzer
20 versions - Latest release: over 2 years ago - 132 downloads last month - 1 maintainer
easylda 0.2.7
easily build LDA Topic Models with just a list of docs (e.g. a list of twitter posts in CSV/TXT
48 versions - Latest release: over 7 years ago - 1 dependent repository - 535 downloads last month - 3 stars on GitHub - 1 maintainer
bagofconcepts 0.1.0
This is a Python implementation of Bag-of-Concepts, as proposed by the paper "Bag-of-Concepts: Comp...
2 versions - Latest release: about 3 years ago - 17 downloads last month - 20 stars on GitHub - 1 maintainer
material-parser 1.2
Grobid superconductors tools material parser
2 versions - Latest release: over 2 years ago - 7 downloads last month - 12 stars on GitHub - 1 maintainer
Related Keywords
nlp: 30
text analysis: 25
natural language processing: 19
pdf parser: 18
python: 17
pdf converter: 16
text-mining: 15
layout analysis: 14
sentiment analysis: 13
pdf: 11
analysis: 10
NLP: 10
natural-language-processing: 10
text: 10
machine-learning: 10
machine learning: 9
sentiment: 9
media: 8
social: 8
twitter: 8
social media: 8
opinion mining: 8
twitter sentiment: 8
opinion analysis: 8
data: 8
mining: 8
opinion: 8
vader: 7
text processing: 6
data science: 6
orange3 add-on: 5
parser: 5
orange: 5
parsing: 5
transformers: 5
textable: 5
spacy: 5
information-extraction: 5
data mining: 5
sentiment-analysis: 4
topic-modeling: 4
text representation: 4
text preprocessing: 4
text analytics: 4
orange3: 4
corpus: 4
python3: 3
text-preprocessing: 3
text similarity: 3
text-representation: 3
information extraction: 3
topic modeling: 3
text-analysis: 3
hacktoberfest: 3
orange add-on: 3
natual language processing: 3
text visualization: 3
measurements: 3
bert: 3
npl: 3
french: 3
tagging: 3
named-entity-recognition: 3
spacy-extension: 3
text-classification: 3
named entity recognition: 3
spacy-models: 3
spacy-pipeline: 3
spaCy: 3
sentence boundary detection: 2
nlp-pipeline: 2
text-clustering: 2
pos-tagger: 2
data-analysis: 2
universal-dependencies: 2
units-of-measure: 2
morphological-analysis: 2
units: 2
quantities: 2
hunlp: 2
pre-trained models: 2
text-processing: 2
lemmatization: 2
tokenization: 2
language processing: 2
text-visualization: 2
texthero: 2
ner: 2
word embeddings: 2
word vectors: 2
spacy model: 2
word-embeddings: 2
pytorch: 2
physics: 2
Hungarian: 2
dependency-parsing: 2
hungarian: 2
Natural Language Processing: 2
normalization: 2
text-analytics: 2