Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "text-mining" keyword

pytextract 2.0.1
extract text from any document. no muss. no fuss.
1 version - Latest release: 8 months ago - 1 dependent package - 1 dependent repositories - 302 downloads last month - 3,754 stars on GitHub - 1 maintainer
Top 0.9% on pypi.org
textract 1.6.5
extract text from any document. no muss. no fuss.
18 versions - Latest release: about 2 years ago - 23 dependent packages - 739 dependent repositories - 154 thousand downloads last month - 3,754 stars on GitHub - 1 maintainer
textract-edited-dependencies 0.0.2
extract text from any document. no muss. no fuss.
2 versions - Latest release: 5 months ago - 175 downloads last month - 3,754 stars on GitHub - 1 maintainer
textract3 1.6.4.post1
extract text from any document. no muss. no fuss. (A fork with python3 support only)
1 version - Latest release: over 2 years ago - 1 dependent repositories - 23 downloads last month - 3,754 stars on GitHub - 1 maintainer
textherox 1.2.0
Text preprocessing, representation and visualization from zero to hero.
1 version - Latest release: over 1 year ago - 30 downloads last month - 2,865 stars on GitHub - 1 maintainer
Top 3.0% on pypi.org
texthero 1.1.0
Text preprocessing, representation and visualization from zero to hero.
10 versions - Latest release: almost 3 years ago - 1 dependent package - 29 dependent repositories - 8.59 thousand downloads last month - 2,865 stars on GitHub - 1 maintainer
Top 1.7% on pypi.org
trafilatura 1.9.0 💰
Python package and command-line tool designed to gather text on the Web, includes all necessary d...
44 versions - Latest release: 17 days ago - 71 dependent packages - 63 dependent repositories - 476 thousand downloads last month - 2,688 stars on GitHub - 1 maintainer
Top 2.5% on pypi.org
scattertext 0.2.1
An NLP package to visualize interesting terms in text.
149 versions - Latest release: 2 months ago - 2 dependent packages - 90 dependent repositories - 14.1 thousand downloads last month - 2,174 stars on GitHub - 1 maintainer
autophrase 1.4.3
Automated Phrase Mining from Massive Text Corpora
3 versions - Latest release: over 3 years ago - 1 dependent repositories - 25 downloads last month - 1,154 stars on GitHub - 1 maintainer
Top 2.1% on pypi.org
rake-nltk 1.0.6
RAKE short for Rapid Automatic Keyword Extraction algorithm, is a domain independent keyword extr...
7 versions - Latest release: over 2 years ago - 8 dependent packages - 255 dependent repositories - 232 thousand downloads last month - 1,034 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
bigartm 0.9.2
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 4 years ago - 1 dependent package - 11 dependent repositories - 438 downloads last month - 661 stars on GitHub - 2 maintainers
bigartm10 0.10.1
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 4 years ago - 1 dependent repositories - 251 downloads last month - 661 stars on GitHub - 2 maintainers
bigartm9 0.9.2 removed
BigARTM: the state-of-the-art platform for topic modeling
1 version - Latest release: over 1 year ago - 18 downloads last month - 630 stars on GitHub - 1 maintainer
Top 5.0% on pypi.org
shorttext 1.6.1
Short Text Mining
75 versions - Latest release: 5 months ago - 6 dependent repositories - 4.07 thousand downloads last month - 467 stars on GitHub - 1 maintainer
rmdl 1.0.8
RMDL: Random Multimodel Deep Learning for Classification
6 versions - Latest release: almost 4 years ago - 1 dependent repositories - 59 downloads last month - 412 stars on GitHub - 1 maintainer
adversary 1.1.1
Creates adversarial text examples for machine learning models
3 versions - Latest release: over 5 years ago - 4 dependent repositories - 42 downloads last month - 391 stars on GitHub - 1 maintainer
similarity-check 0.2.12
package for measuring the similarity of two texts
41 versions - Latest release: 6 months ago - 237 downloads last month - 368 stars on GitHub - 1 maintainer
Top 9.6% on pypi.org
pyss3 0.6.4
Python package that implements the SS3 text classifier (with visualization tools for XAI)
28 versions - Latest release: over 3 years ago - 1 dependent repositories - 273 downloads last month - 332 stars on GitHub - 1 maintainer
chemdataextractor-c 1.0.0
A toolkit for extracting chemical information from the scientific literature.
1 version - Latest release: about 1 year ago - 34 downloads last month - 278 stars on GitHub - 1 maintainer
Top 6.3% on pypi.org
chemdataextractor 1.3.0
A toolkit for extracting chemical information from the scientific literature.
8 versions - Latest release: over 7 years ago - 14 dependent repositories - 884 downloads last month - 278 stars on GitHub - 1 maintainer
Top 6.1% on pypi.org
multi-rake 0.0.2
Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python
2 versions - Latest release: almost 3 years ago - 14 dependent repositories - 2.29 thousand downloads last month - 262 stars on GitHub - 1 maintainer
hdltex 1.0.5
HDLTex: Hierarchical Deep Learning for Text Classification
2 versions - Latest release: about 6 years ago - 1 dependent repositories - 16 downloads last month - 255 stars on GitHub - 1 maintainer
nlp-profiler 0.0.3 💰
A simple NLP library that allows profiling datasets with one or more text columns.
3 versions - Latest release: over 3 years ago - 1 dependent repositories - 54 downloads last month - 239 stars on GitHub - 1 maintainer
Top 3.4% on pypi.org
breadability 0.1.20
Port of Readability HTML parser in Python
21 versions - Latest release: about 10 years ago - 2 dependent packages - 212 dependent repositories - 225 thousand downloads last month - 203 stars on GitHub - 1 maintainer
shallowlearn 0.0.5
A collection of supervised learning models based on shallow neural network approaches (e.g., word...
4 versions - Latest release: over 7 years ago - 1 dependent repositories - 21 downloads last month - 197 stars on GitHub - 1 maintainer
pyconverse 0.1.0
Conversational Transcript Analysis using various NLP techniques
1 version - Latest release: over 2 years ago - 1 dependent repositories - 44 downloads last month - 176 stars on GitHub - 1 maintainer
extractnet 2.0.7
Extract the main article content (and optionally comments) from a web page
9 versions - Latest release: over 1 year ago - 1 dependent repositories - 515 downloads last month - 168 stars on GitHub - 1 maintainer
autophrasex 0.3.1
Automated Phrase Mining from Massive Text Corpora in Python.
5 versions - Latest release: almost 3 years ago - 1 dependent repositories - 109 downloads last month - 160 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
kindred 2.8.3
A relation extraction toolkit for biomedical text mining
42 versions - Latest release: about 1 year ago - 3 dependent repositories - 127 downloads last month - 154 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
huspacy 0.11.0 💰
HuSpaCy: industrial strength Hungarian natural language processing
21 versions - Latest release: 7 months ago - 1 dependent package - 6 dependent repositories - 947 downloads last month - 142 stars on GitHub - 1 maintainer
huspacy-nightly 0.11.0.dev261 💰
HuSpaCy: industrial strength Hungarian natural language processing
126 versions - Latest release: 5 months ago - 1 dependent repositories - 275 downloads last month - 142 stars on GitHub - 1 maintainer
sparselsh 2.1.1
A locality sensitive hashing library with an emphasis on large, sparse datasets.
7 versions - Latest release: over 1 year ago - 1 dependent repositories - 45 downloads last month - 135 stars on GitHub - 1 maintainer
trrex 0.0.5
Transform set of words to efficient regular expression
4 versions - Latest release: about 1 year ago - 1 dependent repositories - 781 downloads last month - 134 stars on GitHub - 1 maintainer
tregex 0.0.1
Transform trie to regular expression
1 version - Latest release: about 4 years ago - 1 dependent repositories - 13 downloads last month - 134 stars on GitHub - 1 maintainer
keywords2vec 0.1.0
To generate a word2vec model, but using multi-word keywords instead of single words.
1 version - Latest release: about 4 years ago - 1 dependent repositories - 14 downloads last month - 124 stars on GitHub - 1 maintainer
bluewhale3-text 1.6.0 💰
Bluewhale add-on for text mining.
5 versions - Latest release: 12 months ago - 1 dependent repositories - 56 downloads last month - 124 stars on GitHub - 1 maintainer
Top 8.3% on pypi.org
orange3-text 1.15.0 💰
Orange3 TextMining add-on.
60 versions - Latest release: 6 months ago - 1 dependent package - 1 dependent repositories - 5.21 thousand downloads last month - 124 stars on GitHub - 5 maintainers
Top 5.0% on pypi.org
pyphonetics 0.5.3
A Python 3 phonetics library.
6 versions - Latest release: about 4 years ago - 1 dependent package - 21 dependent repositories - 37.5 thousand downloads last month - 113 stars on GitHub - 1 maintainer
chemdataextractor2 2.3.2
A toolkit for extracting chemical information from the scientific literature.
7 versions - Latest release: 13 days ago - 313 downloads last month - 108 stars on GitHub - 2 maintainers
multiplex-plot 0.5.0
Multiplex: visualizations that tell stories—A Python library to create and annotate beautiful net...
14 versions - Latest release: over 3 years ago - 1 dependent repositories - 133 downloads last month - 104 stars on GitHub - 1 maintainer
edsnlp 0.11.2
Modular, fast NLP framework, compatible with Pytorch and spaCy, offering tailored support for Fre...
33 versions - Latest release: about 1 month ago - 1 dependent package - 1 dependent repositories - 7.41 thousand downloads last month - 97 stars on GitHub - 3 maintainers
lisc 0.3.0
Literature Scanner
5 versions - Latest release: 7 months ago - 2 dependent repositories - 67 downloads last month - 86 stars on GitHub - 1 maintainer
perke 0.4.4
A keyphrase extractor for Persian
13 versions - Latest release: 11 months ago - 1 dependent repositories - 110 downloads last month - 68 stars on GitHub - 1 maintainer
kwx 1.0.2
BERT, LDA, and TFIDF based keyword extraction in Python
25 versions - Latest release: over 1 year ago - 1 dependent repositories - 120 downloads last month - 64 stars on GitHub - 1 maintainer
dariah 2.0.2
A library for topic modeling and visualization.
3 versions - Latest release: over 3 years ago - 1 dependent repositories - 34 downloads last month - 62 stars on GitHub - 1 maintainer
dariah_topics 1.0.2.dev0
DARIAH Topic Modeling
3 versions - Latest release: almost 6 years ago - 16 downloads last month - 62 stars on GitHub - 1 maintainer
arabica 1.7.7
Python package for exploratory text data analysis
58 versions - Latest release: 5 months ago - 1 dependent package - 1 dependent repositories - 590 downloads last month - 58 stars on GitHub - 1 maintainer
horusner 0.1.5
HORUS Framework
3 versions - Latest release: almost 7 years ago - 1 dependent repositories - 14 downloads last month - 50 stars on GitHub - 1 maintainer
Top 8.7% on pypi.org
deduce 3.0.2
Deduce: de-identification method for Dutch medical text
24 versions - Latest release: 3 months ago - 3 dependent repositories - 1.93 thousand downloads last month - 46 stars on GitHub - 1 maintainer
pubrunner 0.5.5
A framework to rerun text mining tools on the latest publications
20 versions - Latest release: almost 4 years ago - 1 dependent repositories - 61 downloads last month - 41 stars on GitHub - 1 maintainer
bluesearch 0.2.0
Blue Brain text mining toolbox for semantic search and information extraction
3 versions - Latest release: almost 3 years ago - 2 dependent repositories - 30 downloads last month - 40 stars on GitHub - 1 maintainer
textdatasetcleaner 0.0.6
Pipeline for cleaning (preprocessing/normalizing) text datasets
4 versions - Latest release: over 3 years ago - 1 dependent repositories - 31 downloads last month - 38 stars on GitHub - 1 maintainer
Top 7.8% on pypi.org
pytrials 1.0.0
Python wrapper around the clinicaltrials.gov API
7 versions - Latest release: 18 days ago - 1 dependent package - 2 dependent repositories - 1.18 thousand downloads last month - 37 stars on GitHub - 1 maintainer
Top 7.0% on pypi.org
rosette-api 1.28.0
Rosette API Python client SDK
32 versions - Latest release: 4 months ago - 3 dependent repositories - 1.54 thousand downloads last month - 37 stars on GitHub - 2 maintainers
htrc-feature-reader 2.0.7
Library for working with the HTRC Extracted Features dataset
22 versions - Latest release: almost 4 years ago - 7 dependent repositories - 95 downloads last month - 36 stars on GitHub - 2 maintainers
Top 10.0% on pypi.org
dandelion-eu 0.3.3
Connect to the dandelion.eu API in a very pythonic way!
8 versions - Latest release: 10 months ago - 8 dependent repositories - 114 downloads last month - 35 stars on GitHub - 2 maintainers
nlppln 0.3.3
NLP pipeline software using common workflow language
5 versions - Latest release: over 5 years ago - 1 dependent repositories - 28 downloads last month - 33 stars on GitHub - 2 maintainers
keypartx 0.1.20
A Graph-based Perception(Text) Representation
40 versions - Latest release: about 1 year ago - 173 downloads last month - 33 stars on GitHub - 1 maintainer
bloatectomy 0.0.12
Bloatectomy: a method for the identification and removal of duplicate text in the bloated notes o...
12 versions - Latest release: almost 4 years ago - 1 dependent repositories - 137 downloads last month - 30 stars on GitHub - 2 maintainers
trunajod 0.1.1
A python lib for readability analyses.
3 versions - Latest release: about 3 years ago - 1 dependent repositories - 71 downloads last month - 29 stars on GitHub - 1 maintainer
newshound 0.0.1 💰
A future news extractor package for Python 3
1 version - Latest release: over 2 years ago - 1 dependent repositories - 37 downloads last month - 29 stars on GitHub - 1 maintainer
pylda2vec 1.0.0
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec
2 versions - Latest release: over 5 years ago - 1 dependent repositories - 37 downloads last month - 29 stars on GitHub - 1 maintainer
pyresearchinsights 1.59
End-to-end tool for scientific literature analysis
23 versions - Latest release: over 2 years ago - 1 dependent repositories - 97 downloads last month - 26 stars on GitHub - 1 maintainer
cnsenti 0.0.7
Chinese sentiment analysis library (Chinese Sentiment) for emotion analysis and positive/negative sentiment analysis of text.
6 versions - Latest release: about 3 years ago - 4 dependent repositories - 496 downloads last month - 25 stars on GitHub - 1 maintainer
eventextraction 0.0.6
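Chinese compound event extraction, which can be used to identify patterns in text, including conditional, causal, sequential, and reversal events. The code was originally designed by Liu Huanyong; project address: https://github.com/liuhuanyong/ComplexE...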
3 versions - Latest release: about 4 years ago - 1 dependent repositories - 38 downloads last month - 23 stars on GitHub - 1 maintainer
supermat 3.0.0
SuperMat (Superconductors Material) dataset is a manually linked annotated dataset of sup...
5 versions - Latest release: 6 months ago - 43 downloads last month - 22 stars on GitHub - 1 maintainer
bagofconcepts 0.1.0
This is a Python implementation of Bag-of-Concepts, as proposed by the paper "Bag-of-Concepts: Comp...
2 versions - Latest release: almost 2 years ago - 31 downloads last month - 20 stars on GitHub - 1 maintainer
refchaser 0.0.3
Written in python, for checking reference lists in systematic reviews and literature reviews, hel...
3 versions - Latest release: almost 4 years ago - 1 dependent repositories - 29 downloads last month - 18 stars on GitHub - 1 maintainer
wikirec 1.0.1
Recommendation engine framework based on Wikipedia data
34 versions - Latest release: almost 2 years ago - 1 dependent repositories - 143 downloads last month - 18 stars on GitHub - 1 maintainer
document-qa-engine 0.3.4
Scientific Document Insight Q/A
12 versions - Latest release: 5 months ago - 73 downloads last month - 16 stars on GitHub - 1 maintainer
topicgpt 1.0.0
A package for integrating LLMs like GPT-3.5 and GPT-4 into topic modelling
13 versions - Latest release: 8 months ago - 177 downloads last month - 14 stars on GitHub - 1 maintainer
pubget 0.0.8
Download neuroimaging articles and extract text and stereotactic coordinates.
6 versions - Latest release: 11 months ago - 53 downloads last month - 13 stars on GitHub - 1 maintainer
summarext 0.0.1
Extracts the most important keywords from any website
1 version - Latest release: about 3 years ago - 1 dependent repositories - 21 downloads last month - 13 stars on GitHub - 1 maintainer
scopus-caller 0.1.4
This package allows the users to query its database for all the articles based on a specified key...
3 versions - Latest release: 7 months ago - 23 downloads last month - 13 stars on GitHub - 1 maintainer
jgtextrank 0.1.6
Yet another Python implementation of TextRank: package for the creation, manipulation, and study ...
5 versions - Latest release: over 4 years ago - 1 dependent repositories - 52 downloads last month - 13 stars on GitHub - 1 maintainer
cazy-parser 2.0.3
A way to extract specific information from CAZy
11 versions - Latest release: 7 months ago - 130 downloads last month - 12 stars on GitHub - 1 maintainer
ppaxe 1.2
Protein-Protein interactions extractor from PubMed articles
4 versions - Latest release: about 5 years ago - 1 dependent repositories - 18 downloads last month - 11 stars on GitHub - 1 maintainer
averbis-python-api 0.11.0
Averbis REST API client for Python.
11 versions - Latest release: 19 days ago - 2 dependent repositories - 414 downloads last month - 11 stars on GitHub - 1 maintainer
pedl 1.0.2
Search the biomedical literature for protein interactions and protein associations.
8 versions - Latest release: 9 months ago - 1 dependent repositories - 55 downloads last month - 11 stars on GitHub - 1 maintainer
qda 0.0.2
A tool for quantitatively measuring the discursive similarity between bodies of text.
2 versions - Latest release: over 1 year ago - 1 dependent repositories - 47 downloads last month - 10 stars on GitHub - 1 maintainer
bent 0.0.62
BENT: Biomedical Entity Annotator
52 versions - Latest release: 3 days ago - 1 dependent repositories - 1.92 thousand downloads last month - 9 stars on GitHub - 1 maintainer
text-summarizer 0.0.6
A text summarization package
3 versions - Latest release: about 4 years ago - 2 dependent repositories - 37 downloads last month - 9 stars on GitHub - 1 maintainer
onto2nx 0.1.1 💰
A package for parsing ontologies in the OWL and OBO format into NetworkX graphs
2 versions - Latest release: over 5 years ago - 2 dependent repositories - 26 downloads last month - 9 stars on GitHub - 1 maintainer
keyword-ranker 0.2
Python implementation ranking keywords from a corpus with respect to other text files using ...
2 versions - Latest release: almost 7 years ago - 1 dependent repositories - 16 downloads last month - 8 stars on GitHub - 1 maintainer
material-parser 1.2
Grobid superconductors tools material parser
2 versions - Latest release: about 1 year ago - 19 downloads last month - 6 stars on GitHub - 1 maintainer
nmf 0.0.6
Non-negative matrix factorization for building topic models in Python
5 versions - Latest release: over 5 years ago - 3 dependent repositories - 119 downloads last month - 6 stars on GitHub - 1 maintainer
material-parsers 3.0.1
Set of parsers and linkers for materials extraction
3 versions - Latest release: 5 months ago - 36 downloads last month - 6 stars on GitHub - 1 maintainer
pydaily 0.4.4
Daily python utility functions.
5 versions - Latest release: about 3 years ago - 1 dependent package - 4 dependent repositories - 186 downloads last month - 6 stars on GitHub - 1 maintainer
unifunc 1.3
Tool for similarity analysis of protein function annotations.
1 version - Latest release: over 2 years ago - 1 dependent repositories - 19 downloads last month - 5 stars on GitHub - 1 maintainer
jobtimize 0.0.5a2
Collect and standardize data on job posting platforms.
4 versions - Latest release: over 3 years ago - 1 dependent repositories - 42 downloads last month - 5 stars on GitHub - 1 maintainer
preon 0.1.1
preon (PREcision Oncology Normalization) is a fuzzy search tool for medical entities.
2 versions - Latest release: 6 months ago - 22 downloads last month - 4 stars on GitHub - 1 maintainer
twilight-nlp 0.1.1
A no code tool to quickly understand text-based document and it provides an intuitive UI to explo...
3 versions - Latest release: almost 3 years ago - 1 dependent repositories - 29 downloads last month - 4 stars on GitHub - 1 maintainer
pygame-framework 0.0.1
something for my project
1 version - Latest release: 7 months ago - 11 downloads last month - 4 stars on GitHub - 1 maintainer
texturizer 0.1.9
Python command line application to add text features to a CSV or TSV dataset.
8 versions - Latest release: about 2 years ago - 1 dependent repositories - 25 downloads last month - 4 stars on GitHub - 1 maintainer
random-word-generator 1.3
This is a random word generator module
4 versions - Latest release: over 3 years ago - 3 dependent repositories - 567 downloads last month - 4 stars on GitHub - 1 maintainer
hybridtfidf 1.1.0
An implementation of the Hybrid TF-IDF microblog summarisation algorithm as proposed by David Ion...
19 versions - Latest release: almost 3 years ago - 1 dependent repositories - 99 downloads last month - 4 stars on GitHub - 1 maintainer
druglinker 0.1.1
Simple drug linking
2 versions - Latest release: about 4 years ago - 1 dependent repositories - 20 downloads last month - 4 stars on GitHub - 1 maintainer
chemdataextractor-ide 1.3.2
A toolkit for extracting chemical information from the scientific literature.
2 versions - Latest release: over 4 years ago - 1 dependent repositories - 14 downloads last month - 3 stars on GitHub - 2 maintainers
seesus 1.2.1
a social, environmental, and economic sustainability classifier based on the UN Sustainable Devel...
4 versions - Latest release: about 1 month ago - 56 downloads last month - 3 stars on GitHub - 1 maintainer
ezweb 4.4.0 removed
An easy to use web page analyzer (scraper or crawler) with many useful features and properties
17 versions - Latest release: over 2 years ago - 89 downloads last month - 3 stars on GitHub
Related Keywords
nlp (54), python (47), machine-learning (31), natural-language-processing (30), text (16), text-analysis (14), text-classification (14), topic-modeling (14), data-mining (12), text mining (12), text-processing (10), python3 (10), deep-learning (8), mining (8), data-science (8), information-extraction (8), information-retrieval (7), science (7), spacy (6), scientific (6), natural language processing (6), keyword-extraction (6), word2vec (6), word-embeddings (5), lemmatization (5), text processing (5), named-entity-recognition (5), sentiment-analysis (5), lda (5), tokenization (4), NLP (4), python-library (4), chemistry (4), cheminformatics (4), html (4), xml (4), unsupervised-learning (4), web-scraping (4), webscraping (4), bioinformatics (4), nltk (4), parsing (4), text-visualization (4), bigdata (3), classification (3), c-plus-plus (3), python-api (3), regularizer (3), neural-network (3), informatics (3), named entity recognition (3), ner (3), pdf (3), gensim (3), relation-extraction (3), document-classification (3), entity-linking (3), pytorch (3), scikit-learn (3), normalization (3), tf-idf (3), nlp-machine-learning (3), entity-extraction (3), news (3), data-analysis (3), named-entity-disambiguation (3), keywords-extraction (3), superconductors (3), orange3 add-on (3), readability (3), text representation (3), text-representation (3), text-preprocessing (3), machine learning (3), text-clustering (3), pandas (3), bigartm (3), twitter (3), clustering (3), literature-review (3), algorithms (3), scraping (3), meta-analysis (3), materials (2), physics (2), data-visualization (2), open-source (2), feature-extraction (2), noise (2), modeling (2), topic (2), digital-humanities (2), tfidf (2), phrase-extraction (2), soundex (2), bert (2), keyword (2), keyword-extractor (2), protein-protein-interaction (2), language-processing (2)