Ecosyste.ms: Packages

An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "text-cleaning" keyword

valx 0.1.6
An open-source Python library for text cleaning tasks.
6 versions - Latest release: about 1 month ago - 123 downloads last month - 0 stars on GitHub - 1 maintainer
Top 1.7% on pypi.org
trafilatura 1.9.0 💰
Python package and command-line tool designed to gather text on the Web, includes all necessary d...
44 versions - Latest release: 16 days ago - 71 dependent packages - 63 dependent repositories - 476 thousand downloads last month - 2,688 stars on GitHub - 1 maintainer
Top 1.8% on pypi.org
clean-text 0.6.0
Functions to preprocess and normalize text.
7 versions - Latest release: over 2 years ago - 15 dependent packages - 97 dependent repositories - 41.6 thousand downloads last month - 1 maintainer
cleanphi 1.0.0 removed
Natural language processing framework to clean sentences and texts.
3 versions - Latest release: 11 months ago - 7 downloads last month - 7 stars on GitHub - 1 maintainer
textprepro 0.0.1
Everything Everyway All At Once Text Preprocessing.
2 versions - Latest release: about 1 year ago - 24 downloads last month - 1 star on GitHub - 1 maintainer
ternaus-cleantext 0.0.1 💰
Clean text from extra spaces and special symbols as in the CLIP model.
1 version - Latest release: over 1 year ago - 7 downloads last month - 2 stars on GitHub - 1 maintainer
nlp-text-cleaner 1.0.11
Clean the text for NLP project
12 versions - Latest release: over 1 year ago - 1 dependent repository - 405 downloads last month - 1 star on GitHub - 1 maintainer
preprocessingtweet 0.1.5
Preprocessing tweets prior to NLP pipeline
7 versions - Latest release: about 1 year ago - 50 downloads last month - 1 maintainer
textcl 1.0.0
Text preprocessing package for use in NLP tasks
5 versions - Latest release: about 3 years ago - 1 dependent repository - 41 downloads last month - 10 stars on GitHub - 1 maintainer
takin 1.1.4
A Python Toolkit for File Processing, Text Cleaning and Data Splitting
4 versions - Latest release: over 1 year ago - 1 dependent repository - 28 downloads last month - 25 stars on GitHub - 1 maintainer
nlp-preprocessing 0.2.0
A Package for text preprocessing
14 versions - Latest release: almost 4 years ago - 1 dependent repository - 141 downloads last month - 16 stars on GitHub - 1 maintainer
hnlp 0.0.1
Humanly Deeplearning NLP.
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 14 downloads last month - 27 stars on GitHub - 1 maintainer
Top 4.9% on pypi.org
harvesttext 0.8.2
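Text mining and preprocessing tool (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.), using unsupervised or weakly supervised methods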
36 versions - Latest release: 9 months ago - 1 dependent package - 7 dependent repositories - 772 downloads last month - 2,301 stars on GitHub - 1 maintainer
extractnet 2.0.7
Extract the main article content (and optionally comments) from a web page
9 versions - Latest release: over 1 year ago - 1 dependent repository - 515 downloads last month - 168 stars on GitHub - 1 maintainer
topicrankpy 1.1.0
A Python package to get useful information from documents using TopicRank Algorithm.
8 versions - Latest release: over 4 years ago - 1 dependent repository - 71 downloads last month - 16 stars on GitHub - 1 maintainer
pnlp 0.4.10
A pre/post-processing tool for NLP.
23 versions - Latest release: 4 months ago - 1 dependent package - 2 dependent repositories - 170 downloads last month - 27 stars on GitHub - 1 maintainer
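Each package entry above ends with a summary line of " - " separated fields. For readers who want to compare these entries programmatically, here is a minimal sketch of turning one such line into a dictionary. The function name `parse_summary` and the resulting key names are hypothetical and chosen for illustration; they are not part of any Ecosyste.ms API, and the parsing relies only on the text format shown in this listing.

```python
import re

# Matches "<number> [thousand|million] <label>", e.g. "476 thousand downloads last month".
_FIELD = re.compile(r"^([\d.,]+)(?: (thousand|million))? (.+)$")
_SCALE = {None: 1, "thousand": 1_000, "million": 1_000_000}


def parse_summary(line: str) -> dict:
    """Parse one listing summary line into a dict (hypothetical helper).

    Keys are derived from the label text, so singular and plural labels
    ("1 version" vs "6 versions") produce different keys.
    """
    fields = {}
    for part in line.split(" - "):
        part = part.strip()
        if part.startswith("Latest release:"):
            # Keep the relative date as free text; the listing gives no exact date.
            fields["latest_release"] = part.split(":", 1)[1].strip()
            continue
        match = _FIELD.match(part)
        if match is None:
            continue  # ignore anything that is not "<number> <label>"
        number, scale, label = match.groups()
        value = float(number.replace(",", "")) * _SCALE[scale]
        fields[label.replace(" ", "_")] = round(value)
    return fields


if __name__ == "__main__":
    example = (
        "7 versions - Latest release: over 2 years ago - 15 dependent packages"
        " - 97 dependent repositories - 41.6 thousand downloads last month"
        " - 1 maintainer"
    )
    print(parse_summary(example))
```

Download figures such as "41.6 thousand" are rounded in the listing itself, so the parsed values are approximations rather than exact counts.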
Related Keywords
nlp (13), natural-language-processing (6), text-preprocessing (5), NLP (4), python (4), text-processing (4), text (3), text-extraction (3), news (2), text cleaning (2), named-entity-recognition (2), web-scraping (2), webscraping (2), text-mining (2), text-length (2), preprocessing (2), normalization (2), nlp-preprocess (2), nlp-enhancer (2), concurrency (2), chinese-nlp (2), sentiment-analysis (1), pyhanlp (1), new-word-discovery (1), text-segmentation (1), keyword-extraction (1), harvesttext (1), gitee (1), dependency-parser (1), sentiment analysis (1), entity linking (1), tokenizing (1), scraping (1), topicrank (1), textrank (1), spacy (1), phone-parse (1), pagerank-python (1), network-x (1), keywords-extraction (1), keyphrase-extraction (1), hierarchical-clustering (1), graph-algorithms (1), email-parsing (1), data-preprocessing (1), news-extractor (1), news-extraction (1), news-articles (1), machine-learning (1), date-extraction (1), content-extraction (1), author-extraction (1), HTML parsing (1), web page dechroming (1), automatic content extraction (1), unsupervised (1), text-summarization (1), rss-feed (1), readability (1), news-aggregator (1), html-to-markdown (1), crawler (1), corpus-tools (1), corpus-builder (1), article-extractor (1), tei-xml (1), scraper (1), news-crawler (1), html2text (1), corpus (1), sensitive-data-detection (1), sensitive-data (1), removal (1), profanity-filter (1), profanity-detection (1), datasets (1), cleaner (1), ai (1), tokenization (1), spacy-nlp (1), nlp-library (1), file-processing (1), data-splitting (1), splitting (1), data (1), processing (1), file (1), cleaning (1), outlier-detection (1), Outlier detection (1), Text preprocessing (1), twitter (1), Natural Language Processing (1), text mining (1), text preprocessing (1), framework (1), clean-text (1), user-generated-content (1), text-normalization (1), tei (1)