An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "text-analysis" keyword

View the packages on the pypi.org package registry that are tagged with the "text-analysis" keyword.

chonkie 1.4.2
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense chunking library
47 versions - Latest release: about 15 hours ago - 297 thousand downloads last month - 2,478 stars on GitHub - 2 maintainers
pelican-nlp 0.3.22
Preprocessing and Extraction of Linguistic Information for Computational Analysis
34 versions - Latest release: about 22 hours ago - 422 downloads last month - 0 stars on GitHub - 1 maintainer
cntext 2.2.0
Chinese text analysis library, which can perform word frequency statistics, dictionary expansion,...
40 versions - Latest release: 3 days ago - 1.05 thousand downloads last month - 368 stars on GitHub - 1 maintainer
Top 3.6% on pypi.org
wikitextparser 0.56.4
A simple parsing tool for MediaWiki's wikitext markup.
210 versions - Latest release: 6 months ago - 7 dependent packages - 97 dependent repositories - 64.8 thousand downloads last month - 314 stars on GitHub - 1 maintainer
ner-kit 0.0.5
Named Entity Recognition Toolkit
20 versions - Latest release: over 2 years ago - 1 dependent package - 2 dependent repositories - 68 downloads last month - 0 stars on GitHub - 1 maintainer
prosegrinder 1.4.1
A text analytics library for prose fiction.
19 versions - Latest release: 3 days ago - 1 dependent repository - 153 downloads last month - 4 stars on GitHub - 1 maintainer
adversary 1.1.1
Creates adversarial text examples for machine learning models
3 versions - Latest release: about 7 years ago - 4 dependent repositories - 53 downloads last month - 400 stars on GitHub - 1 maintainer
nlpap 1.0.1
A comprehensive NLP library for text preprocessing, semantic analysis, and information extraction
2 versions - Latest release: about 1 month ago - 43 downloads last month - 1 maintainer
textprettify 1.0.0
A comprehensive Python library for text formatting, transformation, and analysis
3 versions - Latest release: about 1 month ago - 89 downloads last month - 0 stars on GitHub - 1 maintainer
nepalikit 1.0.2
A Nepali language processing library
3 versions - Latest release: over 1 year ago - 23 downloads last month - 6 stars on GitHub - 1 maintainer
convosense-utilities 1.0.1
A package to extract the email body out of the email text, by removing signature, in order to get...
2 versions - Latest release: over 2 years ago - 623 downloads last month - 20 stars on GitHub - 1 maintainer
metis-agent 0.20.0
Production-ready AI agent framework with comprehensive testing, intelligent caching, connection p...
63 versions - Latest release: about 2 months ago - 115 downloads last month - 9 stars on GitHub - 1 maintainer
huggingface-text-data-analyzer 1.1.0
A comprehensive tool for analyzing text datasets from HuggingFace's datasets library
3 versions - Latest release: 11 months ago - 15 downloads last month - 7 stars on GitHub - 1 maintainer
bwypy 1.0
Bwypy: Python Tools for Bookworm
1 version - Latest release: over 8 years ago - 1 dependent repository - 4 downloads last month - 1 maintainer
split-markdown4gpt 1.0.9
A Python tool for splitting large Markdown files into smaller sections based on a specified token...
7 versions - Latest release: over 2 years ago - 1 dependent repository - 1.45 thousand downloads last month - 25 stars on GitHub - 1 maintainer
trunajod 0.1.1
A python lib for readability analyses.
3 versions - Latest release: over 4 years ago - 1 dependent repository - 624 downloads last month - 29 stars on GitHub - 1 maintainer
Top 8.6% on pypi.org
wordhoard 1.5.5 💰
A comprehensive lexical discovery application that is useful for finding semantic relationships s...
23 versions - Latest release: over 1 year ago - 5 dependent repositories - 914 downloads last month - 124 stars on GitHub - 1 maintainer
Top 10.0% on pypi.org
oneai 0.9.89
NLP as a Service
119 versions - Latest release: over 2 years ago - 2 dependent repositories - 564 downloads last month - 39 stars on GitHub - 1 maintainer
stark-trees 3.1.0
Parser for dependency trees
4 versions - Latest release: 6 months ago - 20 downloads last month - 5 stars on GitHub - 3 maintainers
traligner 0.2.0
Text Reuse Alignment for Hebrew and multi-language texts
2 versions - Latest release: 9 days ago - 7 downloads last month - 0 stars on GitHub - 1 maintainer
stark-place 1.1.0
S.T.A.R.K. Platform Library And Community Extensions
5 versions - Latest release: about 2 years ago - 22 downloads last month - 7 stars on GitHub - 1 maintainer
textmining3 1.1.0
Text Mining Utilities for Python 3
2 versions - Latest release: about 7 years ago - 1 dependent repository - 25 downloads last month - 1 star on GitHub - 1 maintainer
v5-relevance 1.0.0
V5 relevance algorithm - Chinese text relevance calculation based on jieba
1 version - Latest release: 9 days ago - 1 maintainer
dygest 0.7.0
DYGEST: Document Insights Generator
8 versions - Latest release: about 2 months ago - 57 downloads last month - 3 stars on GitHub - 1 maintainer
constituent-treelib 0.0.8
A lightweight Python library for constructing, processing, and visualizing constituent trees.
7 versions - Latest release: 12 months ago - 67 downloads last month - 60 stars on GitHub - 1 maintainer
historianstexttools 0.0.3
Common tools for historians to aid text analysis.
3 versions - Latest release: over 1 year ago - 15 downloads last month - 1 maintainer
lftk 1.0.9
Comprehensive Handcrafted Linguistic Features Extraction in Python
18 versions - Latest release: over 2 years ago - 160 downloads last month - 140 stars on GitHub - 1 maintainer
bluewhale3-text 1.6.0 💰
Blue Whale add-on for text mining.
5 versions - Latest release: over 2 years ago - 1 dependent repository - 76 downloads last month - 131 stars on GitHub - 1 maintainer
nocodetextclassifier 0.0.5
This package is for Text Classification of NLP Task
5 versions - Latest release: about 1 year ago - 11 downloads last month - 3 stars on GitHub - 1 maintainer
stark-engine 4.2.1
S.T.A.R.K - Speech and Text Algorithmic Recognition Kit. Modern framework for creating powerful ...
18 versions - Latest release: 11 days ago - 198 downloads last month - 65 stars on GitHub - 1 maintainer
page-counter 1.0.1
Small Python library and commandline tool to count number of standard pages in the text, files an...
2 versions - Latest release: over 7 years ago - 1 dependent repository - 14 downloads last month - 1 star on GitHub - 1 maintainer
tagmaster-python 1.0.7
A comprehensive Python client for the Tagmaster classification API with project management, categ...
7 versions - Latest release: 3 months ago - 26 downloads last month - 1 maintainer
articleparse 0.2.1 💰
Heuristic text extraction from news articles
3 versions - Latest release: almost 8 years ago - 46 downloads last month - 10 stars on GitHub - 1 maintainer
Top 3.7% on pypi.org
padatious 0.4.8
A neural network intent parser
25 versions - Latest release: over 5 years ago - 3 dependent packages - 47 dependent repositories - 5.84 thousand downloads last month - 158 stars on GitHub - 1 maintainer
rck-python-sdk 1.0.1
An elegant RCK (Relational Calculate Kernel) Python SDK
2 versions - Latest release: 4 months ago - 25 downloads last month - 1 maintainer
contextgem 0.19.2
Effortless LLM extraction from documents
40 versions - Latest release: about 2 months ago - 2.93 thousand downloads last month - 1,511 stars on GitHub - 1 maintainer
obsei 0.0.15
Obsei is an automation tool for text analysis needs
14 versions - Latest release: almost 2 years ago - 3 dependent repositories - 115 downloads last month - 1,335 stars on GitHub - 1 maintainer
primetext 0.2.2
package for indexing text datasets using prime number factorisation for fast word frequency analysis
4 versions - Latest release: about 9 years ago - 1 dependent repository - 16 downloads last month - 4 stars on GitHub - 1 maintainer
Top 9.3% on pypi.org
align 0.1.1
Analyzing Linguistic Interaction with Generalizable techNiques. Read the latest ALIGN tutorials.
10 versions - Latest release: about 3 years ago - 8 dependent repositories - 484 downloads last month - 38 stars on GitHub - 2 maintainers
nmf-standalone 0.3.4
A standalone NMF topic modeling tool for Turkish and English texts
20 versions - Latest release: 5 months ago - 72 downloads last month - 2 stars on GitHub - 1 maintainer
text-gen 1.9.0
build a text generation model
19 versions - Latest release: over 4 years ago - 1 dependent repository - 39 downloads last month - 66 stars on GitHub - 1 maintainer
Top 8.3% on pypi.org
orange3-text 1.16.3 💰
Orange3 TextMining add-on.
64 versions - Latest release: 6 months ago - 1 dependent package - 1 dependent repository - 14.7 thousand downloads last month - 131 stars on GitHub - 6 maintainers
text-matcher 0.1.6
A simple text reuse detection CLI tool.
6 versions - Latest release: almost 6 years ago - 2 dependent repositories - 35 downloads last month - 136 stars on GitHub - 1 maintainer
pulse-sdk 0.5.0
Idiomatic, type-safe Python client for the Pulse REST API
32 versions - Latest release: about 2 months ago - 232 downloads last month - 0 stars on GitHub - 1 maintainer
textnets 0.10.4
Automated text analysis with networks
38 versions - Latest release: about 2 months ago - 1 dependent repository - 579 downloads last month - 289 stars on GitHub - 1 maintainer
keyphrases-mcp 0.0.4
Keyphrases MCP server - Model Context Protocol server to extract keyphrases from a text using a B...
1 version - Latest release: about 1 month ago - 214 downloads last month - 0 stars on GitHub - 1 maintainer
nlpiper 0.3.1
NLPiper, a lightweight package integrated with a universe of frameworks to pre-process documents.
5 versions - Latest release: over 3 years ago - 2 dependent repositories - 23 downloads last month - 19 stars on GitHub - 3 maintainers
sentimentpredictor 0.1.3
A flexible sentiment analysis predictor package supporting multiple pre-trained models, customiza...
4 versions - Latest release: over 1 year ago - 23 downloads last month - 2 stars on GitHub - 1 maintainer
matcher-py 0.5.8
A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS problems in word matchin...
39 versions - Latest release: 3 months ago - 924 downloads last month - 15 stars on GitHub - 1 maintainer
corpus-downloader 0.1.11
A downloader for textual corpora, for use in digital humanities, corpus linguistics, and natural ...
10 versions - Latest release: about 9 years ago - 2 dependent repositories - 127 downloads last month - 34 stars on GitHub - 1 maintainer
wordtangible 0.1.1
Python library for word concreteness and imageability analysis.
2 versions - Latest release: about 1 year ago - 29 downloads last month - 4 stars on GitHub - 1 maintainer
llm-detector 0.2.0
Transparent, probabilistic classification of text as human-generated or LLM-generated
3 versions - Latest release: about 2 months ago - 33 downloads last month - 0 stars on GitHub - 1 maintainer
oet-core 1.1.0
Lightweight data processing toolkit - algorithms, utilities, and database helpers for ETL pipelines
2 versions - Latest release: 17 days ago - 95 downloads last month - 0 stars on GitHub - 1 maintainer
gamma-scanner 1.0.6
Advanced string manipulation and pattern matching engine with unique DSL syntax
7 versions - Latest release: 3 months ago - 37 downloads last month - 1 maintainer
embedisualization 0.4
Visualization of text embeddings/vectorization with clustering
4 versions - Latest release: over 7 years ago - 1 dependent repository - 20 downloads last month - 2 stars on GitHub - 1 maintainer
semantic-components 0.1.1
Finding semantic components in your neural representations.
2 versions - Latest release: about 1 year ago - 15 downloads last month - 4 stars on GitHub - 1 maintainer
Top 7.0% on pypi.org
rosette-api 1.31.0
Babel Street Analytics API Python client SDK
35 versions - Latest release: 12 months ago - 3 dependent repositories - 3.56 thousand downloads last month - 38 stars on GitHub - 3 maintainers
iambic 3.0.0
Data extraction and rendering library for Shakespearean text.
30 versions - Latest release: over 2 years ago - 1 dependent repository - 97 downloads last month - 1 star on GitHub - 1 maintainer
homer-text 0.4.1
Homer, a text analyser in Python, can help make your text more clear, simple and useful for your ...
1 version - Latest release: about 6 years ago - 1 dependent repository - 15 downloads last month - 633 stars on GitHub - 1 maintainer
logmap 0.0.3
A hierarchical, context-manager logger utility with multiprocess mapping capabilities
1 version - Latest release: almost 2 years ago - 933 downloads last month - 4 stars on GitHub - 1 maintainer
blauwal3-text 1.6.0 💰
Blue Whale add-on for text mining.
1 version - Latest release: about 1 year ago - 14 downloads last month - 131 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
textpipe 0.12.2
textpipe: clean and extract metadata from text
38 versions - Latest release: almost 5 years ago - 2 dependent repositories - 165 downloads last month - 302 stars on GitHub - 3 maintainers
contexto 0.2.0
Library for text processing and analysis with Python
3 versions - Latest release: over 4 years ago - 1 dependent repository - 41 downloads last month - 48 stars on GitHub - 1 maintainer
smoothtext 0.4.0
A Python library for readability and textual metrics analysis, supporting multiple languages.
20 versions - Latest release: 7 months ago - 596 downloads last month - 1 star on GitHub - 1 maintainer
shreport 0.1.3
Download periodic reports of companies listed on the Shanghai Stock Exchange; project page: https://github.com/thunderhit/shreport
3 versions - Latest release: about 5 years ago - 1 dependent repository - 94 downloads last month - 98 stars on GitHub - 1 maintainer
texcptulz 2.5.3
Tools for cleaning and preprocessing text
14 versions - Latest release: over 4 years ago - 1 dependent repository - 41 downloads last month - 1 star on GitHub - 1 maintainer
textstat-cli-tddschn 0.1.1
Wrapper around textstat. Get quick and easy readability and other metrics for your texts right on...
1 version - Latest release: over 1 year ago - 9 downloads last month - 1 star on GitHub - 1 maintainer
macroetym 0.1.2 💰
A tool for macro-etymological textual analysis.
1 version - Latest release: over 9 years ago - 2 dependent repositories - 10 downloads last month - 35 stars on GitHub - 1 maintainer
oneai-stage 0.0.1
NLP as a Service
1 version - Latest release: over 3 years ago - 1 dependent repository - 13 downloads last month - 39 stars on GitHub - 1 maintainer
clusx 0.6.0
Bayesian nonparametric toolkit for text clustering, analysis, and benchmarking with advanced embe...
6 versions - Latest release: 8 months ago - 47 downloads last month - 2 stars on GitHub - 1 maintainer
textvec 1.0.1
Supervised text features extraction
3 versions - Latest release: over 7 years ago - 1 dependent repository - 16 downloads last month - 193 stars on GitHub - 1 maintainer
architxt 0.4.1
ArchiTXT is a tool for structuring textual data into a valid database model. It is guided by a me...
8 versions - Latest release: about 1 month ago - 268 downloads last month - 5 stars on GitHub - 1 maintainer
textmetric 0.1.0
Python implementations of common text metric algorithms.
1 version - Latest release: over 8 years ago - 1 dependent repository - 7 downloads last month - 0 stars on GitHub - 1 maintainer
graphow 0.4
graphoW is a Python package for the creation of a Graph-of-Words (GoW) representation of texts.
1 version - Latest release: over 4 years ago - 1 dependent repository - 16 downloads last month - 1 star on GitHub - 1 maintainer
id-svo-extractor 0.3.0
id-svo-extractor: Extract SVO triples from Indonesian text.
3 versions - Latest release: about 1 year ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
token-counter-cli 0.1.4
A fast, user-friendly CLI tool to count tokens in text files using tiktoken, with LLM context lim...
5 versions - Latest release: 5 months ago - 69 downloads last month - 0 stars on GitHub - 1 maintainer
dcss 1.0.2
Utilities for the book Doing Computational Social Science
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 36 downloads last month - 17 stars on GitHub - 1 maintainer
rb-tocase 1.3.2 💰
RB toCase is a Case converter.
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 32 downloads last month - 3 stars on GitHub - 1 maintainer
dphon 2.1.0
Tools and algorithms for phonology-aware Early Chinese NLP.
12 versions - Latest release: 2 months ago - 1 dependent repository - 74 downloads last month - 14 stars on GitHub - 1 maintainer
dhelp 0.0.5
DH Python tools for scraping web pages, pre-processing data, and performing nlp analysis quickly.
4 versions - Latest release: over 7 years ago - 1 dependent repository - 17 downloads last month - 5 stars on GitHub - 1 maintainer
Top 8.8% on pypi.org
aylien-apiclient 0.7.0
AYLIEN Text API Client Library for Python
8 versions - Latest release: over 7 years ago - 59 dependent repositories - 118 downloads last month - 35 stars on GitHub - 1 maintainer
manta-topic-modelling 0.7.1
Multi-lingual Advanced NMF-based Topic Analysis - A comprehensive NMF topic modeling tool for Tur...
8 versions - Latest release: 2 months ago - 291 downloads last month - 2 stars on GitHub - 1 maintainer
eng-text-cleaner 0.0.5
This package cleans text as part of text processing
5 versions - Latest release: about 1 year ago - 17 downloads last month - 2 stars on GitHub - 1 maintainer
kwx 1.0.2
BERT, LDA, and TFIDF based keyword extraction in Python
25 versions - Latest release: almost 3 years ago - 1 dependent repository - 430 downloads last month - 74 stars on GitHub - 1 maintainer
python-topic-model-preprocessor 0.0.3
A helper class for facilitating preprocessing of text corpus before any topic modeling algorithms
9 versions - Latest release: almost 8 years ago - 1 dependent repository - 23 downloads last month - 2 stars on GitHub - 1 maintainer
Top 10.0% on pypi.org
dandelion-eu 0.3.3
Connect to the dandelion.eu API in a very pythonic way!
8 versions - Latest release: over 2 years ago - 8 dependent repositories - 243 downloads last month - 36 stars on GitHub - 2 maintainers
Top 6.9% on pypi.org
shifterator 0.3.0
Interpretable data visualizations for understanding how texts differ at the word level
7 versions - Latest release: almost 4 years ago - 1 dependent package - 7 dependent repositories - 359 downloads last month - 280 stars on GitHub - 1 maintainer
semantic-kit 0.0.3
A toolkit to estimate semantic similarity or relatedness
11 versions - Latest release: almost 4 years ago - 1 dependent repository - 41 downloads last month - 5 stars on GitHub - 1 maintainer
giveme5w1h 1.0.18 💰
Extraction of the journalistic five W and one H questions (5W1H) from news articles.
15 versions - Latest release: over 4 years ago - 3 dependent repositories - 72 downloads last month - 506 stars on GitHub - 1 maintainer
pysemantics 1.0.3
NLP client for python
4 versions - Latest release: almost 6 years ago - 1 dependent repository - 11 downloads last month - 8 stars on GitHub - 1 maintainer
bloatectomy 0.0.12
Bloatectomy: a method for the identification and removal of duplicate text in the bloated notes o...
12 versions - Latest release: over 5 years ago - 1 dependent repository - 135 downloads last month - 37 stars on GitHub - 2 maintainers
sentidict 0.1.13
Utilities for dictionary-based sentiment analysis. Includes 28 sentiment dictionaries with loader...
7 versions - Latest release: 7 months ago - 1 dependent repository - 32 downloads last month - 27 stars on GitHub - 1 maintainer
textexploration 0.2.1
Python Package
2 versions - Latest release: almost 7 years ago - 1 dependent repository - 16 downloads last month - 0 stars on GitHub - 1 maintainer
htrc-feature-reader 2.0.7
Library for working with the HTRC Extracted Features dataset
22 versions - Latest release: over 5 years ago - 7 dependent repositories - 432 downloads last month - 39 stars on GitHub - 2 maintainers
humlab-westac 0.5.40
Welfare State Analytics
58 versions - Latest release: over 1 year ago - 1 dependent repository - 80 downloads last month - 5 stars on GitHub - 1 maintainer
notion-nlp 1.0.6
Reading rich text information from a Notion database and performing simple NLP analysis.
7 versions - Latest release: over 2 years ago - 46 downloads last month - 18 stars on GitHub - 1 maintainer
ocraccuracyreporter 0.0.5
OCR Accuracy Reporter
5 versions - Latest release: over 7 years ago - 1 dependent repository - 8 downloads last month - 0 stars on GitHub - 1 maintainer
xml-cleaner 2.0.4
Word and sentence tokenization.
27 versions - Latest release: almost 9 years ago - 4 dependent repositories - 280 downloads last month - 13 stars on GitHub - 1 maintainer
lingpatlab 0.2.13
Linguistic Pattern Lab using spaCy
8 versions - Latest release: 12 months ago - 51 downloads last month - 0 stars on GitHub - 1 maintainer
ciseau 1.0.1
Word and sentence tokenization.
2 versions - Latest release: almost 8 years ago - 8 dependent repositories - 309 downloads last month - 12 stars on GitHub - 1 maintainer
Related Keywords
nlp (59), python (45), natural-language-processing (35), text-processing (28), text (27), machine-learning (24), text-mining (18), text-classification (15), sentiment-analysis (14), python3 (13), ai (10), NLP (9), api (8), linguistics (8), llm (8), text analysis (8), data-science (7), artificial-intelligence (7), nltk (6), nlp-library (6), natural language processing (6), semantic-analysis (5), topic-modeling (5), spacy (5), readability (5), language-detection (5), data-analysis (4), embeddings (4), clustering (4), open-source (4), python-library (4), lemmatization (4), entity-extraction (4), digital-humanities (4), natural-language-understanding (4), api-client (4), openai (4), summarization (4), bag-of-words (4), analysis (4), nlp-machine-learning (3), turkish (3), english (3), stopwords (3), readability-scores (3), tokenization (3), twitter (3), tokenizer (3), utility (3), text-summarization (3), document-processing (3), language (3), bert (3), analyzer (3), tf-idf (3), llms (3), text mining (3), word (3), question-answering (3), visualization (3), large-language-models (3), structured-data (3), knowledge-extraction (3), multilingual (3), information-extraction (3), generative-ai (3), docx (3), low-code (3), ocr (3), orange3-text (3), text-reuse (3), parser (3), data mining (3), orange3 add-on (3), extraction (3), newspapers (3), named-entity-recognition (3), computational-social-science (3), orange (3), stemming (3), contract-management (2), semantic-similarity (2), text-similarity (2), contract-parsing (2), contract-review (2), data-extraction (2), document (2), document-analysis (2), XML (2), llm-reasoning (2), llm-library (2), llm-framework (2), llm-extraction (2), legaltech (2), detection (2), insights-extraction (2), fintech (2), extraction-pipeline (2), extraction-justifications (2), document-understanding (2)