pypi.org "text-analysis" keyword
View the packages on the pypi.org package registry that are tagged with the "text-analysis" keyword.
chonkie 1.4.2
🦛 CHONK your texts with Chonkie ✨ - The no-nonsense chunking library
47 versions - Latest release: about 15 hours ago - 297 thousand downloads last month - 2,478 stars on GitHub - 2 maintainers
pelican-nlp 0.3.22
Preprocessing and Extraction of Linguistic Information for Computational Analysis
34 versions - Latest release: about 22 hours ago - 422 downloads last month - 0 stars on GitHub - 1 maintainer
cntext 2.2.0
Chinese text analysis library, which can perform word frequency statistics, dictionary expansion,...
40 versions - Latest release: 3 days ago - 1.05 thousand downloads last month - 368 stars on GitHub - 1 maintainer
wikitextparser 0.56.4
A simple parsing tool for MediaWiki's wikitext markup.
Top 3.6% on pypi.org
210 versions - Latest release: 6 months ago - 7 dependent packages - 97 dependent repositories - 64.8 thousand downloads last month - 314 stars on GitHub - 1 maintainer
ner-kit 0.0.5
Named Entity Recognition Toolkit
20 versions - Latest release: over 2 years ago - 1 dependent package - 2 dependent repositories - 68 downloads last month - 0 stars on GitHub - 1 maintainer
prosegrinder 1.4.1
A text analytics library for prose fiction.
19 versions - Latest release: 3 days ago - 1 dependent repository - 153 downloads last month - 4 stars on GitHub - 1 maintainer
adversary 1.1.1
Creates adversarial text examples for machine learning models
3 versions - Latest release: about 7 years ago - 4 dependent repositories - 53 downloads last month - 400 stars on GitHub - 1 maintainer
nlpap 1.0.1
A comprehensive NLP library for text preprocessing, semantic analysis, and information extraction
2 versions - Latest release: about 1 month ago - 43 downloads last month - 1 maintainer
textprettify 1.0.0
A comprehensive Python library for text formatting, transformation, and analysis
3 versions - Latest release: about 1 month ago - 89 downloads last month - 0 stars on GitHub - 1 maintainer
nepalikit 1.0.2
A Nepali language processing library
3 versions - Latest release: over 1 year ago - 23 downloads last month - 6 stars on GitHub - 1 maintainer
convosense-utilities 1.0.1
A package to extract the email body out of the email text, by removing signature, in order to get...
2 versions - Latest release: over 2 years ago - 623 downloads last month - 20 stars on GitHub - 1 maintainer
metis-agent 0.20.0
Production-ready AI agent framework with comprehensive testing, intelligent caching, connection p...
63 versions - Latest release: about 2 months ago - 115 downloads last month - 9 stars on GitHub - 1 maintainer
huggingface-text-data-analyzer 1.1.0
A comprehensive tool for analyzing text datasets from HuggingFace's datasets library
3 versions - Latest release: 11 months ago - 15 downloads last month - 7 stars on GitHub - 1 maintainer
bwypy 1.0
Bwypy: Python Tools for Bookworm
1 version - Latest release: over 8 years ago - 1 dependent repository - 4 downloads last month - 1 maintainer
split-markdown4gpt 1.0.9
A Python tool for splitting large Markdown files into smaller sections based on a specified token...
7 versions - Latest release: over 2 years ago - 1 dependent repository - 1.45 thousand downloads last month - 25 stars on GitHub - 1 maintainer
trunajod 0.1.1
A python lib for readability analyses.
3 versions - Latest release: over 4 years ago - 1 dependent repository - 624 downloads last month - 29 stars on GitHub - 1 maintainer
wordhoard 1.5.5 💰
A comprehensive lexical discovery application that is useful for finding semantic relationships s...
Top 8.6% on pypi.org
23 versions - Latest release: over 1 year ago - 5 dependent repositories - 914 downloads last month - 124 stars on GitHub - 1 maintainer
oneai 0.9.89
NLP as a Service
Top 10.0% on pypi.org
119 versions - Latest release: over 2 years ago - 2 dependent repositories - 564 downloads last month - 39 stars on GitHub - 1 maintainer
stark-trees 3.1.0
Parser for dependency trees
4 versions - Latest release: 6 months ago - 20 downloads last month - 5 stars on GitHub - 3 maintainers
traligner 0.2.0
Text Reuse Alignment for Hebrew and multi-language texts
2 versions - Latest release: 9 days ago - 7 downloads last month - 0 stars on GitHub - 1 maintainer
stark-place 1.1.0
S.T.A.R.K. Platform Library And Community Extensions
5 versions - Latest release: about 2 years ago - 22 downloads last month - 7 stars on GitHub - 1 maintainer
textmining3 1.1.0
Text Mining Utilities for Python 3
2 versions - Latest release: about 7 years ago - 1 dependent repository - 25 downloads last month - 1 star on GitHub - 1 maintainer
v5-relevance 1.0.0
V5 relevance algorithm - Chinese text relevance computation based on jieba
1 version - Latest release: 9 days ago - 1 maintainer
dygest 0.7.0
DYGEST: Document Insights Generator
8 versions - Latest release: about 2 months ago - 57 downloads last month - 3 stars on GitHub - 1 maintainer
constituent-treelib 0.0.8
A lightweight Python library for constructing, processing, and visualizing constituent trees.
7 versions - Latest release: 12 months ago - 67 downloads last month - 60 stars on GitHub - 1 maintainer
historianstexttools 0.0.3
Common tools for historians to aid text analysis.
3 versions - Latest release: over 1 year ago - 15 downloads last month - 1 maintainer
lftk 1.0.9
Comprehensive Handcrafted Linguistic Features Extraction in Python
18 versions - Latest release: over 2 years ago - 160 downloads last month - 140 stars on GitHub - 1 maintainer
bluewhale3-text 1.6.0 💰
Bluewhale add-on for text mining.
5 versions - Latest release: over 2 years ago - 1 dependent repository - 76 downloads last month - 131 stars on GitHub - 1 maintainer
nocodetextclassifier 0.0.5
This package is for Text Classification of NLP Task
5 versions - Latest release: about 1 year ago - 11 downloads last month - 3 stars on GitHub - 1 maintainer
stark-engine 4.2.1
S.T.A.R.K - Speech and Text Algorithmic Recognition Kit. Modern framework for creating powerful ...
18 versions - Latest release: 11 days ago - 198 downloads last month - 65 stars on GitHub - 1 maintainer
page-counter 1.0.1
Small Python library and commandline tool to count number of standard pages in the text, files an...
2 versions - Latest release: over 7 years ago - 1 dependent repository - 14 downloads last month - 1 star on GitHub - 1 maintainer
tagmaster-python 1.0.7
A comprehensive Python client for the Tagmaster classification API with project management, categ...
7 versions - Latest release: 3 months ago - 26 downloads last month - 1 maintainer
articleparse 0.2.1 💰
Heuristic text extraction from news articles
3 versions - Latest release: almost 8 years ago - 46 downloads last month - 10 stars on GitHub - 1 maintainer
padatious 0.4.8
A neural network intent parser
Top 3.7% on pypi.org
25 versions - Latest release: over 5 years ago - 3 dependent packages - 47 dependent repositories - 5.84 thousand downloads last month - 158 stars on GitHub - 1 maintainer
rck-python-sdk 1.0.1
An elegant Python SDK for RCK (Relational Calculate Kernel)
2 versions - Latest release: 4 months ago - 25 downloads last month - 1 maintainer
contextgem 0.19.2
Effortless LLM extraction from documents
40 versions - Latest release: about 2 months ago - 2.93 thousand downloads last month - 1,511 stars on GitHub - 1 maintainer
obsei 0.0.15
Obsei is an automation tool for text analysis need
14 versions - Latest release: almost 2 years ago - 3 dependent repositories - 115 downloads last month - 1,335 stars on GitHub - 1 maintainer
primetext 0.2.2
package for indexing text datasets using prime number factorisation for fast word frequency analysis
4 versions - Latest release: about 9 years ago - 1 dependent repository - 16 downloads last month - 4 stars on GitHub - 1 maintainer
align 0.1.1
Analyzing Linguistic Interaction with Generalizable techNiques. Read the latest ALIGN tutorials.
Top 9.3% on pypi.org
10 versions - Latest release: about 3 years ago - 8 dependent repositories - 484 downloads last month - 38 stars on GitHub - 2 maintainers
nmf-standalone 0.3.4
A standalone NMF topic modeling tool for Turkish and English texts
20 versions - Latest release: 5 months ago - 72 downloads last month - 2 stars on GitHub - 1 maintainer
text-gen 1.9.0
build a text generation model
19 versions - Latest release: over 4 years ago - 1 dependent repository - 39 downloads last month - 66 stars on GitHub - 1 maintainer
orange3-text 1.16.3 💰
Orange3 TextMining add-on.
Top 8.3% on pypi.org
64 versions - Latest release: 6 months ago - 1 dependent package - 1 dependent repository - 14.7 thousand downloads last month - 131 stars on GitHub - 6 maintainers
text-matcher 0.1.6
A simple text reuse detection CLI tool.
6 versions - Latest release: almost 6 years ago - 2 dependent repositories - 35 downloads last month - 136 stars on GitHub - 1 maintainer
pulse-sdk 0.5.0
Idiomatic, type-safe Python client for the Pulse REST API
32 versions - Latest release: about 2 months ago - 232 downloads last month - 0 stars on GitHub - 1 maintainer
textnets 0.10.4
Automated text analysis with networks
38 versions - Latest release: about 2 months ago - 1 dependent repository - 579 downloads last month - 289 stars on GitHub - 1 maintainer
keyphrases-mcp 0.0.4
Keyphrases MCP server - Model Context Protocol server to extract keyphrases from a text using a B...
1 version - Latest release: about 1 month ago - 214 downloads last month - 0 stars on GitHub - 1 maintainer
nlpiper 0.3.1
NLPiper, a lightweight package integrated with a universe of frameworks to pre-process documents.
5 versions - Latest release: over 3 years ago - 2 dependent repositories - 23 downloads last month - 19 stars on GitHub - 3 maintainers
sentimentpredictor 0.1.3
A flexible sentiment analysis predictor package supporting multiple pre-trained models, customiza...
4 versions - Latest release: over 1 year ago - 23 downloads last month - 2 stars on GitHub - 1 maintainer
matcher-py 0.5.8
A high-performance matcher designed to solve LOGICAL and TEXT VARIATIONS problems in word matchin...
39 versions - Latest release: 3 months ago - 924 downloads last month - 15 stars on GitHub - 1 maintainer
corpus-downloader 0.1.11
A downloader for textual corpora, for use in digital humanities, corpus linguistics, and natural ...
10 versions - Latest release: about 9 years ago - 2 dependent repositories - 127 downloads last month - 34 stars on GitHub - 1 maintainer
wordtangible 0.1.1
Python library for word concreteness and imageability analysis.
2 versions - Latest release: about 1 year ago - 29 downloads last month - 4 stars on GitHub - 1 maintainer
llm-detector 0.2.0
Transparent, probabilistic classification of text as human-generated or LLM-generated
3 versions - Latest release: about 2 months ago - 33 downloads last month - 0 stars on GitHub - 1 maintainer
oet-core 1.1.0
Lightweight data processing toolkit - algorithms, utilities, and database helpers for ETL pipelines
2 versions - Latest release: 17 days ago - 95 downloads last month - 0 stars on GitHub - 1 maintainer
gamma-scanner 1.0.6
Advanced string manipulation and pattern matching engine with unique DSL syntax
7 versions - Latest release: 3 months ago - 37 downloads last month - 1 maintainer
embedisualization 0.4
Visualization of text embeddings/vectorization with clustering
4 versions - Latest release: over 7 years ago - 1 dependent repository - 20 downloads last month - 2 stars on GitHub - 1 maintainer
semantic-components 0.1.1
Finding semantic components in your neural representations.
2 versions - Latest release: about 1 year ago - 15 downloads last month - 4 stars on GitHub - 1 maintainer
rosette-api 1.31.0
Babel Street Analytics API Python client SDK
Top 7.0% on pypi.org
35 versions - Latest release: 12 months ago - 3 dependent repositories - 3.56 thousand downloads last month - 38 stars on GitHub - 3 maintainers
iambic 3.0.0
Data extraction and rendering library for Shakespearean text.
30 versions - Latest release: over 2 years ago - 1 dependent repository - 97 downloads last month - 1 star on GitHub - 1 maintainer
homer-text 0.4.1
Homer, a text analyser in Python, can help make your text more clear, simple and useful for your ...
1 version - Latest release: about 6 years ago - 1 dependent repository - 15 downloads last month - 633 stars on GitHub - 1 maintainer
logmap 0.0.3
A hierarchical, context-manager logger utility with multiprocess mapping capabilities
1 version - Latest release: almost 2 years ago - 933 downloads last month - 4 stars on GitHub - 1 maintainer
blauwal3-text 1.6.0 💰
Bluewhale add-on for text mining.
1 version - Latest release: about 1 year ago - 14 downloads last month - 131 stars on GitHub - 1 maintainer
textpipe 0.12.2
textpipe: clean and extract metadata from text
Top 9.4% on pypi.org
38 versions - Latest release: almost 5 years ago - 2 dependent repositories - 165 downloads last month - 302 stars on GitHub - 3 maintainers
contexto 0.2.0
Library for text processing and analysis with Python
3 versions - Latest release: over 4 years ago - 1 dependent repository - 41 downloads last month - 48 stars on GitHub - 1 maintainer
smoothtext 0.4.0
A Python library for readability and textual metrics analysis, supporting multiple languages.
20 versions - Latest release: 7 months ago - 596 downloads last month - 1 star on GitHub - 1 maintainer
shreport 0.1.3
Downloader for periodic reports of companies listed on the Shanghai Stock Exchange; project URL: https://github.com/thunderhit/shreport
3 versions - Latest release: about 5 years ago - 1 dependent repository - 94 downloads last month - 98 stars on GitHub - 1 maintainer
texcptulz 2.5.3
Tools for cleaning and preprocessing text
14 versions - Latest release: over 4 years ago - 1 dependent repository - 41 downloads last month - 1 star on GitHub - 1 maintainer
textstat-cli-tddschn 0.1.1
Wrapper around textstat. Get quick and easy readability and other metrics for your texts right on...
1 version - Latest release: over 1 year ago - 9 downloads last month - 1 star on GitHub - 1 maintainer
macroetym 0.1.2 💰
A tool for macro-etymological textual analysis.
1 version - Latest release: over 9 years ago - 2 dependent repositories - 10 downloads last month - 35 stars on GitHub - 1 maintainer
oneai-stage 0.0.1
NLP as a Service
1 version - Latest release: over 3 years ago - 1 dependent repository - 13 downloads last month - 39 stars on GitHub - 1 maintainer
clusx 0.6.0
Bayesian nonparametric toolkit for text clustering, analysis, and benchmarking with advanced embe...
6 versions - Latest release: 8 months ago - 47 downloads last month - 2 stars on GitHub - 1 maintainer
textvec 1.0.1
Supervised text features extraction
3 versions - Latest release: over 7 years ago - 1 dependent repository - 16 downloads last month - 193 stars on GitHub - 1 maintainer
architxt 0.4.1
ArchiTXT is a tool for structuring textual data into a valid database model. It is guided by a me...
8 versions - Latest release: about 1 month ago - 268 downloads last month - 5 stars on GitHub - 1 maintainer
textmetric 0.1.0
Python implementations of common text metric algorithms.
1 version - Latest release: over 8 years ago - 1 dependent repository - 7 downloads last month - 0 stars on GitHub - 1 maintainer
graphow 0.4
graphoW is a Python package for the creation of a Graph-of-Words (GoW) representation of texts.
1 version - Latest release: over 4 years ago - 1 dependent repository - 16 downloads last month - 1 star on GitHub - 1 maintainer
id-svo-extractor 0.3.0
id-svo-extractor: Extract SVO triples from Indonesian text.
3 versions - Latest release: about 1 year ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
token-counter-cli 0.1.4
A fast, user-friendly CLI tool to count tokens in text files using tiktoken, with LLM context lim...
5 versions - Latest release: 5 months ago - 69 downloads last month - 0 stars on GitHub - 1 maintainer
dcss 1.0.2
Utilities for the book Doing Computational Social Science
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 36 downloads last month - 17 stars on GitHub - 1 maintainer
rb-tocase 1.3.2 💰
RB toCase is a Case converter.
2 versions - Latest release: almost 4 years ago - 1 dependent repository - 32 downloads last month - 3 stars on GitHub - 1 maintainer
dphon 2.1.0
Tools and algorithms for phonology-aware Early Chinese NLP.
12 versions - Latest release: 2 months ago - 1 dependent repository - 74 downloads last month - 14 stars on GitHub - 1 maintainer
dhelp 0.0.5
DH Python tools for scraping web pages, pre-processing data, and performing nlp analysis quickly.
4 versions - Latest release: over 7 years ago - 1 dependent repository - 17 downloads last month - 5 stars on GitHub - 1 maintainer
aylien-apiclient 0.7.0
AYLIEN Text API Client Library for Python
Top 8.8% on pypi.org
8 versions - Latest release: over 7 years ago - 59 dependent repositories - 118 downloads last month - 35 stars on GitHub - 1 maintainer
manta-topic-modelling 0.7.1
Multi-lingual Advanced NMF-based Topic Analysis - A comprehensive NMF topic modeling tool for Tur...
8 versions - Latest release: 2 months ago - 291 downloads last month - 2 stars on GitHub - 1 maintainer
eng-text-cleaner 0.0.5
This package is for clean the text as text processing
5 versions - Latest release: about 1 year ago - 17 downloads last month - 2 stars on GitHub - 1 maintainer
kwx 1.0.2
BERT, LDA, and TFIDF based keyword extraction in Python
25 versions - Latest release: almost 3 years ago - 1 dependent repository - 430 downloads last month - 74 stars on GitHub - 1 maintainer
python-topic-model-preprocessor 0.0.3
A helper class for facilitating preprocessing of text corpus before any topic modeling algorithms
9 versions - Latest release: almost 8 years ago - 1 dependent repository - 23 downloads last month - 2 stars on GitHub - 1 maintainer
dandelion-eu 0.3.3
Connect to the dandelion.eu API in a very pythonic way!
Top 10.0% on pypi.org
8 versions - Latest release: over 2 years ago - 8 dependent repositories - 243 downloads last month - 36 stars on GitHub - 2 maintainers
shifterator 0.3.0
Interpretable data visualizations for understanding how texts differ at the word level
Top 6.9% on pypi.org
7 versions - Latest release: almost 4 years ago - 1 dependent package - 7 dependent repositories - 359 downloads last month - 280 stars on GitHub - 1 maintainer
semantic-kit 0.0.3
A toolkit to estimate semantic similarity or relatedness
11 versions - Latest release: almost 4 years ago - 1 dependent repository - 41 downloads last month - 5 stars on GitHub - 1 maintainer
giveme5w1h 1.0.18 💰
Extraction of the journalistic five W and one H questions (5W1H) from news articles.
15 versions - Latest release: over 4 years ago - 3 dependent repositories - 72 downloads last month - 506 stars on GitHub - 1 maintainer
pysemantics 1.0.3
NLP client for python
4 versions - Latest release: almost 6 years ago - 1 dependent repository - 11 downloads last month - 8 stars on GitHub - 1 maintainer
bloatectomy 0.0.12
Bloatectomy: a method for the identification and removal of duplicate text in the bloated notes o...
12 versions - Latest release: over 5 years ago - 1 dependent repository - 135 downloads last month - 37 stars on GitHub - 2 maintainers
sentidict 0.1.13
Utilities for dictionary-based sentiment analysis. Includes 28 sentiment dictionaries with loader...
7 versions - Latest release: 7 months ago - 1 dependent repository - 32 downloads last month - 27 stars on GitHub - 1 maintainer
textexploration 0.2.1
Python Package
2 versions - Latest release: almost 7 years ago - 1 dependent repository - 16 downloads last month - 0 stars on GitHub - 1 maintainer
htrc-feature-reader 2.0.7
Library for working with the HTRC Extracted Features dataset
22 versions - Latest release: over 5 years ago - 7 dependent repositories - 432 downloads last month - 39 stars on GitHub - 2 maintainers
humlab-westac 0.5.40
Welfare State Analytics
58 versions - Latest release: over 1 year ago - 1 dependent repository - 80 downloads last month - 5 stars on GitHub - 1 maintainer
notion-nlp 1.0.6
Reading rich text information from a Notion database and performing simple NLP analysis.
7 versions - Latest release: over 2 years ago - 46 downloads last month - 18 stars on GitHub - 1 maintainer
ocraccuracyreporter 0.0.5
OCR Accuracy Reporter
5 versions - Latest release: over 7 years ago - 1 dependent repository - 8 downloads last month - 0 stars on GitHub - 1 maintainer
xml-cleaner 2.0.4
Word and sentence tokenization.
27 versions - Latest release: almost 9 years ago - 4 dependent repositories - 280 downloads last month - 13 stars on GitHub - 1 maintainer
lingpatlab 0.2.13
Linguistic Pattern Lab using spaCy
8 versions - Latest release: 12 months ago - 51 downloads last month - 0 stars on GitHub - 1 maintainer
ciseau 1.0.1
Word and sentence tokenization.
2 versions - Latest release: almost 8 years ago - 8 dependent repositories - 309 downloads last month - 12 stars on GitHub - 1 maintainer
Related Keywords
nlp: 59
python: 45
natural-language-processing: 35
text-processing: 28
text: 27
machine-learning: 24
text-mining: 18
text-classification: 15
sentiment-analysis: 14
python3: 13
ai: 10
NLP: 9
api: 8
linguistics: 8
llm: 8
text analysis: 8
data-science: 7
artificial-intelligence: 7
nltk: 6
nlp-library: 6
natural language processing: 6
semantic-analysis: 5
topic-modeling: 5
spacy: 5
readability: 5
language-detection: 5
data-analysis: 4
embeddings: 4
clustering: 4
open-source: 4
python-library: 4
lemmatization: 4
entity-extraction: 4
digital-humanities: 4
natural-language-understanding: 4
api-client: 4
openai: 4
summarization: 4
bag-of-words: 4
analysis: 4
nlp-machine-learning: 3
turkish: 3
english: 3
stopwords: 3
readability-scores: 3
tokenization: 3
twitter: 3
tokenizer: 3
utility: 3
text-summarization: 3
document-processing: 3
language: 3
bert: 3
analyzer: 3
tf-idf: 3
llms: 3
text mining: 3
word: 3
question-answering: 3
visualization: 3
large-language-models: 3
structured-data: 3
knowledge-extraction: 3
multilingual: 3
information-extraction: 3
generative-ai: 3
docx: 3
low-code: 3
ocr: 3
orange3-text: 3
text-reuse: 3
parser: 3
data mining: 3
orange3 add-on: 3
extraction: 3
newspapers: 3
named-entity-recognition: 3
computational-social-science: 3
orange: 3
stemming: 3
contract-management: 2
semantic-similarity: 2
text-similarity: 2
contract-parsing: 2
contract-review: 2
data-extraction: 2
document: 2
document-analysis: 2
XML: 2
llm-reasoning: 2
llm-library: 2
llm-framework: 2
llm-extraction: 2
legaltech: 2
detection: 2
insights-extraction: 2
fintech: 2
extraction-pipeline: 2
extraction-justifications: 2
document-understanding: 2