Ecosyste.ms: Packages

An open API service providing package, version and dependency metadata for many open source software ecosystems and registries.

Top 1.7% on pypi.org
Top 0.7% downloads on pypi.org
Top 0.2% dependent packages on pypi.org
Top 1.9% dependent repos on pypi.org
Top 3.8% forks on pypi.org
Top 2.4% docker downloads on pypi.org

pypi.org : trafilatura

Python package and command-line tool designed to gather text on the Web. It includes all necessary discovery and text processing components to perform web crawling, downloads, scraping, and extraction of main texts, metadata and comments.

purl: pkg:pypi/trafilatura
Keywords: corpus, html2text, news-crawler, natural-language-processing, scraper, tei-xml, text-extraction, webscraping, web-scraping, article-extractor, corpus-builder, corpus-tools, crawler, html-to-markdown, news, news-aggregator, nlp, readability, rss-feed, scraping, tei, text-cleaning, text-mining, text-preprocessing
License: Apache-2.0
Latest release: about 1 month ago
First release: almost 5 years ago
Dependent packages: 71
Dependent repositories: 63
Downloads: 434,104 last month
Stars: 2,965 on GitHub
Forks: 228 on GitHub
Docker dependents: 13
Docker downloads: 9,651
Total Commits: 1395
Committers: 39
Average commits per author: 35.769
Development Distribution Score (DDS): 0.102
More commit stats: commits.ecosyste.ms
See more repository details: repos.ecosyste.ms
Funding links: https://ko-fi.com/adbarbaresi
Last synced: about 7 hours ago
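
The commit figures above can be cross-checked with a little arithmetic. The sketch below recomputes the average commits per author from the listed totals. It also assumes that the Development Distribution Score is defined as 1 minus the top committer's share of all commits. That definition is an assumption, not something this page states. Because the listed DDS is rounded to three decimals, the implied commit count for the top committer is only approximate.

```python
# Figures copied from the listing above.
total_commits = 1395
committers = 39
dds = 0.102

# Average commits per author: total commits divided by number of committers.
avg_commits = round(total_commits / committers, 3)

# Assumption (not stated on this page):
#   DDS = 1 - (top committer's commits / total commits)
# Under that assumption, the top committer's share is 1 - DDS.
top_share = 1 - dds
implied_top_commits = round(top_share * total_commits)

print(avg_commits)           # matches the listed 35.769
print(f"{top_share:.1%}")    # share of commits by the top committer
print(implied_top_commits)   # approximate, since DDS is rounded
```

Under this reading, a DDS close to 0 means that most commits come from a single author.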

opsci-toolbox 0.0.3
a complete toolbox
4 versions - Latest release: 27 days ago - 321 downloads last month - 1 maintainer
thirdai 0.8.2
A faster cpu machine learning library
101 versions - Latest release: 27 days ago - 1 dependent repository - 11.1 thousand downloads last month - 2 maintainers
langroid 0.1.245 💰
Harness LLMs with Multi-Agent Programming
237 versions - Latest release: 27 days ago - 1 dependent repository - 7.57 thousand downloads last month - 1,779 stars on GitHub - 1 maintainer
Top 5.6% on pypi.org
griptape 0.25.1
Modular Python framework for LLM workflows, tools, memory, and data.
70 versions - Latest release: 28 days ago - 4 dependent packages - 5 dependent repositories - 3.93 thousand downloads last month - 1,668 stars on GitHub - 6 maintainers
Top 2.2% on pypi.org
marvin 2.3.4
A lightweight AI engineering toolkit for building natural language interfaces that are reliable, ...
62 versions - Latest release: 28 days ago - 14 dependent packages - 22 dependent repositories - 13 thousand downloads last month - 4,705 stars on GitHub - 2 maintainers
edubot 0.7.6
Basic Edubot module
26 versions - Latest release: 29 days ago - 1 dependent package - 1 dependent repository - 318 downloads last month - 4 stars on GitHub - 2 maintainers
scrapework 0.5.4
simple scraping framework
26 versions - Latest release: about 1 month ago - 161 downloads last month - 1 star on GitHub - 1 maintainer
mediacloud-metadata 1.0.1
Media Cloud news article metadata extraction
39 versions - Latest release: about 1 month ago - 1 dependent package - 4 dependent repositories - 316 downloads last month - 11 stars on GitHub - 2 maintainers
Top 5.9% on pypi.org
minet 2.0.3
A webmining CLI tool & library for python.
264 versions - Latest release: about 2 months ago - 1 dependent package - 6 dependent repositories - 8.39 thousand downloads last month - 251 stars on GitHub - 1 maintainer
datatrove 0.2.0
HuggingFace library to process and filter large amounts of webdata
3 versions - Latest release: about 2 months ago - 11.7 thousand downloads last month - 1,632 stars on GitHub - 3 maintainers
testing-datatrove 4.0.1 removed
HuggingFace library to process and filter large amounts of webdata
1 version - Latest release: about 2 months ago - 1 maintainer
dolma 1.0.3
Data filters
17 versions - Latest release: 2 months ago - 17.7 thousand downloads last month - 800 stars on GitHub - 2 maintainers
dataverse 1.0.5
An open-source library that simplifies ETL workflows with Python, based on Spark
14 versions - Latest release: 2 months ago - 2 dependent repositories - 356 downloads last month - 1 maintainer
hollarek 0.9.1 removed
Collection of general python utilities for future projects
10 versions - Latest release: 3 months ago - 1 dependent package - 52 downloads last month - 0 stars on GitHub - 1 maintainer
contentmap 0.4.0
5 versions - Latest release: 3 months ago - 71 downloads last month - 1 maintainer
opendatagen 0.0.35
Data preparation system to build controllable AI system
33 versions - Latest release: 4 months ago - 201 downloads last month - 15 stars on GitHub - 1 maintainer
delphai-ml-utils 1.0.17
A Python package to manage delphai machine learning operations.
17 versions - Latest release: 4 months ago - 1 dependent repository - 144 downloads last month - 1 maintainer
yvestest 0.1.3
An open-source library that simplifies ETL workflows with Python, based on Spark
5 versions - Latest release: 4 months ago - 81 downloads last month - 1 maintainer
docop-tasks-restricted 0.3.3
Tasks for docop that have more restrictive open source licensing
4 versions - Latest release: 4 months ago - 26 downloads last month - 0 stars on GitHub - 1 maintainer
obsei 0.0.15
Obsei is an automation tool for text analysis needs
14 versions - Latest release: 5 months ago - 3 dependent repositories - 138 downloads last month - 1,146 stars on GitHub - 1 maintainer
wasc 1.1.0
Web Accessibility Simple Checker
12 versions - Latest release: 6 months ago - 34 downloads last month - 2 stars on GitHub - 1 maintainer
maincontentextractor 0.0.4
A library to extract the main content from html. Developed for information on LLM and for feeding...
4 versions - Latest release: 6 months ago - 694 downloads last month - 4 stars on GitHub - 1 maintainer
openams 0.1.24
2 versions - Latest release: 6 months ago - 18 downloads last month - 1 maintainer
nextpy-ai 0.1.24
1 version - Latest release: 6 months ago - 10 downloads last month - 1 maintainer
agentdb 0.1.23
1 version - Latest release: 7 months ago - 30 downloads last month - 1 maintainer
code-context 0.1.22
1 version - Latest release: 7 months ago - 15 downloads last month - 1 maintainer
agent-context 0.1.22
1 version - Latest release: 7 months ago - 28 downloads last month - 1 maintainer
openagent-py 0.1.22
2 versions - Latest release: 7 months ago - 32 downloads last month - 2 maintainers
openagent-dev 0.2.1
Web apps in pure Python with all the flexibility and speed of nextjs.
2 versions - Latest release: 7 months ago - 18 downloads last month - 2,124 stars on GitHub - 1 maintainer
agent.ngo 0.1.22
1 version - Latest release: 7 months ago - 27 downloads last month - 1 maintainer
incognitogpt 0.1.22
1 version - Latest release: 7 months ago - 14 downloads last month - 1 maintainer
openlora 0.1.22
1 version - Latest release: 7 months ago - 13 downloads last month - 1 maintainer
llm-server 0.1.22
1 version - Latest release: 7 months ago - 12 downloads last month - 1 maintainer
llmproxy 0.1.22
1 version - Latest release: 7 months ago - 22 downloads last month - 1 maintainer
nextapi 0.1.22
1 version - Latest release: 7 months ago - 10 downloads last month - 1 maintainer
agent-system 0.1.22
1 version - Latest release: 7 months ago - 13 downloads last month - 1 maintainer
appnext 0.1.22
1 version - Latest release: 7 months ago - 7 downloads last month - 1 maintainer
next-llm 0.1.22
1 version - Latest release: 7 months ago - 11 downloads last month - 1 maintainer
next-ams 0.1.22
1 version - Latest release: 7 months ago - 10 downloads last month - 1 maintainer
namas 0.1.22
1 version - Latest release: 7 months ago - 23 downloads last month - 1 maintainer
dotnext 0.1.22
1 version - Latest release: 7 months ago - 15 downloads last month - 1 maintainer
nextagent 0.1.22
1 version - Latest release: 7 months ago - 13 downloads last month - 1 maintainer
auto-ams 0.1.22
1 version - Latest release: 7 months ago - 9 downloads last month - 1 maintainer
open-ams 0.1.22 removed
1 version - Latest release: 7 months ago - 1 maintainer
codegraph-agent 0.1.22
1 version - Latest release: 7 months ago - 14 downloads last month - 1 maintainer
dotagent 0.1.211
9 versions - Latest release: 7 months ago - 2 dependent repositories - 63 downloads last month - 2 maintainers
dotagent-dev 0.1.7
5 versions - Latest release: 8 months ago - 36 downloads last month - 1 maintainer
galactic-ai 0.2.16
Curate, annotate, and clean massive unstructured text datasets for machine learning and AI systems.
27 versions - Latest release: 8 months ago - 213 downloads last month - 305 stars on GitHub - 1 maintainer
python-switch-case-nguyenquoctuan02011992-2 0.0.1 removed
A sample python package to start sharing your code with the world
1 version - Latest release: 8 months ago - 0 stars on GitHub - 1 maintainer
ams-core 0.1.0
1 version - Latest release: 8 months ago - 12 downloads last month - 1 maintainer
ams-python 0.1.0
1 version - Latest release: 8 months ago - 9 downloads last month - 1 maintainer
dotams 0.1.0
1 version - Latest release: 8 months ago - 4 downloads last month - 1 maintainer
agent-management-system 0.1.0
1 version - Latest release: 8 months ago - 32 downloads last month - 1 maintainer
agentvm 0.1.0
1 version - Latest release: 9 months ago - 13 downloads last month - 2 maintainers
agentbox 0.1.0
1 version - Latest release: 9 months ago - 12 downloads last month - 1 maintainer
agent-cloud 0.1.0
1 version - Latest release: 9 months ago - 25 downloads last month - 1 maintainer
agent-cloud-os 0.1.0
1 version - Latest release: 9 months ago - 18 downloads last month - 1 maintainer
agent-vm 0.1.0 removed
1 version - Latest release: 9 months ago - 1 maintainer
openagentos 0.1.0
1 version - Latest release: 9 months ago - 7 downloads last month - 1 maintainer
deva 1.2.3
data eval in future
33 versions - Latest release: 9 months ago - 1 dependent repository - 196 downloads last month - 9 stars on GitHub - 1 maintainer
Top 7.1% on pypi.org
opencopilot-ai 0.3.8
OpenCopilot Backend
11 versions - Latest release: 9 months ago - 3 dependent repositories - 35 downloads last month - 532 stars on GitHub - 1 maintainer
aicompleter 0.0.1rc5 removed
Interactive AI program framework for Python
2 versions - Latest release: 10 months ago - 79 downloads last month - 1 maintainer
atradebot 0.1.0
atradebot package
1 version - Latest release: 10 months ago - 31 downloads last month - 3,835 stars on GitHub - 1 maintainer
Top 10.0% on pypi.org
oneai 0.9.89
NLP as a Service
119 versions - Latest release: 10 months ago - 2 dependent repositories - 1.63 thousand downloads last month - 34 stars on GitHub - 1 maintainer
genia 0.4.0
Your Engineering Gen AI Team member 🧬🤖💻
3 versions - Latest release: 10 months ago - 21 downloads last month - 344 stars on GitHub - 1 maintainer
readthis 0.1.1
readthis - A command line tool to read a text file aloud
9 versions - Latest release: 11 months ago - 90 downloads last month - 4 stars on GitHub - 1 maintainer
newsfeedback 0.1.0
Tool for extracting and saving news article metadata at regular intervals.
1 version - Latest release: about 1 year ago - 10 downloads last month - 1 maintainer
skatepark-lib 0.10.0
Python framework for AI workflows and pipelines.
4 versions - Latest release: about 1 year ago - 14 downloads last month - 1,547 stars on GitHub - 1 maintainer
nlp-toolbox 0.0.3 💰
Natural Language Processing Tools
3 versions - Latest release: about 1 year ago - 17 downloads last month - 2 stars on GitHub - 1 maintainer
ur-gadget 0.0.4 💰
Useful gadgets for your python projects
4 versions - Latest release: over 1 year ago - 3 dependent packages - 1 dependent repository - 67 downloads last month - 5 stars on GitHub - 1 maintainer
pydata-master 0.0.7 💰
All frequently used functions in one package for data operations on a daily basis.
5 versions - Latest release: over 1 year ago - 1 dependent repository - 18 downloads last month - 0 stars on GitHub - 1 maintainer