An open API service providing package, version and dependency metadata for many open source software ecosystems and registries.

pypi.org "html-extraction" keyword

View the packages on the pypi.org package registry that are tagged with the "html-extraction" keyword.

Top 1.6% on pypi.org
sumy 0.11.0 💰
Module for automatic summarization of text documents and HTML pages.
16 versions - Latest release: over 2 years ago - 9 dependent packages - 413 dependent repositories - 135 thousand downloads last month - 3,431 stars on GitHub - 1 maintainer
Top 8.7% on pypi.org
hext 1.0.12
A module and command-line utility to extract structured data from HTML
19 versions - Latest release: 6 months ago - 4 dependent repositories - 4.33 thousand downloads last month - 52 stars on GitHub - 1 maintainer
nofollow 0.1.0
read-it-later content manager with a couple of twists
4 versions - Latest release: over 6 years ago - 1 dependent repository - 186 downloads last month - 4 stars on GitHub - 1 maintainer
Top 3.8% on pypi.org
htmldate 1.9.3 💰
Fast and robust extraction of original and updated publication dates from URLs and web pages.
58 versions - Latest release: 4 months ago - 5 dependent packages - 50 dependent repositories - 1.6 million downloads last month - 124 stars on GitHub - 1 maintainer
extracteur-de-fou-malade-pour-charles-le-charlo 0.0.1 💰
PDF data parser
1 version - Latest release: over 4 years ago - 1 dependent repository - 54 downloads last month - 3,518 stars on GitHub - 1 maintainer
util-ds 0.5.3 💰
This project is a convenient part of the NLP project, including several already exposed projects ...
22 versions - Latest release: almost 3 years ago - 1 dependent repository - 129 downloads last month - 3,518 stars on GitHub - 1 maintainer
Top 3.4% on pypi.org
breadability 0.1.20
Port of Readability HTML parser in Python
21 versions - Latest release: about 11 years ago - 2 dependent packages - 212 dependent repositories - 89.6 thousand downloads last month - 204 stars on GitHub - 1 maintainer