An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org : scrapysub

ScrapySub is a Python library designed to recursively scrape website content, including subpages. It fetches the visible text from web pages and stores it in a structured format for easy access and analysis. This library is particularly useful for NLP and AI developers who need to gather large amounts of web content for their projects.
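
The description above does not show ScrapySub's own interface, so the following is only a minimal sketch of the general technique it names (following same-site links recursively and keeping the visible text of each page), written with the Python standard library. The names `crawl`, `fetch_html`, `_TextAndLinks`, and the `max_pages` limit are hypothetical and are not taken from ScrapySub.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class _TextAndLinks(HTMLParser):
    """Collect visible text and link targets from one HTML page."""

    # Elements whose text content is not shown to the reader.
    HIDDEN = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN:
            self._hidden_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.HIDDEN and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        if self._hidden_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def fetch_html(url):
    """Download one page and decode it (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=50):
    """Return {url: visible_text} for pages on the same host as start_url."""
    site = urlparse(start_url).netloc
    start = urldefrag(start_url).url
    pages = {}
    seen = {start}            # URLs already queued, to avoid revisiting
    queue = deque([start])
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except OSError:       # network errors (URLError is an OSError)
            continue
        parser = _TextAndLinks()
        parser.feed(html)
        pages[url] = " ".join(parser.text_parts)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc == site and target not in seen:
                seen.add(target)
                queue.append(target)
    return pages
```

Calling `crawl("https://example.com/")` would return a dictionary mapping each visited URL to its text. The `fetch` parameter is injectable so the crawl logic can be exercised without network access, and `max_pages` bounds the crawl on large sites. A real crawler should also respect `robots.txt` and rate limits, which this sketch omits.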

purl: pkg:pypi/scrapysub
Keywords: crawling, datapreparation, datapreprocessing, python, python-package, scraper, scraping-websites, urllib3
License: MIT
Latest release: about 1 year ago
First release: about 1 year ago
Downloads: 13 last month
Stars: 5 on GitHub
Forks: 0 on GitHub
See more repository details: repos.ecosyste.ms
Last synced: 21 days ago