pypi.org : scrapysub
ScrapySub is a Python library designed to recursively scrape website content, including subpages. It fetches the visible text from web pages and stores it in a structured format for easy access and analysis. This library is particularly useful for NLP and AI developers who need to gather large amounts of web content for their projects.
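The description above names a general technique: crawl a site recursively and keep only the visible text of each page. The sketch below illustrates that technique with the Python standard library only. It is not ScrapySub's actual API, which this listing does not document. The names `crawl`, `fetch_html`, `max_pages`, and the same-host restriction are assumptions made for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class _PageParser(HTMLParser):
    """Collects visible text and link targets from one HTML page."""

    HIDDEN = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._hidden_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN:
            self._hidden_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in self.HIDDEN and self._hidden_depth > 0:
            self._hidden_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside script/style-like elements.
        if self._hidden_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def fetch_html(url):
    """Hypothetical default fetcher: download a page and decode it to text."""
    with urlopen(url, timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def crawl(start_url, fetch=fetch_html, max_pages=50):
    """Breadth-first crawl of same-host pages; returns {url: visible_text}."""
    host = urlparse(start_url).netloc
    pages = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            continue  # skip pages that fail to load
        parser = _PageParser()
        parser.feed(html)
        parser.close()
        pages[url] = " ".join(parser.text_parts)
        for href in parser.links:
            # Resolve relative links and drop #fragments before comparing.
            link, _ = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Passing the fetcher as a parameter keeps the crawl logic testable without network access. A real crawler would also need to respect robots.txt and rate limits, which this sketch omits.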
purl: pkg:pypi/scrapysub
Keywords: crawling, datapreparation, datapreprocessing, python, python-package, scraper, scraping-websites, urllib3
License: MIT
Latest release: about 1 year ago
First release: about 1 year ago
Downloads: 13 last month
Stars: 5 on GitHub
Forks: 0 on GitHub
See more repository details: repos.ecosyste.ms
Last synced: 21 days ago