An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "web-scraping" keyword

View the packages on the pypi.org package registry that are tagged with the "web-scraping" keyword.

ragoon 0.0.15 💰
RAGoon : High level library for batched embeddings generation, blazingly-fast web-based RAG and q...
15 versions - Latest release: 11 months ago - 164 downloads last month - 0 stars on GitHub - 1 maintainer
scrapy-rnet 0.0.5
A blazing-fast Python HTTP Client with TLS/HTTP2 fingerprint
5 versions - Latest release: 3 months ago - 57 downloads last month - 306 stars on GitHub - 1 maintainer
Top 1.3% on pypi.org
curl-cffi 0.13.0 💰
libcurl ffi bindings for Python, with impersonation support.
79 versions - Latest release: about 2 months ago - 56 dependent packages - 155 dependent repositories - 11.6 million downloads last month - 1,838 stars on GitHub - 1 maintainer
Top 1.7% on pypi.org
trafilatura 2.0.0 💰
Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction...
50 versions - Latest release: 10 months ago - 71 dependent packages - 63 dependent repositories - 1.65 million downloads last month - 4,758 stars on GitHub - 1 maintainer
Top 0.4% on pypi.org
scrapy 2.13.3
A high-level Web Crawling and Web Scraping framework
104 versions - Latest release: 3 months ago - 136 dependent packages - 2,753 dependent repositories - 1.97 million downloads last month - 52,897 stars on GitHub - 4 maintainers
firecrawl-py 4.3.6
Python SDK for Firecrawl API
114 versions - Latest release: 25 days ago - 1 dependent package - 1.3 million downloads last month - 60,856 stars on GitHub - 1 maintainer
openbb-finviz 1.3.2 💰
Finviz extension for OpenBB
18 versions - Latest release: 3 months ago - 1 dependent package - 7.07 thousand downloads last month - 638 stars on GitHub - 1 maintainer
lisc 0.4.0
Literature Scanner
6 versions - Latest release: 7 months ago - 2 dependent repositories - 157 downloads last month - 90 stars on GitHub - 1 maintainer
Top 3.8% on pypi.org
htmldate 1.9.3 💰
Fast and robust extraction of original and updated publication dates from URLs and web pages.
58 versions - Latest release: 9 months ago - 5 dependent packages - 50 dependent repositories - 4.52 million downloads last month - 142 stars on GitHub - 1 maintainer
changedetection.io 0.50.14 💰
Website change detection and monitoring service, detect changes to web pages and send alerts/noti...
137 versions - Latest release: 15 days ago - 6.43 thousand downloads last month - 26,691 stars on GitHub - 1 maintainer
scrapy-cffi 0.2.1
An asyncio-style web scraping framework inspired by Scrapy, powered by curl_cffi.
9 versions - Latest release: 1 day ago - 342 downloads last month - 3 stars on GitHub - 1 maintainer
Top 7.4% on pypi.org
pytest-sbase 4.41.11
A complete web automation framework for end-to-end testing.
682 versions - Latest release: 1 day ago - 1 dependent repository - 3.61 thousand downloads last month - 4,950 stars on GitHub - 1 maintainer
Top 5.2% on pypi.org
sbase 4.41.11
A complete web automation framework for end-to-end testing.
685 versions - Latest release: 1 day ago - 2 dependent repositories - 4.71 thousand downloads last month - 4,950 stars on GitHub - 1 maintainer
selenium-base 4.41.11
A complete web automation framework for end-to-end testing.
638 versions - Latest release: 1 day ago - 6.78 thousand downloads last month - 4,950 stars on GitHub - 1 maintainer
Top 1.9% on pypi.org
google-search-results 2.4.2
Scrape and search localized results from Google, Bing, Baidu, Yahoo, Yandex, Ebay, Homedepot, you...
41 versions - Latest release: over 2 years ago - 113 dependent packages - 2,025 dependent repositories - 797 thousand downloads last month - 459 stars on GitHub - 3 maintainers
scrapling 0.3.6 💰
Scrapling is an undetectable, powerful, flexible, high-performance Python library that makes Web ...
29 versions - Latest release: 1 day ago - 19.2 thousand downloads last month - 7,383 stars on GitHub - 1 maintainer
Top 6.7% on pypi.org
basketball_reference_web_scraper 4.15.4
A Basketball Reference client that generates data by scraping the website
48 versions - Latest release: 2 months ago - 3 dependent repositories - 2.53 thousand downloads last month - 401 stars on GitHub - 1 maintainer
melodic 3.0.0
An asynchronous client for fetching lyrical discographies of music artists.
8 versions - Latest release: 17 days ago - 138 downloads last month - 0 stars on GitHub - 1 maintainer
botasaurus-driver 4.0.92 💰
Super Fast, Super Anti-Detect, and Super Intuitive Web Driver
68 versions - Latest release: 2 days ago - 11.9 thousand downloads last month - 80 stars on GitHub - 1 maintainer
bitbuffet 1.0.2
Python SDK for the bitbuffet API - BitBuffet
7 versions - Latest release: 21 days ago - 795 downloads last month - 1 star on GitHub - 1 maintainer
bcitflex 3.11.0
Browse BCIT Flex course offerings.
20 versions - Latest release: over 1 year ago - 182 downloads last month - 9 stars on GitHub - 1 maintainer
souperscraper 1.0.2
A simple web scraper base combining Beautiful Soup and Selenium
3 versions - Latest release: over 1 year ago - 517 downloads last month - 3 stars on GitHub - 1 maintainer
soupsavvy 1.0.1
Powerful and flexible search engine for BeautifulSoup
18 versions - Latest release: 20 days ago - 46 downloads last month - 9 stars on GitHub - 1 maintainer
webscrapehelper 0.1.0
Playwright helper to capture manual browsing sessions and save page HTML snapshots.
1 version - Latest release: 3 days ago
scrappeycom 0.3.8
An API wrapper for Scrappey.com written in Python (cloudflare bypass & solver)
11 versions - Latest release: over 2 years ago - 176 downloads last month - 21 stars on GitHub - 1 maintainer
scrapeai 0.3.0
A Python library to scrape web data using LLMs and Selenium
2 versions - Latest release: about 1 year ago - 16 downloads last month - 0 stars on GitHub - 1 maintainer
qsobad-helium 5.1.0
Lighter browser automation based on Selenium.
1 version - Latest release: 7 months ago - 7,978 stars on GitHub - 1 maintainer
qhelium 5.1.0
Lighter browser automation based on Selenium.
5 versions - Latest release: 7 months ago - 30 downloads last month - 7,978 stars on GitHub - 1 maintainer
Top 2.1% on pypi.org
helium 5.1.1
Lighter browser automation based on Selenium.
29 versions - Latest release: 7 months ago - 6 dependent packages - 69 dependent repositories - 46.8 thousand downloads last month - 7,978 stars on GitHub - 1 maintainer
qufe 0.5.11
A comprehensive Python utility library for data processing, file handling, database management, a...
17 versions - Latest release: 3 days ago - 1.84 thousand downloads last month - 0 stars on GitHub - 1 maintainer
pcs-scraper 0.2.0
A python-based api to access procyclingstats data
2 versions - Latest release: over 2 years ago - 1 dependent repository - 51 downloads last month - 3 stars on GitHub - 1 maintainer
metis-agent 0.20.0
Production-ready AI agent framework with comprehensive testing, intelligent caching, connection p...
63 versions - Latest release: 4 days ago - 570 downloads last month - 9 stars on GitHub - 1 maintainer
Top 2.3% on pypi.org
selectolax 3.8.9
Fast HTML5 parser with CSS selectors.
56 versions - Latest release: about 3 years ago - 46 dependent packages - 175 dependent repositories - 1.08 million downloads last month - 1,141 stars on GitHub - 1 maintainer
kryptone 6.0.0
Kryptone is a high level web scraper dedicated to marketers and wrapped around the Selenium libr...
2 versions - Latest release: 8 months ago - 12 downloads last month - 0 stars on GitHub - 1 maintainer
kagebunshin 0.1.6
AI web automation agent swarm with self-cloning capabilities
7 versions - Latest release: 29 days ago - 686 downloads last month - 1 star on GitHub - 1 maintainer
Top 3.7% on pypi.org
grab 1.2.0
Web scraping framework
124 versions - Latest release: 14 days ago - 1 dependent package - 123 dependent repositories - 66.4 thousand downloads last month - 2,416 stars on GitHub - 1 maintainer
scrapling-zhng 0.2.99 💰
Scrapling is an undetectable, powerful, flexible, high-performance Python library that makes Web ...
1 version - Latest release: 3 months ago - 12 downloads last month - 7,383 stars on GitHub - 1 maintainer
django-scrapyd-manager 0.2.2
A Django application for managing Scrapyd nodes, projects, spiders and jobs
13 versions - Latest release: 10 days ago - 1.59 thousand downloads last month - 1 star on GitHub - 1 maintainer
patchright 1.55.2
Undetected Python version of the Playwright testing and automation library.
15 versions - Latest release: 16 days ago - 1.27 million downloads last month - 863 stars on GitHub - 1 maintainer
datacrawl 0.6.1
A simple and efficient web crawler in Python.
2 versions - Latest release: about 1 year ago - 17 downloads last month - 64 stars on GitHub - 1 maintainer
thisisapogreq 21.3.3 💰
Faster & simpler requests replacement for Python
1 version - Latest release: over 3 years ago - 1 dependent repository - 10 downloads last month - 1,115 stars on GitHub - 1 maintainer
web-scraping-bot-template 1.2.0
Tools that are useful for building web-scraping automated bots.
3 versions - Latest release: over 3 years ago - 1 dependent repository - 14 downloads last month - 1 maintainer
fastcrawl 1.1.0
Fast and asynchronous web crawling and scraping library for Python.
12 versions - Latest release: about 1 month ago - 95 downloads last month - 0 stars on GitHub - 1 maintainer
funfake 1.0.9
A lightweight Python library for generating realistic HTTP headers to simulate various browsers a...
7 versions - Latest release: about 1 month ago - 89 downloads last month - 0 stars on GitHub - 1 maintainer
fundus 0.5.2
A very simple news crawler
16 versions - Latest release: 6 days ago - 589 downloads last month - 403 stars on GitHub - 1 maintainer
solvium 1.0.1
Python SDK for Solvium
2 versions - Latest release: 4 months ago - 37 downloads last month - 0 stars on GitHub - 1 maintainer
iflow-mcp_undoom-douyin-data-analysis 0.1.3
ๆŠ–้Ÿณๆ•ฐๆฎๅˆ†ๆž MCP ๆœๅŠกๅ™จ - ๆไพ›ๆŠ–้Ÿณ่ง†้ข‘ๅ’Œ็”จๆˆทๆ•ฐๆฎ็š„้‡‡้›†ใ€ๅˆ†ๆžๅ’Œๅฏผๅ‡บๅŠŸ่ƒฝ
1 version - Latest release: about 1 month ago - 34 downloads last month - 4 stars on GitHub - 1 maintainer
undoom-douyin-data-analysis 0.1.3
ๆŠ–้Ÿณๆ•ฐๆฎๅˆ†ๆž MCP ๆœๅŠกๅ™จ - ๆไพ›ๆŠ–้Ÿณ่ง†้ข‘ๅ’Œ็”จๆˆทๆ•ฐๆฎ็š„้‡‡้›†ใ€ๅˆ†ๆžๅ’Œๅฏผๅ‡บๅŠŸ่ƒฝ
4 versions - Latest release: about 2 months ago - 153 downloads last month - 4 stars on GitHub - 1 maintainer
rnet 2.4.2 💰
A blazing-fast Python HTTP client with TLS fingerprint
64 versions - Latest release: 2 months ago - 31.8 thousand downloads last month - 852 stars on GitHub - 1 maintainer
primp 0.15.0
HTTP client that can impersonate web browsers, mimicking their headers and `TLS/JA3/JA4/HTTP2` fi...
26 versions - Latest release: 6 months ago - 1.83 million downloads last month - 373 stars on GitHub - 1 maintainer
oxylabs-ai-studio 0.2.12
Oxylabs studio python sdk
14 versions - Latest release: about 2 months ago - 752 downloads last month - 85 stars on GitHub - 1 maintainer
price-tracker 0.6.1
a python application to track prices on some online shopping applications
6 versions - Latest release: almost 5 years ago - 1 dependent repository - 16 downloads last month - 27 stars on GitHub - 1 maintainer
pydatamax 0.2.0
Advanced Data Crawling and Processing Framework
20 versions - Latest release: 29 days ago - 310 downloads last month - 140 stars on GitHub - 1 maintainer
cyberplant-scrapy 1.2.0.dev2
A high-level Web Crawling and Web Scraping framework
1 version - Latest release: over 9 years ago - 1 dependent repository - 18 downloads last month - 55,083 stars on GitHub - 1 maintainer
ai-research-planner 0.0.2
AI-powered research planning and execution system
2 versions - Latest release: 4 months ago - 28 downloads last month - 1 maintainer
webdown 0.7.0
Convert web pages and HTML files to markdown and Claude XML formats
8 versions - Latest release: 6 months ago - 43 downloads last month - 4 stars on GitHub - 1 maintainer
google-search-results-async 2.4.2
Scrape and search localized results from Google, Bing, Baidu, Yahoo, Yandex, Ebay, Homedepot, you...
3 versions - Latest release: over 3 years ago - 1 dependent repository - 29 downloads last month - 710 stars on GitHub - 1 maintainer
metadatascraper 1.0.4 💰
A module designed to automate the extraction of follower counts and post details from a public Fa...
8 versions - Latest release: about 1 year ago - 80 downloads last month - 10 stars on GitHub - 1 maintainer
httpz-scanner 2.1.9
Hyper-fast HTTP Scraping Tool
27 versions - Latest release: 8 months ago - 349 downloads last month - 4 stars on GitHub - 1 maintainer
yirabot 1.0.9
YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, o...
20 versions - Latest release: over 1 year ago - 78 downloads last month - 20 stars on GitHub - 1 maintainer
pyanimeplanet 1.0.0
A python module to extract information from anime-planet website
6 versions - Latest release: about 1 year ago - 18 downloads last month - 1 maintainer
async-tls-client 2.0.0
Asyncio TLS client with advanced fingerprinting capabilities
14 versions - Latest release: 5 months ago - 36.1 thousand downloads last month - 21 stars on GitHub - 1 maintainer
langchain-agentql 1.0.1
An integration package connecting AgentQL and LangChain
2 versions - Latest release: 7 months ago - 327 downloads last month - 12 stars on GitHub - 1 maintainer
bisque 0.3.0
Web scraping into structured Pydantic data models.
3 versions - Latest release: about 2 years ago - 47 downloads last month - 1 star on GitHub - 1 maintainer
vidurl 0.1.0
Extract video URLs from web pages and generate curl download commands
1 version - Latest release: 19 days ago - 137 downloads last month - 0 stars on GitHub - 1 maintainer
Top 3.8% on pypi.org
autoscraper 1.1.14 💰
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
16 versions - Latest release: about 3 years ago - 2 dependent packages - 36 dependent repositories - 3.2 thousand downloads last month - 6,896 stars on GitHub - 1 maintainer
pymultidictionary 1.3.2 💰
PyMultiDictionary is a Dictionary Module for Python 2 to get meanings, translations, synonyms and...
17 versions - Latest release: 6 months ago - 2 dependent repositories - 2.02 thousand downloads last month - 52 stars on GitHub - 1 maintainer
wrighter 0.1.2
Web scraping/browser automation framework built on Playwright
3 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 23 downloads last month - 3 stars on GitHub - 1 maintainer
outlook-calendar-sync 0.0.5
CLI interface to scrape outlook calendar events from the webapp and insert them to a google calen...
5 versions - Latest release: almost 3 years ago - 21 downloads last month - 2 stars on GitHub - 1 maintainer
askpablos-api 0.2.0
Professional Python client for the AskPablos proxy API service
2 versions - Latest release: 3 months ago - 41 downloads last month - 0 stars on GitHub - 1 maintainer
py-easy-scrape 0.1.6
A useful package for web scraping with Selenium
4 versions - Latest release: almost 2 years ago - 20 downloads last month - 1 star on GitHub - 1 maintainer
fala-parlamentar 0.1.1
Module for obtaining parliamentarians' speeches via web scraping, keeping the URL for the public...
2 versions - Latest release: almost 2 years ago - 16 downloads last month - 6 stars on GitHub - 1 maintainer
bibscrap 0.0.4
Semi-automated tools for systematic literature reviews.
13 versions - Latest release: almost 4 years ago - 1 dependent repository - 104 downloads last month - 3 stars on GitHub - 1 maintainer
crawl4ai-mcp-sse-stdio 1.1.0
MCP (Model Context Protocol) server for Crawl4AI - Universal web crawling and data extraction
10 versions - Latest release: 17 days ago - 2.55 thousand downloads last month - 3 stars on GitHub - 1 maintainer
bluemoss 1.0.0
bluemoss enables you to easily scrape websites.
18 versions - Latest release: almost 2 years ago - 136 downloads last month - 7 stars on GitHub - 1 maintainer
foot-fixtures 1.0.3
Utility to find schedule for a football club
4 versions - Latest release: about 11 years ago - 2 dependent repositories - 16 downloads last month - 1 star on GitHub - 1 maintainer
audiobooker 0.4.0
Audiobook scraper
10 versions - Latest release: almost 4 years ago - 4 dependent repositories - 58 downloads last month - 27 stars on GitHub - 2 maintainers
microwler 0.1.8
A micro-framework for asynchronous deep crawls and web scraping written in Python
10 versions - Latest release: over 4 years ago - 1 dependent repository - 35 downloads last month - 13 stars on GitHub - 1 maintainer
geniusdotpy 1.1.1
Python wrapper for Genius API
24 versions - Latest release: almost 2 years ago - 85 downloads last month - 6 stars on GitHub - 1 maintainer
cesail 0.2.3
A comprehensive web automation and DOM parsing platform with AI-powered agents
6 versions - Latest release: about 1 month ago - 310 downloads last month - 0 stars on GitHub - 1 maintainer
pyreqwest-impersonate 0.5.3
HTTP client that can impersonate web browsers, mimicking their headers and `TLS/JA3/JA4/HTTP2` fi...
25 versions - Latest release: about 1 year ago - 30.8 thousand downloads last month - 349 stars on GitHub - 1 maintainer
proxar 0.7.0
A Python client for fetching public proxies from multiple sources.
8 versions - Latest release: 3 months ago - 44 downloads last month - 0 stars on GitHub - 1 maintainer
scrapy-beautifulsoup 0.0.2
Simple Scrapy middleware to process non-well-formed HTML with BeautifulSoup
2 versions - Latest release: about 9 years ago - 1 dependent repository - 62 downloads last month - 21 stars on GitHub - 1 maintainer
raccy 2.0.0
Web Scraping Library Based on Selenium
11 versions - Latest release: almost 4 years ago - 1 dependent repository - 40 downloads last month - 1 star on GitHub - 1 maintainer
Top 9.5% on pypi.org
pylab-utils 0.5
python utility tools
2 versions - Latest release: about 4 years ago - 1 dependent repository - 21 downloads last month - 58,160 stars on GitHub - 1 maintainer
betterhtmlchunking 0.9.3
A Python library for intelligent HTML segmentation and ROI extraction. It builds a DOM tree from ...
4 versions - Latest release: 4 months ago - 126 downloads last month - 44 stars on GitHub - 1 maintainer
cached-historical-data-fetcher 0.2.31 💰
Python utility for fetching any historical data using caching.
33 versions - Latest release: 3 months ago - 234 downloads last month - 3 stars on GitHub - 1 maintainer
scrapy-ua-rotator 1.0.1
Flexible and modern User-Agent rotator middleware for Scrapy, supporting Faker, fake-useragent, a...
2 versions - Latest release: 3 months ago - 205 downloads last month - 2 stars on GitHub
ghostpeek 1.1.1
A stealthy domain reconnaissance tool for subdomain discovery and intelligence gathering
3 versions - Latest release: 30 days ago - 425 downloads last month - 0 stars on GitHub - 1 maintainer
top-github-scraper 0.1.1 💰
Scrape top GitHub repositories and users based on keyword
15 versions - Latest release: over 4 years ago - 1 dependent repository - 35 downloads last month - 88 stars on GitHub - 1 maintainer
campbells 0.3.0
A condensed web scraping library.
8 versions - Latest release: about 2 years ago - 1 dependent repository - 68 downloads last month - 0 stars on GitHub - 1 maintainer
seo-sentinel 1.0.2
SEO Sentinel is a comprehensive automated SEO auditing tool that crawls websites, detects SEO iss...
4 versions - Latest release: 22 days ago - 108 downloads last month - 30 stars on GitHub - 1 maintainer
firecrawl-simple-client 0.1.3
Python client for Firecrawl-Simple
4 versions - Latest release: 11 months ago - 17 downloads last month - 1 star on GitHub - 1 maintainer
cambridge 3.14.1
cambridge is a terminal version of Cambridge Dictionary.
119 versions - Latest release: 3 months ago - 2 dependent repositories - 947 downloads last month - 58 stars on GitHub - 1 maintainer
ioweb 0.0.29
Web Scraping Framework
26 versions - Latest release: almost 5 years ago - 1 dependent repository - 28 downloads last month - 34 stars on GitHub - 1 maintainer
pinscrape 5.0.0
Pinterest | a simple data scraper for pinterest
20 versions - Latest release: about 1 month ago - 1 dependent repository - 720 downloads last month - 111 stars on GitHub - 1 maintainer
spiderchef 0.1.1
A recipe based web scraping tool.
5 versions - Latest release: 5 months ago - 34 downloads last month - 0 stars on GitHub - 1 maintainer
tarzi 0.1.6
Rust-native lite search for AI applications
9 versions - Latest release: 14 days ago - 488 downloads last month - 2 stars on GitHub - 1 maintainer
torvend 0.0.1
A framework for public torrent vendor scrapers
2 versions - Latest release: almost 8 years ago - 1 dependent repository - 8 downloads last month - 1 star on GitHub - 1 maintainer
rtv-downloader 2.0.8
Command-line program to download videos from various sites.
5 versions - Latest release: over 6 years ago - 1 dependent repository - 10 downloads last month - 4 stars on GitHub - 1 maintainer