pypi.org "data extraction" keyword
View the packages on the pypi.org package registry that are tagged with the "data extraction" keyword.
universal-scraper 1.3.0
AI-powered web scraping with customizable field extraction
4 versions - Latest release: about 21 hours ago - 332 downloads last month - 0 stars on GitHub - 1 maintainer
browsernative 1.0.0
Lightning-fast web scraping Python SDK - 11x faster than traditional scrapers
1 version - Latest release: 3 months ago - 19 downloads last month - 0 stars on GitHub - 1 maintainer
pyehr 0.1.0
A Python package for extracting data from large EHR datasets.
1 version - Latest release: 3 months ago - 13 downloads last month - 0 stars on GitHub - 1 maintainer
sour-cereal 1.0.9
An abstraction of data source for extraction applications usage
10 versions - Latest release: almost 5 years ago - 1 dependent repositories - 15 downloads last month - 0 stars on GitHub - 1 maintainer
adetfs 1.2.0
Automated Data Extraction Tool for Fitbit Server
2 versions - Latest release: over 2 years ago - 41 downloads last month - 0 stars on GitHub - 1 maintainer
quipucamayoc 0.1.2
Tools to extract information from digitized historical documents
2 versions - Latest release: over 3 years ago - 1 dependent repositories - 8 downloads last month - 30 stars on GitHub - 1 maintainer
brightdata-sdk 1.1.2
Python SDK for Bright Data Web Scraping and SERP APIs
11 versions - Latest release: 2 days ago - 1.95 thousand downloads last month - 14 stars on GitHub - 1 maintainer
space-packet-parser 5.0.1
A CCSDS telemetry packet decoding library based on the XTCE packet format description standard.
23 versions - Latest release: 11 months ago - 8.76 thousand downloads last month - 31 stars on GitHub - 1 maintainer
web-extract-data 0.1.0
A Python package for extracting structured data from websites
1 version - Latest release: 3 months ago - 49 downloads last month - 1 stars on GitHub - 1 maintainer
mdm 0.9.2
A middleware data manager for at least MOOS-IvP and ROS29 versions - Latest release: 11 months ago - 1 dependent repositories - 170 downloads last month - 4 stars on GitHub - 1 maintainer
scrapery 0.1.2
Scrapery: A fast, lightweight library to scrape HTML, XML, and JSON using XPath, CSS selectors, a...
4 versions - Latest release: 7 days ago - 387 downloads last month - 1 maintainer
gi-scraper 0.4.6
Google Image Scraper.
16 versions - Latest release: over 1 year ago - 2 dependent repositories - 83 downloads last month - 1 maintainer
spider-mcp-client 0.1.5
Official Python client for Spider MCP web scraping API
6 versions - Latest release: 4 days ago - 542 downloads last month - 1 maintainer
novalad 0.1.15
Novalad: AI-powered platform for transforming unstructured documents like PDFs and PowerPoints in...
16 versions - Latest release: 5 days ago - 43 downloads last month - 17 stars on GitHub - 1 maintainer
wikitextprocessor 0.4.96
Parser and expander for Wikipedia, Wiktionary etc. dump files, with Lua execution support
8 versions - Latest release: about 3 years ago - 4 dependent repositories - 3.16 thousand downloads last month - 94 stars on GitHub - 1 maintainer - Top 9.5% on pypi.org
dryjq 1.9.1
Drastically Reduced YAML / JSON Query
23 versions - Latest release: almost 2 years ago - 98 downloads last month - 1 stars on gitlab.com - 1 maintainer
flipkart-scrapping 0.0.16
Flipkart Web Scraping Package
16 versions - Latest release: almost 4 years ago - 1 dependent repositories - 21 downloads last month - 0 stars on GitHub - 1 maintainer
auxn-agent 0.2.0
A web scraping and data extraction tool.
1 version - Latest release: 7 months ago - 12 downloads last month - 1 maintainer
scielo-extractor 0.0.6
SciELO data extractor
4 versions - Latest release: over 1 year ago - 26 downloads last month - 1 maintainer
pdf2csv 0.2.2
A python library and CLI tool to convert PDF files to CSV files.
5 versions - Latest release: 8 months ago - 265 downloads last month - 30 stars on GitHub - 1 maintainer
python-dataservice 0.14.0
Lightweight async data gathering for Python
40 versions - Latest release: 10 months ago - 110 downloads last month - 1 maintainer
hotpdf 0.5.2
Fast PDF Data Extraction library
27 versions - Latest release: over 1 year ago - 617 downloads last month - 196 stars on GitHub - 1 maintainer
dataextractiontiket 0.0.1
Air Travel Data Extraction Python Module
1 version - Latest release: about 4 years ago - 1 dependent repositories - 7 downloads last month - 1 maintainer
sculpt 0.1.35
Sculpt: Structuring unstructured data with LLMs
3 versions - Latest release: 5 months ago - 1.04 thousand downloads last month - 34 stars on GitHub - 2 maintainers
xtweet 1.0.2
A library that lets you interact efficiently with the Twitter API.
3 versions - Latest release: about 2 years ago - 11 downloads last month - 3 stars on GitHub - 1 maintainer
sculptor 0.2.0
Sculptor: Structuring unstructured data with LLMs
7 versions - Latest release: 5 months ago - 32 downloads last month - 33 stars on GitHub - 1 maintainer
langchain-zenrows 0.1.0
A LangChain integration tool that provides reliable web scraping capabilities at any scale using ...
1 version - Latest release: 3 months ago - 14 downloads last month - 1 maintainer
pricetag 1.0.0
A pure-Python library for extracting price and currency information from unstructured text
1 version - Latest release: 16 days ago - 0 stars on GitHub
eagle-eye-scraper 1.3.5
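eagle-eye-scraper is an efficient Python data collection framework that supports distributed deployment and is suited to complex pages and large-scale data collection.
5 versions - Latest release: about 2 months ago - 326 downloads last month - 1 maintainer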
webchameleon-tool 1.3.0
A powerful web scraping and API reversal tool
1 version - Latest release: 5 months ago - 7 downloads last month - 0 stars on GitHub - 1 maintainer
srilanka-lottery 0.1.2
Scrape Sri Lanka lottery results from NLB and DLB websites with this Python package
4 versions - Latest release: 21 days ago - 29 downloads last month - 2 stars on GitHub - 1 maintainer
pydocuprocess 0.0.2
Python library to extract the information from documents
2 versions - Latest release: about 2 years ago - 13 downloads last month - 1 maintainer
indo-scraper 1.0.0
Python library for easily scraping Indonesian websites
1 version - Latest release: about 1 month ago - 0 stars on GitHub - 1 maintainer
par-scrape 0.8.0
A versatile web scraping tool with options for Selenium or Playwright, featuring OpenAI-powered d...
18 versions - Latest release: 30 days ago - 131 downloads last month - 10 stars on GitHub - 1 maintainer
easyscrapper 1.0.2
easyscrapper is a fast, lightweight Python package and CLI tool that lets developers, data scient...
3 versions - Latest release: 2 months ago - 12 downloads last month - 0 stars on GitHub - 1 maintainer
easylenium 1.0.0
A powerful web scraping tool for automating data extraction
1 version - Latest release: over 2 years ago - 8 downloads last month - 0 stars on GitHub - 1 maintainer
Related Keywords
web scraping (16)
python (9)
automation (6)
nlp (4)
structured data (4)
text processing (3)
ai (3)
pdf (3)
html parser (3)
llm (3)
unstructured data (3)
data scraping (3)
data transformation (3)
html parsing (3)
scraping (3)
selenium (2)
api (2)
large language model (2)
data sculpting (2)
information extraction (2)
AI (2)
natural language processing (2)
pipeline (2)
text to structured data (2)
web crawler (2)
web automation (2)
scraping framework (2)
api client (2)
json (2)
csv (2)
API (2)
text-search (1)
text-extraction (1)
pdfquery (1)
yaml (1)
data analysis (1)
data science (1)
twitter (1)
social media (1)
media download (1)
data-extraction (1)
scrapping (1)
query (1)
extract (1)
convert (1)
merge (1)
web scrapping (1)
flipkart (1)
beautiful soup (1)
requests (1)
bibliographic data (1)
scientific literature (1)
excel (1)
pdf2csv (1)
docling (1)
csv-export (1)
pandas (1)
pdf-converter (1)
async (1)
data gathering (1)
web crawling (1)
crawling (1)
data (1)
text extraction (1)
hotpdf (1)
pdfminer (1)
srilanka (1)
web-scraping (1)
pdf extraction python (1)
invoice processing (1)
indonesia (1)
website (1)
anthropic (1)
groq (1)
llamacpp (1)
ollama (1)
openai (1)
openrouter (1)
playwright (1)
xai (1)
easyscrapper (1)
RAG (1)
Text Chuncking (1)
Selenium (1)
web scraping tool (1)
data scraping library (1)
Python (1)
website scraping (1)
web data extraction (1)
automated data collection (1)
data scraping tool (1)
data scraping automation (1)
web scraping library (1)
library (1)
media-download (1)
social-media (1)
twitter-api (1)
langchain (1)
zenrows (1)
ml (1)