An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

npmjs.org "web-data-extraction" keyword

View the packages on the npmjs.org package registry that are tagged with the "web-data-extraction" keyword.

gnews-scraper 1.2.3
GNewsScraper is a TypeScript package that scrapes article data from Google News based on a keywor...
5 versions - Latest release: over 2 years ago - 1 dependent repository - 24 downloads last month - 13 stars on GitHub - 1 maintainer
firecrawl 4.8.1
JavaScript SDK for Firecrawl API
112 versions - Latest release: 4 days ago - 22 thousand downloads last month - 68,253 stars on GitHub - 2 maintainers
@mendable/firecrawl 4.8.1
JavaScript SDK for Firecrawl API
52 versions - Latest release: 4 days ago - 1.43 thousand downloads last month - 68,253 stars on GitHub - 4 maintainers
@lightfeed/extractor 0.2.1
Use LLMs to robustly extract and enrich structured data from HTML and markdown
6 versions - Latest release: 2 months ago - 153 downloads last month - 46 stars on GitHub - 1 maintainer
@mendable/firecrawl-js 4.8.0
JavaScript SDK for Firecrawl API
187 versions - Latest release: 6 days ago - 874 thousand downloads last month - 68,253 stars on GitHub - 4 maintainers
@lightfeed/sdk 0.1.7
Lightfeed SDK for Node.js
1 version - Latest release: 6 months ago - 9 downloads last month - 5 stars on GitHub - 1 maintainer
js-harvester 0.3.14
Harvester is a lightweight and highly optimized javascript library for extracting data from the D...
27 versions - Latest release: 7 months ago - 101 downloads last month - 23 stars on GitHub - 1 maintainer
extract-site-metadata 1.3.1
Metadata extractor for the sprawling web
11 versions - Latest release: over 3 years ago - 1 dependent package - 2 dependent repositories - 31 downloads last month - 0 stars on GitHub - 1 maintainer
lightfeed 0.1.6 (deprecated)
Lightfeed API Client for Node.js
6 versions - Latest release: 6 months ago - 238 downloads last month - 5 stars on GitHub - 1 maintainer
lightfeed-extract 0.1.5 (deprecated)
Use LLMs to robustly extract and enrich structured data from HTML and markdown
5 versions - Latest release: 7 months ago - 357 downloads last month - 37 stars on GitHub - 1 maintainer
Related Keywords
crawler 8, web-scraping 8, ai-agents 7, llm 7, webscraping 7, data-extraction 6, html-to-markdown 5, markdown 5, scraping 5, structured-data 5, web 5, web-data 5, web-extraction 4, scraper 4, data-engineering 4, data-pipeline 4, etl 4, llm-extraction 4, llm-scraper 4, rag 4, api 3, mendable 3, sdk 3, ai 3, ai-crawler 3, ai-scraping 3, ai-search 3, web-crawler 3, web-scraper 3, web-search 3, extraction 3, html 3, firecrawl 3, gemini 2, openai 2, article-extractor 2, google-gemini 2, html-parser 2, nlp 2, web-data-management 2, vector-database 2, knowledge-base 2, extract 2, embedding-search 2, data-integration 2, business-intelligence 2, lightfeed 2, web-automation 2, rss-feed 2, google-news-scraper 1, pattern-based 1, visual-scraping-template 1, declarative-scraping 1, fuzzy-scraping 1, fuzzy 1, approximate-scraping 1, approximate 1, resilient-scraping 1, resilient 1, flexible-scraping 1, flexible 1, structure-agnostic-scraping 1, semantic-scraping 1, tree-template-scraping 1, tree-template 1, open-graph-protocol 1, metadata-extraction 1, schema.org 1, opengraph 1, metadata 1, seo 1, content-extraction 1, text-extraction 1, attribute-extraction 1, nested-data-extraction 1, hierarchical-data-extraction 1, frontend-scraping 1, dom-manipulation 1, dom-traversal 1, nodejs 1, node-js 1, nodejs-scraping 1, browser-scraping 1, npm-package 1, javascript 1, javascript-scraping 1, visual-template 1, indentation-based-template 1, string-template 1, string-template-scraping 1, pseudo-tree-template 1, google-news 1, gnews-api 1, data-scraping 1, article-extraction 1, google crawler 1, news crawler 1, web crawler 1, google 1, news 1