An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

npmjs.org "data-extraction" keyword

View the packages on the npmjs.org package registry that are tagged with the "data-extraction" keyword.

matter-yaml 1.1.0
YAML front-matter parser and combiner. Minimal and perfect
3 versions - Latest release: about 1 month ago - 4 downloads last month - 0 stars on GitHub - 1 maintainer
yolo-scraper 1.0.1
A simple way to structure your web scraper.
6 versions - Latest release: about 4 years ago - 1 dependent package - 3 dependent repositories - 63 downloads last month - 7 stars on GitHub - 1 maintainer
google-reviews-scraping 1.0.6
A nodejs tool for scraping Google Reviews using Puppeteer.
6 versions - Latest release: over 1 year ago - 28 downloads last month - 7 stars on GitHub - 1 maintainer
Top 1.9% on npmjs.org
pdfreader 3.0.7 💰
Read text and parse tables from PDF files. Supports tabular data with automatic column detection,...
58 versions - Latest release: 3 months ago - 30 dependent packages - 226 dependent repositories - 164 thousand downloads last month - 672 stars on GitHub - 1 maintainer
agentql-mcp 1.0.0
Model Context Protocol (MCP) server that integrates AgentQL data extraction capabilities.
3 versions - Latest release: about 1 month ago - 699 downloads last month - 51 stars on GitHub - 1 maintainer
@openlawnz/openlawnz-parsers 1.0.2
OpenLaw NZ Parsers used to extract information from New Zealand case law PDF files.
3 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 13 downloads last month - 4 stars on GitHub - 2 maintainers
cwhite-pdfreader 0.0.0-development 💰
Read text and parse tables from PDF files. Supports tabular data with automatic column detection,...
1 version - Latest release: 4 months ago - 5 downloads last month - 672 stars on GitHub - 1 maintainer
parse-json-to-csv 1.0.0
This is a simple JavaScript library that converts a JSON object to a CSV file.
1 version - Latest release: 11 months ago - 19 downloads last month - 1 maintainer
financial-data-extractors-pratiksha 1.0.9
Utilities for extracting financial data from various economic calendar websites
1 version - Latest release: 2 days ago - 1 maintainer
tpl-parser 1.0.2
A TypeScript library for parsing Photoshop TPL (Tool Preset) files and converting the data into a...
3 versions - Latest release: 8 months ago - 17 downloads last month - 1 star on GitHub - 1 maintainer
Top 9.8% on npmjs.org
line-segmentation-algorithm-to-gcp-vision 1.0.2
Line segmentation algorithm for Google Vision API.
3 versions - Latest release: almost 6 years ago - 1 dependent package - 1 dependent repository - 153 downloads last month - 97 stars on GitHub - 1 maintainer
js-harvester 0.2.9
Harvester is a lightweight and highly optimized library for extracting data from the DOM tree. It...
9 versions - Latest release: 4 days ago - 383 downloads last month - 11 stars on GitHub - 1 maintainer
exceltables4js 3.1.0
Converts an Excel table object to JSON.
14 versions - Latest release: 2 months ago - 41 downloads last month - 0 stars on GitHub - 1 maintainer
matter-toml 1.0.0
TOML front-matter parser and combiner. Minimal and perfect
2 versions - Latest release: about 1 month ago - 22 downloads last month - 0 stars on GitHub - 1 maintainer
@mseep/puremd-mcp 1.0.3
Model Context Protocol (MCP) server for pure.md, the markdown delivery network for LLMs
1 version - Latest release: 8 days ago - 43 downloads last month - 10 stars on GitHub - 1 maintainer
intelliextract 0.0.2 💰
Using LLMs to search and extract structured data from unstructured data
2 versions - Latest release: over 1 year ago - 1 dependent repository - 12 downloads last month - 1 star on GitHub - 1 maintainer
@pseem/tavily-mcp 0.1.4
MCP server for advanced web search using Tavily
1 version - Latest release: 9 days ago - 44 downloads last month - 1 maintainer
@pratiksha90/financial-data-extractors 1.0.8
Utilities for extracting financial data from various economic calendar websites
9 versions - Latest release: 6 days ago - 367 downloads last month - 1 maintainer
@mseep/tavily-mcp 0.1.4
MCP server for advanced web search using Tavily
1 version - Latest release: 7 days ago - 43 downloads last month - 1 maintainer
@mseep/mcp-omnisearch 0.0.5
MCP server for integrating Omnisearch with LLMs
1 version - Latest release: 8 days ago - 38 downloads last month - 1 maintainer
webmiddle-component-jsonselect-to-virtual 0.4.0 deprecated
> Similar to the [CheerioToVirtual](https://github.com/webmiddle/webmiddle/tree/master/packages/w...
2 versions - Latest release: over 6 years ago - 2 dependent packages - 1 dependent repository - 15 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-manager-cookie 0.5.1
> Wrapper on top of the tough-cookie library, acts as a cookie jar.
8 versions - Latest release: over 6 years ago - 2 dependent packages - 1 dependent repository - 42 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-cheerio-to-virtual 0.3.0 deprecated
> Service that converts a resource, whose content is parsed by the [Cheerio](https://github.com/...
5 versions - Latest release: about 7 years ago - 4 dependent packages - 1 dependent repository - 21 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-jsonselect-to-json 0.5.1
> Component that transforms a JSON resource into another JSON resource by using JSONSelect.
4 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 27 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-browser 0.3.0 deprecated
> Similar to the HttpRequest service, but it uses [Headless Chrome](https://developers.google.com...
5 versions - Latest release: about 7 years ago - 3 dependent packages - 1 dependent repository - 19 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-jsonselect-to-json 0.3.0 deprecated
> Service that converts a resource, whose content is parsed by the [JSONSelect](https://github.c...
5 versions - Latest release: about 7 years ago - 3 dependent packages - 1 dependent repository - 21 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-browser 0.5.1
> Similar to the HttpRequest component, but it uses Puppeteer to fetch html pages.
4 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 18 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-jsonselect-to-virtual 0.3.0 deprecated
> Similar to the [CheerioToVirtual](https://github.com/webmiddle/webmiddle/tree/master/packages/w...
5 versions - Latest release: about 7 years ago - 4 dependent packages - 1 dependent repository - 18 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle 0.5.1
This module should be installed in any webmiddle application, as it provides the machinery to par...
9 versions - Latest release: over 6 years ago - 27 dependent packages - 1 dependent repository - 46 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-cheerio-to-json 0.5.1
> Component that transforms a HTML or XML resource into a JSON resource by using Cheerio.
4 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 31 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-arraymap 0.3.0 deprecated
> Maps an array into an array of resources by executing a callback on each item.
5 versions - Latest release: about 7 years ago - 3 dependent packages - 1 dependent repository - 21 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-parallel 0.3.0 deprecated
> Its purpose is similar to Pipe, i.e. the execution of services, but they are executed concurren...
5 versions - Latest release: about 7 years ago - 3 dependent packages - 1 dependent repository - 21 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-http-request 0.5.1
> Component built on top of the request library, it is used to perform http requests.
4 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 16 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-arraymap 0.4.0 deprecated
> Maps an array into an array of resources by executing a callback on each item.
2 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 18 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-pipe 0.3.0 deprecated
> Executes a sequence of services, piping their results (resources) to the subsequent services in...
5 versions - Latest release: about 7 years ago - 6 dependent packages - 1 dependent repository - 19 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-client 0.5.1
> Connects to a webmiddle-server, allowing remote services execution.
6 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 21 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-cheerio-to-json 0.3.0 deprecated
> Service that converts a resource, whose content is parsed by the [Cheerio](https://github.com/...
5 versions - Latest release: about 7 years ago - 3 dependent packages - 1 dependent repository - 20 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-cheerio-to-virtual 0.4.0 deprecated
> Component that converts a resource, whose content is parsed by the [Cheerio](https://github.com...
2 versions - Latest release: over 6 years ago - 2 dependent packages - 1 dependent repository - 12 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-resume 0.5.1
> A component that makes a task resumable by caching the result.
4 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 29 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-virtual-to-json 0.4.0 deprecated
> Converts a virtual resource into a JSON resource.
2 versions - Latest release: over 6 years ago - 3 dependent packages - 1 dependent repository - 17 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-resume 0.3.0 deprecated
> A service that makes its children **resumable** by caching the result.
4 versions - Latest release: about 7 years ago - 3 dependent packages - 1 dependent repository - 13 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-component-parallel 0.5.1
> Executes multiple tasks concurrently.
4 versions - Latest release: over 6 years ago - 1 dependent package - 1 dependent repository - 25 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-virtual-to-json 0.3.0 deprecated
> Converts a virtual resource into a JSON resource.
5 versions - Latest release: about 7 years ago - 5 dependent packages - 1 dependent repository - 24 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-server 0.5.1
> Easily turns webmiddle applications into REST APIs, allowing remote access via HTTP or WebSocket.
6 versions - Latest release: over 6 years ago - 2 dependent packages - 1 dependent repository - 21 downloads last month - 14 stars on GitHub - 1 maintainer
webmiddle-service-http-request 0.3.0 deprecated
> Service built on top of the [request](https://github.com/request/request) library, it is used t...
5 versions - Latest release: about 7 years ago - 3 dependent packages - 1 dependent repository - 17 downloads last month - 14 stars on GitHub - 1 maintainer
@typestream/core 0.0.8
Core library to be used in TypeStream projects
8 versions - Latest release: about 3 years ago - 3 dependent packages - 1 dependent repository - 32 downloads last month - 53 stars on GitHub - 1 maintainer
@typestream/sdk 0.0.11
SDK for the rapid development of data transformation projects
11 versions - Latest release: about 3 years ago - 2 dependent packages - 39 downloads last month - 53 stars on GitHub - 1 maintainer
@typestream/core-protocol 0.0.5
Internal package used by TypeStream to communicate between the SDK and project code
5 versions - Latest release: about 3 years ago - 4 dependent packages - 1 dependent repository - 20 downloads last month - 53 stars on GitHub - 1 maintainer
tyst 0.0.6
CLI that warns people about using tyst (TypeStream CLI) outside of TypeStream projects
6 versions - Latest release: about 3 years ago - 27 downloads last month - 53 stars on GitHub - 1 maintainer
create-typestream 0.0.8
Package to create a TypeStream project
8 versions - Latest release: about 3 years ago - 1 dependent package - 35 downloads last month - 53 stars on GitHub - 1 maintainer
matter-json 1.0.0
JSON front-matter parser and combiner. Minimal and perfect
2 versions - Latest release: about 1 month ago - 35 downloads last month - 1 maintainer
web-scrapify 1.0.10
A simple web scraper that can scrape product details from various e-commerce platforms.
11 versions - Latest release: 4 months ago - 83 downloads last month - 0 stars on GitHub - 1 maintainer
easy-csv-parser 1.0.8
easy-csv-parser simplifies CSV data parsing in Node.js. Fetch, extract headers, and convert CSV f...
6 versions - Latest release: over 1 year ago - 36 downloads last month - 1 star on GitHub - 1 maintainer
ofx-data-extractor 1.4.7
A module written in TypeScript that provides a utility to extract data from an OFX file in Node.j...
19 versions - Latest release: 2 months ago - 8.07 thousand downloads last month - 16 stars on GitHub - 1 maintainer
agentql-mcp-server 0.0.2 unpublished
Model Context Protocol server to work with AgentQL
2 versions - Latest release: about 1 month ago - 59 downloads last month - 1 maintainer
parsera-ts 1.0.1
Official TypeScript SDK for Parsera.org API - Extract structured data from any webpage
2 versions - Latest release: 5 months ago - 6 downloads last month - 1 maintainer
tavily-mcp 0.1.4
MCP server for advanced web search using Tavily
5 versions - Latest release: 2 months ago - 13.2 thousand downloads last month - 1 maintainer
exfilms 2.0.1
A command-line interface tool to extract, filter, and standardise mass spectrometry data.
19 versions - Latest release: about 1 month ago - 183 downloads last month - 1 star on GitHub - 1 maintainer
webmiddle-component-pipe 0.5.1
> Executes a sequence of tasks, piping the result of a task to the next task.
4 versions - Latest release: over 6 years ago - 4 dependent packages - 1 dependent repository - 16 downloads last month - 14 stars on GitHub - 1 maintainer
mcp-agentql 0.0.1 unpublished
Model Context Protocol server to work with AgentQL
1 version - Latest release: about 2 months ago - 1 maintainer
npm-scrapper-faris 1.0.2
A lightweight package to scrape and extract metadata for npm packages.
3 versions - Latest release: 4 months ago - 23 downloads last month - 1 maintainer
mcp-omnisearch 0.0.5
MCP server for integrating Omnisearch with LLMs
5 versions - Latest release: 17 days ago - 362 downloads last month - 1 maintainer
hred 1.5.1
17 versions - Latest release: 7 months ago - 1 dependent package - 56 downloads last month - 72 stars on GitHub - 1 maintainer
@mcpflow.io/mcp-tavily-mcp- 0.1.4
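The Tavily Search MCP service is an advanced web search tool compatible with the Model Context Protocol (MCP) that lets AI models such as Claude directly access real-time information on the internet. The service provides two core...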
1 version - Latest release: 22 days ago - 53 downloads last month - 1 maintainer
social-profile-scraping 1.0.6
A lightweight library for scraping public social media profiles, providing profile information su...
6 versions - Latest release: 3 months ago - 24 downloads last month - 0 stars on GitHub - 1 maintainer
tavily_cursor_mcp 0.1.2
MCP server for advanced web search using Tavily on Cursor 0.45
1 version - Latest release: about 1 month ago - 1 maintainer
xscrape 2.0.0
A flexible and powerful library designed to extract and transform data from HTML documents using ...
7 versions - Latest release: 2 months ago - 114 downloads last month - 1 maintainer
lusail 0.8.1
JavaScript implementation of Lusail, a domain-specific language for extracting structured data fr...
12 versions - Latest release: almost 2 years ago - 2 downloads last month - 1 star on GitHub - 1 maintainer
@promptapi/scraper-pkg 0.1.6
6 versions - Latest release: over 4 years ago - 1 dependent package - 2 downloads last month - 1 star on GitHub - 1 maintainer
dex8-sdk 2.14.0 deprecated
DEX8 SDK is a library which helps developers to create DEX8 crawler, scraper tasks or automated s...
72 versions - Latest release: over 2 years ago - 1 dependent package - 1 dependent repository - 166 downloads last month - 0 stars on GitHub - 1 maintainer
csv-to-js-object 1.0.2 unpublished
csv-to-js-object simplifies CSV data parsing in Node.js. Fetch, extract headers, and convert CSV ...
3 versions - Latest release: over 1 year ago
ofx-ts 1.0.6 unpublished
This is a `Node.js` module written in `TypeScript` that provides a utility for extracting data fr...
3 versions - Latest release: almost 2 years ago - 121 downloads last month - 0 stars on GitHub - 1 maintainer