An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

npmjs.org "document-processing" keyword

View the packages on the npmjs.org package registry that are tagged with the "document-processing" keyword.

@project-lakechain/sentence-transformers 0.10.0
Creates embeddings from text-oriented documents using Sentence Transformers models.
7 versions - Latest release: over 1 year ago - 12 downloads last month - 185 stars on GitHub - 1 maintainer
@project-lakechain/ollama-embedding-processor 0.10.0
Creates embeddings from documents using Ollama models.
3 versions - Latest release: over 1 year ago - 11 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/opensearch-domain 0.10.0
Creates an OpenSearch domain with Cognito authentication.
7 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
universal-documents-converter 1.0.1
Universal MCP Server for Multi-Rendering PDF Quality Assurance System with AI-powered optimization
1 version - Latest release: 5 months ago - 12 downloads last month - 1 maintainer
@nyazkhan/react-pdf-viewer 1.1.1
A comprehensive React TypeScript component library for viewing and interacting with PDF files usi...
5 versions - Latest release: 6 months ago - 40 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/s3-storage-connector 0.10.0
Stores documents and their metadata in an S3 Bucket.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@abhi-arya1/mastra-minirag 1.0.1
Minimal recursive text chunking functionality extracted from @mastra/rag for edge deployments
2 versions - Latest release: 3 months ago - 1 maintainer
@equus-ai/sdk 1.1.0
Official JavaScript/TypeScript SDK for EQUUS AI Infrastructure Platform - Document Processing, RA...
2 versions - Latest release: 9 days ago - 147 downloads last month
@base64ai/n8n-nodes-base64ai 1.0.0
Official Base64.ai community node for n8n
2 versions - Latest release: 8 days ago - 130 downloads last month
@okrapdf/cli 0.2.11
OkraPDF command-line interface for PDF extraction and document chat
22 versions - Latest release: 4 days ago
@project-lakechain/opensearch-saved-object 0.10.0
Uploads a saved object to OpenSearch using AWS CDK.
7 versions - Latest release: over 1 year ago - 9 downloads last month - 186 stars on GitHub - 1 maintainer
Top 1.9% on npmjs.org
stopword 3.1.5
A module for node.js and the browser that takes in text and returns text that is stripped of stop...
76 versions - Latest release: 7 months ago - 96 dependent packages - 582 dependent repositories - 371 thousand downloads last month - 222 stars on GitHub - 2 maintainers
ppu-paddle-ocr 3.6.0
Blazing-fast and lightweight PaddleOCR library for Node.js and Bun. Perform accurate text detecti...
28 versions - Latest release: 23 days ago - 641 downloads last month - 19 stars on GitHub - 1 maintainer
@aidalinfo/pdf-processor 1.0.18
Powerful PDF data extraction library powered by AI vision models. Transform PDFs into structured,...
18 versions - Latest release: 5 months ago - 195 downloads last month - 6 stars on GitHub - 1 maintainer
@project-lakechain/sentence-text-splitter 0.10.0
Transforms text into chunks of tokens using a sentence text splitter.
7 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
expo-pdf-text-extract 1.0.0
Native PDF text extraction for React Native and Expo. Extract text content from PDF files using p...
1 version - Latest release: 10 days ago
@project-lakechain/pinecone-storage-connector 0.10.0
A data store connector for Pinecone.
7 versions - Latest release: over 1 year ago - 8 downloads last month - 186 stars on GitHub - 1 maintainer
@easyrag/sdk 0.1.1
Official JavaScript SDK for EasyRAG.com API
2 versions - Latest release: about 1 month ago - 17 downloads last month - 1 maintainer
@project-lakechain/sdk 0.10.0
An SDK providing helpers to create Lakechain middlewares in TypeScript.
9 versions - Latest release: over 1 year ago - 13 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/sqs-storage-connector 0.10.0
Stores documents and their metadata in an SQS queue.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/neo4j-storage-connector 0.10.0
A data store connector for Neo4j.
3 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/opensearch-storage-connector 0.10.0
Stores document metadata in an OpenSearch index.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
intelligent-text-chunking 1.0.3
An intelligent text chunking library that respects document structure and semantic boundaries
3 versions - Latest release: 4 months ago - 5 downloads last month - 1 maintainer
@project-lakechain/ffmpeg-processor 0.10.0
Processes media documents using FFMPEG.
5 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-solar 0.3.54
Solar LLM and Embeddings nodes for n8n
78 versions - Latest release: 3 months ago - 2.83 thousand downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/sharp-image-transform 0.10.0
A middleware transforming images using the sharp library.
7 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/layers 0.10.0
Lambda layer library used by Project Lakechain.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
@mastra/rag 2.0.0
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilit...
564 versions - Latest release: 4 days ago - 98.1 thousand downloads last month - 18,513 stars on GitHub - 11 maintainers
lekana-gemini 1.0.0
A shared TypeScript library for Lekana microservices that provides AI-powered document processing...
1 version - Latest release: about 1 month ago - 1 maintainer
@promptbook/pdf 0.104.0
Promptbook: Turn your company's scattered knowledge into AI ready books
422 versions - Latest release: 23 days ago - 5.03 thousand downloads last month - 138 stars on GitHub - 1 maintainer
n8n-nodes-unicraft 2.1.8
UniCraft N8N custom nodes - Unified AI Model Router with Multi-Modal Support by CloudCraft Labs f...
9 versions - Latest release: 4 months ago - 98 downloads last month - 1 maintainer
@trsdn/mistraldocai-mcp-server 1.0.4
MCP server for document-to-Markdown conversion using Mistral AI OCR
5 versions - Latest release: 5 months ago - 59 downloads last month - 2 stars on GitHub - 1 maintainer
@scaly/dazzle 0.1.0
DSSSL processor - CLI wrapper for @scaly/openjade
1 version - Latest release: about 1 month ago - 1 maintainer
@majkapp/majk-chat-document-tools 1.0.26
Document processing tools for majk chat - PDF, Excel, Word, PowerPoint parsing and analysis
5 versions - Latest release: about 2 months ago - 340 downloads last month - 1 maintainer
@mixpeek/n8n-nodes-mixpeek 1.0.5
n8n community node for Mixpeek - multimodal data processing and semantic search API
6 versions - Latest release: 12 days ago
invoicify-json-craft 1.0.0
AI-powered invoice to JSON converter using Mistral AI with dynamic field detection and master sch...
1 version - Latest release: 7 months ago - 4 downloads last month - 1 maintainer
@project-lakechain/condition 0.10.0
A middleware allowing to express complex conditions in pipelines.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 185 stars on GitHub - 1 maintainer
treechunk 1.1.0
Hierarchical markdown chunking for RAG systems with AI-powered context summarization
5 versions - Latest release: 6 months ago - 8 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/service-linked-role 0.10.0
Creates a service linked role for a given service in an idempotent way.
3 versions - Latest release: over 1 year ago - 6 downloads last month - 186 stars on GitHub - 1 maintainer
@scaly/openjade 0.2.1
TypeScript port of OpenJade DSSSL engine
3 versions - Latest release: about 1 month ago - 1 maintainer
@project-lakechain/canny-edge-detector 0.10.0
Creates a new image with the edges detected using the Canny edge detector algorithm.
3 versions - Latest release: over 1 year ago - 10 downloads last month - 186 stars on GitHub - 1 maintainer
@praxlannister/mdexport-core 2.0.0
Core processing engine for MDExport
1 version - Latest release: 3 months ago - 5 downloads last month - 1 maintainer
uns-mcp-server 2.0.2
Pure JavaScript MCP server for Unstructured.io - No Python required!
4 versions - Latest release: 5 months ago - 16 downloads last month - 1 maintainer
n8n-nodes-docx-genie-pro 0.1.2
n8n node package for DOCX document manipulation and processing
3 versions - Latest release: 8 months ago - 4 downloads last month - 0 stars on GitHub - 1 maintainer
@promptbook/documents 0.104.0
Promptbook: Turn your company's scattered knowledge into AI ready books
420 versions - Latest release: 23 days ago - 3.92 thousand downloads last month - 138 stars on GitHub - 1 maintainer
@promptbook/legacy-documents 0.104.0
Promptbook: Turn your company's scattered knowledge into AI ready books
417 versions - Latest release: 23 days ago - 4.04 thousand downloads last month - 138 stars on GitHub - 1 maintainer
pdf-image-extractor 2.0.8
pdf
4 versions - Latest release: almost 2 years ago - 304 downloads last month - 5 stars on GitHub - 1 maintainer
n8n-nodes-docx-genie 0.1.0
n8n node package for DOCX document manipulation and processing
1 version - Latest release: 8 months ago - 8 downloads last month - 0 stars on GitHub - 1 maintainer
agentdb-weave 2.0.0 unpublished
The Platinum Orchestrator - Transform static assets into dynamic, self-healing knowledge with aut...
1 version - Latest release: about 1 month ago - 1 maintainer
n8n-nodes-vector-store-processor 1.8.15
n8n node for intelligent document chunking and processing for vector store ingestion with Smart Q...
49 versions - Latest release: 3 months ago - 2.54 thousand downloads last month - 1 maintainer
n8n-nodes-docx-converter-enhanced 1.0.0
Enhanced n8n community node for DOCX to text conversion with RAG capabilities, page-aware chunkin...
1 version - Latest release: 5 months ago - 20 downloads last month - 1 maintainer
n8n-nodes-inner-batched-chain-summarization 0.1.2
n8n community node with intelligent batched chain summarization for processing large documents ef...
3 versions - Latest release: 4 months ago - 9 downloads last month - 1 maintainer
n8n-nodes-pdf-accessibility 3.0.0
AI-powered PDF accessibility automation for N8N - comprehensive WCAG compliance analysis, intelli...
24 versions - Latest release: 8 months ago - 418 downloads last month - 0 stars on GitHub - 1 maintainer
knowledge-mgmt-mcp 1.1.0
Production-ready MCP server for document ingestion and knowledge management with vector search. S...
6 versions - Latest release: 4 months ago - 28 downloads last month - 1 maintainer
@raptor-data/ts-sdk 1.0.2
Official TypeScript SDK for RaptorData API
3 versions - Latest release: 2 months ago - 12 downloads last month - 1 maintainer
passport-ocr-api 1.1.5 💰
Passport OCR API client for extracting passport data from images and PDF files using OCR technology.
1 version - Latest release: 4 months ago - 7 downloads last month - 3,143 stars on GitHub - 1 maintainer
n8n-nodes-power-document-extractor 0.13.7
Power Document Extractor – universal local document parser for n8n
11 versions - Latest release: about 2 months ago - 112 downloads last month - 1 maintainer
@aidalinfo/office-to-markdown 1.0.2
Modern TypeScript library for converting Office documents (DOCX) to Markdown format, optimized fo...
3 versions - Latest release: 5 months ago - 18 downloads last month - 6 stars on GitHub - 1 maintainer
@dooor-ai/cortexdb 0.9.8
Official TypeScript/JavaScript SDK for CortexDB - Multi-modal RAG Platform with advanced document...
64 versions - Latest release: 16 days ago - 973 downloads last month - 1 maintainer
@iflow-mcp/doc-ops-mcp 0.3.8
MCP Document Converter Server — A Model Context Protocol server for seamless document format conv...
1 version - Latest release: about 2 months ago - 2 maintainers
docstrange 1.0.7
Official Node.js client for Docstrange API - Extract data from PDFs, images, and documents in mul...
8 versions - Latest release: 4 months ago - 117 downloads last month - 1 maintainer
@aismarttalk/anondocs-sdk 1.2.1
TypeScript SDK for AnonDocs API - Privacy-first text and document anonymization
5 versions - Latest release: 3 months ago - 77 downloads last month - 2 maintainers
@aivue/doc-intelligence 1.0.3
AI-powered document parser and extractor for Vue 3 - Upload PDFs/images, extract structured data ...
4 versions - Latest release: about 2 months ago - 1 maintainer
@jojihatzz/lemmedoc 2.0.0
A comprehensive Model Context Protocol (MCP) server for document processing, PDF manipulation, fo...
2 versions - Latest release: 6 months ago - 4 downloads last month - 1 maintainer
parze 0.1.1
TypeScript SDK for the Parze API
2 versions - Latest release: 3 months ago - 10 downloads last month - 1 maintainer
@tfw.in/structura-sdk 0.1.0
TypeScript SDK for Saral Structura, providing Zod schemas and validation for document processing ...
1 version - Latest release: 9 months ago - 100 downloads last month - 1 maintainer
pageindex-mcp 1.6.3
MCP server for PageIndex
29 versions - Latest release: 3 months ago - 718 downloads last month - 1 maintainer
docuglean-ocr 1.0.0
An SDK for intelligent document processing using State of the Art AI models.
1 version - Latest release: 5 months ago - 9 downloads last month - 5 stars on GitHub - 1 maintainer
@sylphx/pdf-reader-mcp 2.1.0
An MCP server providing tools to read PDF files.
13 versions - Latest release: about 1 month ago - 4.43 thousand downloads last month - 331 stars on GitHub - 1 maintainer
static-research-engine 1.0.2
Transform documents into structured, queryable span artifacts with intelligent search and ranking
2 versions - Latest release: 3 months ago - 1 maintainer
@ninjadoc-ai/sdk 1.0.9
TypeScript SDK for document processing with zero-friction framework adapters. Features intelligen...
10 versions - Latest release: 4 months ago - 43 downloads last month - 1 maintainer
asciidoctor-html-to-markdown 1.0.0-beta.1
A modular document processing system for converting HTML to Markdown
1 version - Latest release: 6 months ago - 4 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/structured-entity-extractor 0.10.0
Extracts structured entities from processed documents.
3 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
@aivue/chatbot 2.5.4
AI-powered chat components for Vue.js with RAG (Retrieval-Augmented Generation) support
50 versions - Latest release: about 2 months ago - 389 downloads last month - 7 stars on GitHub - 1 maintainer
@wdelhagen/textprep 0.1.0
Document text extraction with pluggable extractors. Supports PDF, DOCX, DOC, RTF, TXT, and image ...
1 version - Latest release: 3 months ago - 1 maintainer
create-autollama 0.0.1
Placeholder scaffolder for AutoLlama. Creates a new folder (default: autollama) and points to the...
1 version - Latest release: 5 months ago - 9 downloads last month - 24 stars on GitHub - 1 maintainer
@lucianaib/word-cloud-mcp 3.0.0
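An MCP tool focused on generating word clouds from document content, supporting intelligent text extraction from PDF, Word, TXT, MD, and other formats, with an optimized spiral layout algorithm and multiple output formats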
12 versions - Latest release: about 2 months ago - 586 downloads last month - 0 stars on GitHub - 1 maintainer
pdftotext-mcp 1.0.0
A reliable Model Context Protocol server for PDF text extraction using pdftotext from poppler-utils
1 version - Latest release: 7 months ago - 10 downloads last month - 0 stars on GitHub - 1 maintainer
koncile-js 0.1.4
JavaScript SDK for the Koncile Intelligent Document Processing API
5 versions - Latest release: 7 months ago - 14 downloads last month - 1 maintainer
@markdownkit/mdd 2.0.0
Semantic document layer for AI-to-Office pipeline. Transform markdown into professional PDF/DOCX ...
3 versions - Latest release: 19 days ago - 150 downloads last month - 1 maintainer
unsiloed-sdk 0.1.0
JavaScript/TypeScript SDK for Unsiloed Vision API - Parse, Extract, Classify, and Split documents
1 version - Latest release: 21 days ago - 102 downloads last month - 1 maintainer
n8n-nodes-extract-pdf 1.0.26
n8n node to extract text, images and tables from PDF with multilingual support, language detectio...
24 versions - Latest release: 10 months ago - 194 downloads last month - 1 maintainer
@jmndao/mongoose-ai 1.4.0
AI-powered Mongoose plugin for intelligent document processing with auto-summarization, semantic ...
10 versions - Latest release: 7 months ago - 172 downloads last month - 3 stars on GitHub - 1 maintainer
@lumina-ai-inc/chunkr-ai 0.0.1
Node.js client for Chunkr API
1 version - Latest release: about 1 year ago - 98 downloads last month - 2,864 stars on GitHub - 1 maintainer
context1000 0.1.8
context1000 is a documentation format for software systems, designed for integration with art...
14 versions - Latest release: 5 months ago - 139 downloads last month - 1 maintainer
rag-lite-ts 2.2.0
Local-first TypeScript retrieval engine with Chameleon Multimodal Architecture for semantic searc...
12 versions - Latest release: 23 days ago - 333 downloads last month - 3 stars on GitHub - 1 maintainer
rag-system-pgvector 2.4.8
A complete Retrieval-Augmented Generation system using pgvector, LangChain, and LangGraph for Nod...
20 versions - Latest release: 23 days ago - 219 downloads last month - 1 maintainer
@docrouter/sdk 0.2.2
TypeScript SDK for DocRouter API
4 versions - Latest release: 29 days ago - 110 downloads last month - 1 maintainer
sprint-docx-mcp-server 1.0.0
GitHub Copilot agent for processing DOCX Sprint documents with hierarchical structure into Jira-f...
1 version - Latest release: 2 months ago - 1 maintainer
hashub-docapp-js 1.0.0
JavaScript/TypeScript SDK for Hashub Document Processing API
1 version - Latest release: 5 months ago - 3 downloads last month - 0 stars on GitHub - 1 maintainer
stopword-trainer 1.1.1
A module for creating stopword lists for any language, based on a set of documents.
29 versions - Latest release: over 3 years ago - 1 dependent package - 1 dependent repository - 243 downloads last month - 15 stars on GitHub - 1 maintainer
n8n-nodes-puter-ai 2.0.4
Advanced n8n node for Puter.js AI with RAG agentic capabilities, document processing, audio trans...
8 versions - Latest release: 6 months ago - 59 downloads last month - 1 maintainer
@sybil-studio-devs/sdk 0.1.0
Official SDK for Sybil AI - Document processing, YouTube analysis, and AI-powered knowledge manag...
1 version - Latest release: 24 days ago - 1 maintainer
@project-lakechain/blip2-image-processor 0.10.0
A middleware extracting image captioning information from images using the Blip2 model.
7 versions - Latest release: over 1 year ago - 5 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-mistral-ocr 1.0.0
n8n node for Mistral OCR API integration with structured annotations
18 versions - Latest release: 7 months ago - 166 downloads last month - 2 stars on GitHub - 1 maintainer
yq-pdf 0.0.2
High-performance PDF manipulation library with native processing capabilities. Supports encryptio...
2 versions - Latest release: 6 months ago - 100 downloads last month - 1 maintainer
@casemark/thurgood 2.0.1
Thurgood CLI - Legal engineer AI agent powered by Case.dev. Build legal applications with documen...
8 versions - Latest release: about 2 months ago - 667 downloads last month - 3 maintainers
@paulmeller/docflow 0.0.28
A developer-friendly transformation engine for programmatic document manipulation
26 versions - Latest release: 3 months ago - 2.72 thousand downloads last month - 1 maintainer
mcp-pdf 1.1.0
A Model Context Protocol server for PDF manipulation and operations
5 versions - Latest release: about 2 months ago - 1 maintainer
sensible-api 0.0.12
Javascript SDK for Sensible, the developer-first platform for extracting structured data from doc...
12 versions - Latest release: 5 months ago - 3.63 thousand downloads last month - 0 stars on GitHub - 1 maintainer