npmjs.org "document-processing" keyword
View the packages on the npmjs.org package registry that are tagged with the "document-processing" keyword.
@project-lakechain/sentence-transformers 0.10.0
Creates embeddings from text-oriented documents using Sentence Transformers models.
7 versions - Latest release: over 1 year ago - 12 downloads last month - 185 stars on GitHub - 1 maintainer
@project-lakechain/ollama-embedding-processor 0.10.0
Creates embeddings from documents using Ollama models.
3 versions - Latest release: over 1 year ago - 11 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/opensearch-domain 0.10.0
Creates an OpenSearch domain with Cognito authentication.
7 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
universal-documents-converter 1.0.1
Universal MCP Server for Multi-Rendering PDF Quality Assurance System with AI-powered optimization
1 version - Latest release: 5 months ago - 12 downloads last month - 1 maintainer
@nyazkhan/react-pdf-viewer 1.1.1
A comprehensive React TypeScript component library for viewing and interacting with PDF files usi...
5 versions - Latest release: 6 months ago - 40 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/s3-storage-connector 0.10.0
Stores documents and their metadata in an S3 Bucket.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@abhi-arya1/mastra-minirag 1.0.1
Minimal recursive text chunking functionality extracted from @mastra/rag for edge deployments
2 versions - Latest release: 3 months ago - 1 maintainer
@equus-ai/sdk 1.1.0
Official JavaScript/TypeScript SDK for EQUUS AI Infrastructure Platform - Document Processing, RA...
2 versions - Latest release: 9 days ago - 147 downloads last month
@base64ai/n8n-nodes-base64ai 1.0.0
Official Base64.ai community node for n8n
2 versions - Latest release: 8 days ago - 130 downloads last month
@okrapdf/cli 0.2.11
OkraPDF command-line interface for PDF extraction and document chat
22 versions - Latest release: 4 days ago
@project-lakechain/opensearch-saved-object 0.10.0
Uploads a saved object to OpenSearch using AWS CDK.
7 versions - Latest release: over 1 year ago - 9 downloads last month - 186 stars on GitHub - 1 maintainer
stopword 3.1.5
A module for node.js and the browser that takes in text and returns text that is stripped of stop...
Top 1.9% on npmjs.org - 76 versions - Latest release: 7 months ago - 96 dependent packages - 582 dependent repositories - 371 thousand downloads last month - 222 stars on GitHub - 2 maintainers
ppu-paddle-ocr 3.6.0
Blazing-fast and lightweight PaddleOCR library for Node.js and Bun. Perform accurate text detecti...
28 versions - Latest release: 23 days ago - 641 downloads last month - 19 stars on GitHub - 1 maintainer
@aidalinfo/pdf-processor 1.0.18
Powerful PDF data extraction library powered by AI vision models. Transform PDFs into structured,...
18 versions - Latest release: 5 months ago - 195 downloads last month - 6 stars on GitHub - 1 maintainer
@project-lakechain/sentence-text-splitter 0.10.0
Transforms text into chunks of tokens using a sentence text splitter.
7 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
expo-pdf-text-extract 1.0.0
Native PDF text extraction for React Native and Expo. Extract text content from PDF files using p...
1 version - Latest release: 10 days ago
@project-lakechain/pinecone-storage-connector 0.10.0
A data store connector for Pinecone.
7 versions - Latest release: over 1 year ago - 8 downloads last month - 186 stars on GitHub - 1 maintainer
@easyrag/sdk 0.1.1
Official JavaScript SDK for EasyRAG.com API
2 versions - Latest release: about 1 month ago - 17 downloads last month - 1 maintainer
@project-lakechain/sdk 0.10.0
An SDK providing helpers to create Lakechain middlewares in TypeScript.
9 versions - Latest release: over 1 year ago - 13 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/sqs-storage-connector 0.10.0
Stores documents and their metadata in an SQS queue.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/neo4j-storage-connector 0.10.0
A data store connector for Neo4j.
3 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/opensearch-storage-connector 0.10.0
Stores document metadata in an OpenSearch index.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
intelligent-text-chunking 1.0.3
An intelligent text chunking library that respects document structure and semantic boundaries
3 versions - Latest release: 4 months ago - 5 downloads last month - 1 maintainer
@project-lakechain/ffmpeg-processor 0.10.0
Processes media documents using FFMPEG.
5 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-solar 0.3.54
Solar LLM and Embeddings nodes for n8n
78 versions - Latest release: 3 months ago - 2.83 thousand downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/sharp-image-transform 0.10.0
A middleware transforming images using the sharp library.
7 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/layers 0.10.0
Lambda layer library used by Project Lakechain.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
@mastra/rag 2.0.0
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilit...
564 versions - Latest release: 4 days ago - 98.1 thousand downloads last month - 18,513 stars on GitHub - 11 maintainers
lekana-gemini 1.0.0
A shared TypeScript library for Lekana microservices that provides AI-powered document processing...
1 version - Latest release: about 1 month ago - 1 maintainer
@promptbook/pdf 0.104.0
Promptbook: Turn your company's scattered knowledge into AI ready books
422 versions - Latest release: 23 days ago - 5.03 thousand downloads last month - 138 stars on GitHub - 1 maintainer
n8n-nodes-unicraft 2.1.8
UniCraft N8N custom nodes - Unified AI Model Router with Multi-Modal Support by CloudCraft Labs f...
9 versions - Latest release: 4 months ago - 98 downloads last month - 1 maintainer
@trsdn/mistraldocai-mcp-server 1.0.4
MCP server for document-to-Markdown conversion using Mistral AI OCR
5 versions - Latest release: 5 months ago - 59 downloads last month - 2 stars on GitHub - 1 maintainer
@scaly/dazzle 0.1.0
DSSSL processor - CLI wrapper for @scaly/openjade
1 version - Latest release: about 1 month ago - 1 maintainer
@majkapp/majk-chat-document-tools 1.0.26
Document processing tools for majk chat - PDF, Excel, Word, PowerPoint parsing and analysis
5 versions - Latest release: about 2 months ago - 340 downloads last month - 1 maintainer
@mixpeek/n8n-nodes-mixpeek 1.0.5
n8n community node for Mixpeek - multimodal data processing and semantic search API
6 versions - Latest release: 12 days ago
invoicify-json-craft 1.0.0
AI-powered invoice to JSON converter using Mistral AI with dynamic field detection and master sch...
1 version - Latest release: 7 months ago - 4 downloads last month - 1 maintainer
@project-lakechain/condition 0.10.0
A middleware allowing to express complex conditions in pipelines.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 185 stars on GitHub - 1 maintainer
treechunk 1.1.0
Hierarchical markdown chunking for RAG systems with AI-powered context summarization
5 versions - Latest release: 6 months ago - 8 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/service-linked-role 0.10.0
Creates a service linked role for a given service in an idempotent way.
3 versions - Latest release: over 1 year ago - 6 downloads last month - 186 stars on GitHub - 1 maintainer
@scaly/openjade 0.2.1
TypeScript port of OpenJade DSSSL engine
3 versions - Latest release: about 1 month ago - 1 maintainer
@project-lakechain/canny-edge-detector 0.10.0
Creates a new image with the edges detected using the Canny edge detector algorithm.
3 versions - Latest release: over 1 year ago - 10 downloads last month - 186 stars on GitHub - 1 maintainer
@praxlannister/mdexport-core 2.0.0
Core processing engine for MDExport
1 version - Latest release: 3 months ago - 5 downloads last month - 1 maintainer
uns-mcp-server 2.0.2
Pure JavaScript MCP server for Unstructured.io - No Python required!
4 versions - Latest release: 5 months ago - 16 downloads last month - 1 maintainer
n8n-nodes-docx-genie-pro 0.1.2
n8n node package for DOCX document manipulation and processing
3 versions - Latest release: 8 months ago - 4 downloads last month - 0 stars on GitHub - 1 maintainer
@promptbook/documents 0.104.0
Promptbook: Turn your company's scattered knowledge into AI ready books
420 versions - Latest release: 23 days ago - 3.92 thousand downloads last month - 138 stars on GitHub - 1 maintainer
@promptbook/legacy-documents 0.104.0
Promptbook: Turn your company's scattered knowledge into AI ready books
417 versions - Latest release: 23 days ago - 4.04 thousand downloads last month - 138 stars on GitHub - 1 maintainer
pdf-image-extractor 2.0.8
pdf
4 versions - Latest release: almost 2 years ago - 304 downloads last month - 5 stars on GitHub - 1 maintainer
n8n-nodes-docx-genie 0.1.0
n8n node package for DOCX document manipulation and processing
1 version - Latest release: 8 months ago - 8 downloads last month - 0 stars on GitHub - 1 maintainer
agentdb-weave 2.0.0 unpublished
The Platinum Orchestrator - Transform static assets into dynamic, self-healing knowledge with aut...
1 version - Latest release: about 1 month ago - 1 maintainer
n8n-nodes-vector-store-processor 1.8.15
n8n node for intelligent document chunking and processing for vector store ingestion with Smart Q...
49 versions - Latest release: 3 months ago - 2.54 thousand downloads last month - 1 maintainer
n8n-nodes-docx-converter-enhanced 1.0.0
Enhanced n8n community node for DOCX to text conversion with RAG capabilities, page-aware chunkin...
1 version - Latest release: 5 months ago - 20 downloads last month - 1 maintainer
n8n-nodes-inner-batched-chain-summarization 0.1.2
n8n community node with intelligent batched chain summarization for processing large documents ef...
3 versions - Latest release: 4 months ago - 9 downloads last month - 1 maintainer
n8n-nodes-pdf-accessibility 3.0.0
AI-powered PDF accessibility automation for N8N - comprehensive WCAG compliance analysis, intelli...
24 versions - Latest release: 8 months ago - 418 downloads last month - 0 stars on GitHub - 1 maintainer
knowledge-mgmt-mcp 1.1.0
Production-ready MCP server for document ingestion and knowledge management with vector search. S...
6 versions - Latest release: 4 months ago - 28 downloads last month - 1 maintainer
@raptor-data/ts-sdk 1.0.2
Official TypeScript SDK for RaptorData API
3 versions - Latest release: 2 months ago - 12 downloads last month - 1 maintainer
passport-ocr-api 1.1.5 💰
Passport OCR API client for extracting passport data from images and PDF files using OCR technology.
1 version - Latest release: 4 months ago - 7 downloads last month - 3,143 stars on GitHub - 1 maintainer
n8n-nodes-power-document-extractor 0.13.7
Power Document Extractor – universal local document parser for n8n
11 versions - Latest release: about 2 months ago - 112 downloads last month - 1 maintainer
@aidalinfo/office-to-markdown 1.0.2
Modern TypeScript library for converting Office documents (DOCX) to Markdown format, optimized fo...
3 versions - Latest release: 5 months ago - 18 downloads last month - 6 stars on GitHub - 1 maintainer
@dooor-ai/cortexdb 0.9.8
Official TypeScript/JavaScript SDK for CortexDB - Multi-modal RAG Platform with advanced document...
64 versions - Latest release: 16 days ago - 973 downloads last month - 1 maintainer
@iflow-mcp/doc-ops-mcp 0.3.8
MCP Document Converter Server — A Model Context Protocol server for seamless document format conv...
1 version - Latest release: about 2 months ago - 2 maintainers
docstrange 1.0.7
Official Node.js client for Docstrange API - Extract data from PDFs, images, and documents in mul...
8 versions - Latest release: 4 months ago - 117 downloads last month - 1 maintainer
@aismarttalk/anondocs-sdk 1.2.1
TypeScript SDK for AnonDocs API - Privacy-first text and document anonymization
5 versions - Latest release: 3 months ago - 77 downloads last month - 2 maintainers
@aivue/doc-intelligence 1.0.3
AI-powered document parser and extractor for Vue 3 - Upload PDFs/images, extract structured data ...
4 versions - Latest release: about 2 months ago - 1 maintainer
@jojihatzz/lemmedoc 2.0.0
A comprehensive Model Context Protocol (MCP) server for document processing, PDF manipulation, fo...
2 versions - Latest release: 6 months ago - 4 downloads last month - 1 maintainer
parze 0.1.1
TypeScript SDK for the Parze API
2 versions - Latest release: 3 months ago - 10 downloads last month - 1 maintainer
@tfw.in/structura-sdk 0.1.0
TypeScript SDK for Saral Structura, providing Zod schemas and validation for document processing ...
1 version - Latest release: 9 months ago - 100 downloads last month - 1 maintainer
pageindex-mcp 1.6.3
MCP server for PageIndex
29 versions - Latest release: 3 months ago - 718 downloads last month - 1 maintainer
docuglean-ocr 1.0.0
An SDK for intelligent document processing using State of the Art AI models.
1 version - Latest release: 5 months ago - 9 downloads last month - 5 stars on GitHub - 1 maintainer
@sylphx/pdf-reader-mcp 2.1.0
An MCP server providing tools to read PDF files.
13 versions - Latest release: about 1 month ago - 4.43 thousand downloads last month - 331 stars on GitHub - 1 maintainer
static-research-engine 1.0.2
Transform documents into structured, queryable span artifacts with intelligent search and ranking
2 versions - Latest release: 3 months ago - 1 maintainer
@ninjadoc-ai/sdk 1.0.9
TypeScript SDK for document processing with zero-friction framework adapters. Features intelligen...
10 versions - Latest release: 4 months ago - 43 downloads last month - 1 maintainer
asciidoctor-html-to-markdown 1.0.0-beta.1
A modular document processing system for converting HTML to Markdown
1 version - Latest release: 6 months ago - 4 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/structured-entity-extractor 0.10.0
Extracts structured entities from processed documents.
3 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
@aivue/chatbot 2.5.4
AI-powered chat components for Vue.js with RAG (Retrieval-Augmented Generation) support
50 versions - Latest release: about 2 months ago - 389 downloads last month - 7 stars on GitHub - 1 maintainer
@wdelhagen/textprep 0.1.0
Document text extraction with pluggable extractors. Supports PDF, DOCX, DOC, RTF, TXT, and image ...
1 version - Latest release: 3 months ago - 1 maintainer
create-autollama 0.0.1
Placeholder scaffolder for AutoLlama. Creates a new folder (default: autollama) and points to the...
1 version - Latest release: 5 months ago - 9 downloads last month - 24 stars on GitHub - 1 maintainer
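@lucianaib/word-cloud-mcp 3.0.0
An MCP tool focused on generating word clouds from document content. It supports intelligent text extraction from PDF, Word, TXT, MD, and other formats, and offers an optimized spiral layout algorithm and multiple output formats.
12 versions - Latest release: about 2 months ago - 586 downloads last month - 0 stars on GitHub - 1 maintainer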
pdftotext-mcp 1.0.0
A reliable Model Context Protocol server for PDF text extraction using pdftotext from poppler-utils
1 version - Latest release: 7 months ago - 10 downloads last month - 0 stars on GitHub - 1 maintainer
koncile-js 0.1.4
JavaScript SDK for the Koncile Intelligent Document Processing API
5 versions - Latest release: 7 months ago - 14 downloads last month - 1 maintainer
@markdownkit/mdd 2.0.0
Semantic document layer for AI-to-Office pipeline. Transform markdown into professional PDF/DOCX ...
3 versions - Latest release: 19 days ago - 150 downloads last month - 1 maintainer
unsiloed-sdk 0.1.0
JavaScript/TypeScript SDK for Unsiloed Vision API - Parse, Extract, Classify, and Split documents
1 version - Latest release: 21 days ago - 102 downloads last month - 1 maintainer
n8n-nodes-extract-pdf 1.0.26
n8n node to extract text, images and tables from PDF with multilingual support, language detectio...
24 versions - Latest release: 10 months ago - 194 downloads last month - 1 maintainer
@jmndao/mongoose-ai 1.4.0
AI-powered Mongoose plugin for intelligent document processing with auto-summarization, semantic ...
10 versions - Latest release: 7 months ago - 172 downloads last month - 3 stars on GitHub - 1 maintainer
@lumina-ai-inc/chunkr-ai 0.0.1
Node.js client for Chunkr API
1 version - Latest release: about 1 year ago - 98 downloads last month - 2,864 stars on GitHub - 1 maintainer
context1000 0.1.8
context1000 is a documentation format for software systems, designed for integration with art...
14 versions - Latest release: 5 months ago - 139 downloads last month - 1 maintainer
rag-lite-ts 2.2.0
Local-first TypeScript retrieval engine with Chameleon Multimodal Architecture for semantic searc...
12 versions - Latest release: 23 days ago - 333 downloads last month - 3 stars on GitHub - 1 maintainer
rag-system-pgvector 2.4.8
A complete Retrieval-Augmented Generation system using pgvector, LangChain, and LangGraph for Nod...
20 versions - Latest release: 23 days ago - 219 downloads last month - 1 maintainer
@docrouter/sdk 0.2.2
TypeScript SDK for DocRouter API
4 versions - Latest release: 29 days ago - 110 downloads last month - 1 maintainer
sprint-docx-mcp-server 1.0.0
GitHub Copilot agent for processing DOCX Sprint documents with hierarchical structure into Jira-f...
1 version - Latest release: 2 months ago - 1 maintainer
hashub-docapp-js 1.0.0
JavaScript/TypeScript SDK for Hashub Document Processing API
1 version - Latest release: 5 months ago - 3 downloads last month - 0 stars on GitHub - 1 maintainer
stopword-trainer 1.1.1
A module for creating stopword lists for any language, based on a set of documents.
29 versions - Latest release: over 3 years ago - 1 dependent package - 1 dependent repository - 243 downloads last month - 15 stars on GitHub - 1 maintainer
n8n-nodes-puter-ai 2.0.4
Advanced n8n node for Puter.js AI with RAG agentic capabilities, document processing, audio trans...
8 versions - Latest release: 6 months ago - 59 downloads last month - 1 maintainer
@sybil-studio-devs/sdk 0.1.0
Official SDK for Sybil AI - Document processing, YouTube analysis, and AI-powered knowledge manag...
1 version - Latest release: 24 days ago - 1 maintainer
@project-lakechain/blip2-image-processor 0.10.0
A middleware extracting image captioning information from images using the Blip2 model.
7 versions - Latest release: over 1 year ago - 5 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-mistral-ocr 1.0.0
n8n node for Mistral OCR API integration with structured annotations
18 versions - Latest release: 7 months ago - 166 downloads last month - 2 stars on GitHub - 1 maintainer
yq-pdf 0.0.2
High-performance PDF manipulation library with native processing capabilities. Supports encryptio...
2 versions - Latest release: 6 months ago - 100 downloads last month - 1 maintainer
@casemark/thurgood 2.0.1
Thurgood CLI - Legal engineer AI agent powered by Case.dev. Build legal applications with documen...
8 versions - Latest release: about 2 months ago - 667 downloads last month - 3 maintainers
@paulmeller/docflow 0.0.28
A developer-friendly transformation engine for programmatic document manipulation
26 versions - Latest release: 3 months ago - 2.72 thousand downloads last month - 1 maintainer
mcp-pdf 1.1.0
A Model Context Protocol server for PDF manipulation and operations
5 versions - Latest release: about 2 months ago - 1 maintainer
sensible-api 0.0.12
Javascript SDK for Sensible, the developer-first platform for extracting structured data from doc...
12 versions - Latest release: 5 months ago - 3.63 thousand downloads last month - 0 stars on GitHub - 1 maintainer
Related Keywords
machine-learning (88)
retrieval-augmented-generation (84)
natural-language-processing (82)
generative-ai (79)
computer-vision (78)
aws (77)
serverless (77)
hacktoberfest (76)
aws-cdk (75)
ai (73)
pdf (55)
typescript (45)
ocr (42)
n8n-community-node-package (39)
n8n (33)
rag (33)
llm (32)
embeddings (32)
mcp (24)
text-extraction (22)
markdown (20)
docx (19)
model-context-protocol (18)
semantic-search (18)
automation (16)
langchain (15)
nlp (15)
workflow (13)
text-splitting (13)
vector-search (13)
openai (13)
nodejs (12)
sdk (12)
cli (11)
knowledge-base (11)
pdf-processing (11)
claude (11)
html (10)
document (10)
chatbot (10)
chunking (10)
vector-database (10)
web-scraping (9)
data-extraction (9)
cdk (9)
api (9)
pdf-parser (8)
javascript (8)
react (8)
multimodal (8)
monorepo (7)
document-conversion (7)
n8n-community-nodes (7)
lakechain (7)
text-processing (7)
office-documents (6)
n8n-node (6)
developer-tools (6)
extraction (6)
pdf-extraction (5)
sub-node (5)
image-processing (5)
ai-sdk (5)
document-analysis (5)
text-splitter (5)
text-analysis (5)
documents (5)
batch-processing (5)
vector-store (5)
gemini (5)
json (5)
ai-workflow (5)
browser (5)
graphor (5)
n8n-community-node (5)
extract (4)
upstage (4)
text-generation (4)
search (4)
file-processing (4)
pdf-tools (4)
cross-platform (4)
embedding (4)
chatgpt (4)
function-calling (4)
latex (4)
conversion (4)
opensearch (4)
tesseract (4)
excel (4)
csv (4)
playwright (4)
image-to-text (4)
nextjs (4)
language-model (4)
document-automation (4)
cross-provider (3)
large-language-models (3)
llmops (3)
markdown-dsl (3)