npmjs.org "document-processing" keyword
@project-lakechain/ffmpeg-processor 0.10.0
Processes media documents using FFMPEG.
5 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-doclayer 0.1.0
n8n community node for Doclayer - AI-powered document processing, extraction, and search
1 version - Latest release: 5 months ago - 1 maintainer
docqa-mcp 1.0.2
MCP server for DocQA - AI document verification, PDF extraction, OCR, and format conversion via a...
3 versions - Latest release: about 2 months ago - 1 maintainer
parze 0.2.6
TypeScript SDK for the Parze API
9 versions - Latest release: about 22 hours ago - 54 downloads last month - 1 maintainer
pdf-oxide-fips 0.3.50 💰
[FIPS 140-3 validated build] High-performance PDF parsing and text extraction library - prebuilt ...
4 versions - Latest release: 1 day ago - 37 downloads last month - 731 stars on GitHub - 1 maintainer
n8n-nodes-contextual-document-loader 0.4.0
⚠️ DEPRECATED: Use n8n-nodes-semantic-splitter-with-context instead. This package has known issue...
12 versions - Latest release: 12 months ago - 22 downloads last month - 8 stars on GitHub - 1 maintainer
@piwi.ai/business-schema-configurations 1.0.4
JSON schema configurations for intelligent document processing - document types, entity types, en...
5 versions - Latest release: 3 months ago - 1 maintainer
@mixpeek/n8n-nodes-mixpeek 1.0.11
n8n community node for Mixpeek - multimodal data processing and semantic search API
12 versions - Latest release: 2 months ago - 1 star on GitHub - 2 maintainers
n8n-nodes-unstract 0.5.7
n8n nodes for Unstract services including LLMWhisperer and Unstract API
19 versions - Latest release: 5 months ago - 665 downloads last month - 2 stars on GitHub - 2 maintainers
@project-lakechain/ollama-embedding-processor 0.10.0
Creates embeddings from documents using Ollama models.
3 versions - Latest release: over 1 year ago - 11 downloads last month - 186 stars on GitHub - 1 maintainer
stopword 3.1.5
A module for node.js and the browser that takes in text and returns text that is stripped of stop...
76 versions - Latest release: 11 months ago - 96 dependent packages - 582 dependent repositories - 845 thousand downloads last month - 222 stars on GitHub - 2 maintainers - Top 1.9% on npmjs.org
lichat 1.0.0
A configurable chatbot and email processor that collects user-defined data points from both chat ...
1 version - Latest release: 3 months ago - 8 downloads last month - 1 maintainer
@heripo/model 0.1.31 💰
Document models and type definitions for heripo engine
32 versions - Latest release: 2 days ago - 1.37 thousand downloads last month - 8 stars on GitHub - 2 maintainers
@heripo/pdf-parser 0.1.31 💰
PDF parsing library using Docling SDK with OCR support for macOS
32 versions - Latest release: 2 days ago - 1.47 thousand downloads last month - 8 stars on GitHub - 2 maintainers
@heripo/document-processor 0.1.31 💰
Document processor with LLM-based analysis for heripo engine
32 versions - Latest release: 2 days ago - 1.17 thousand downloads last month - 8 stars on GitHub - 2 maintainers
@docrouter/sdk 1.0.0
TypeScript SDK for DocRouter API
5 versions - Latest release: 4 months ago - 110 downloads last month - 1 maintainer
@sybil-studio-devs/sdk 0.2.0
Official SDK for Sybil AI - Embeddable wiki (Atlas), file storage (Nexus), document processing, Y...
3 versions - Latest release: 4 months ago - 1 maintainer
@project-lakechain/bedrock-image-generators 0.10.0
Image generation using Amazon Bedrock models.
7 versions - Latest release: over 1 year ago - 18 downloads last month - 186 stars on GitHub - 1 maintainer
@suparse/cli 1.0.0
Official CLI for the Suparse Document Processing API
1 version - Latest release: 4 days ago - 1 maintainer
@suparse/sdk 1.0.0
Official TypeScript SDK for the Suparse Document Processing API
1 version - Latest release: 4 days ago - 1 maintainer
n8n-nodes-docutray 0.5.2
n8n community nodes for Docutray OCR, document identification, and knowledge base search services
9 versions - Latest release: 7 months ago - 63 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/sdk 0.10.0
An SDK providing helpers to create Lakechain middlewares in TypeScript.
9 versions - Latest release: over 1 year ago - 137 downloads last month - 187 stars on GitHub - 1 maintainer
@transloadit/mcp-server 0.3.22
Transloadit MCP server
30 versions - Latest release: 3 days ago - 1.34 thousand downloads last month - 71 stars on GitHub - 3 maintainers
n8n-nodes-docdigitizer 0.2.0
n8n community node for the DocDigitizer document processing API
2 versions - Latest release: 3 months ago - 18 downloads last month - 1 maintainer
@ruvector/scipix 0.1.1
OCR client for scientific documents - extract LaTeX, MathML from equations, research papers, and ...
2 versions - Latest release: 4 months ago - 65 downloads last month - 1 maintainer
@aidalinfo/pdf-processor 1.0.18
Powerful PDF data extraction library powered by AI vision models. Transform PDFs into structured,...
18 versions - Latest release: 9 months ago - 195 downloads last month - 6 stars on GitHub - 1 maintainer
n8n-nodes-structura 0.2.2
n8n community node for Structura - Transform documents (PDF, images) into structured JSON, Excel,...
4 versions - Latest release: about 1 month ago - 67 downloads last month - 1 maintainer
@project-lakechain/elevenlabs-synthesizer 0.10.0
Synthesizes text into speech using the Elevenlabs API.
3 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@base64ai/n8n-nodes-base64ai 2.0.2
Official Base64.ai community node for n8n
5 versions - Latest release: about 1 month ago - 130 downloads last month - 2 maintainers
ppu-doc-correction 1.0.1
Lightweight, type-safe document image correction for Node.js, Bun, and browsers. Provides orienta...
2 versions - Latest release: about 1 month ago - 1 maintainer
rag-lite-ts 2.3.1
Local-first TypeScript retrieval engine with Chameleon Multimodal Architecture for semantic searc...
14 versions - Latest release: 4 months ago - 333 downloads last month - 3 stars on GitHub - 1 maintainer
@23blocks/block-rag 2.1.0
RAG block for 23blocks SDK - vector search, document processing, image search, product identifica...
4 versions - Latest release: 2 months ago - 336 downloads last month - 4 stars on GitHub - 1 maintainer
recursion-mcp-v2 1.0.2
MCP server for navigation-enabled recursive document analysis - works offline with no external APIs
1 version - Latest release: about 1 month ago - 1 maintainer
n8n-nodes-pdf-api-hub 4.0.22
The most powerful PDF toolkit for n8n - HTML to PDF, sign PDF, OCR, extract tables, merge/split, ...
31 versions - Latest release: 4 days ago - 996 downloads last month - 0 stars on GitHub - 1 maintainer
n8n-nodes-query-retriever-rerank 0.4.1
Advanced n8n community node for intelligent document retrieval with multi-step reasoning, reranki...
2 versions - Latest release: 11 months ago - 346 downloads last month - 8 stars on GitHub - 2 maintainers
@lucianaib/word-cloud-mcp 3.0.0
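An MCP tool focused on generating word clouds from document content; it supports intelligent text extraction from PDF, Word, TXT, MD, and other formats, and features an optimized spiral layout algorithm and multiple output formats
12 versions - Latest release: 6 months ago - 586 downloads last month - 0 stars on GitHub - 1 maintainer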
n8n-nodes-tika 0.1.2
n8n community node for Apache Tika - extract text, metadata, detect MIME types and languages from...
3 versions - Latest release: about 1 month ago - 1 maintainer
@mastra/rag 2.2.1
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilit...
799 versions - Latest release: 27 days ago - 267 thousand downloads last month - 23,614 stars on GitHub - 9 maintainers
invoicify-json-craft 1.0.0
AI-powered invoice to JSON converter using Mistral AI with dynamic field detection and master sch...
1 version - Latest release: 11 months ago - 4 downloads last month - 1 maintainer
knowledge-mgmt-mcp 1.1.0
Production-ready MCP server for document ingestion and knowledge management with vector search. S...
6 versions - Latest release: 8 months ago - 28 downloads last month - 1 maintainer
@pspdfkit/pdf-to-markdown 0.2.2
Standalone CLI wrapper for Nutrient's PDF-to-Markdown extractor
4 versions - Latest release: 25 days ago - 774 downloads last month - 5 maintainers
@promptbook/documents 0.110.0
Promptbook: Create persistent AI agents that turn your company's scattered knowledge into action
509 versions - Latest release: 3 months ago - 6.39 thousand downloads last month - 159 stars on GitHub - 1 maintainer
@promptbook/pdf 0.110.0
Promptbook: Create persistent AI agents that turn your company's scattered knowledge into action
504 versions - Latest release: 3 months ago - 5.73 thousand downloads last month - 159 stars on GitHub - 1 maintainer
@promptbook/legacy-documents 0.110.0
Promptbook: Create persistent AI agents that turn your company's scattered knowledge into action
506 versions - Latest release: 3 months ago - 5.75 thousand downloads last month - 159 stars on GitHub - 1 maintainer
@project-lakechain/pdf-text-converter 0.10.0
Converts PDF documents into different formats.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@caleblawson/rag 1.0.0
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilit...
1 version - Latest release: 11 months ago - 12 downloads last month - 1 maintainer
@aidalinfo/office-to-markdown 1.0.2
Modern TypeScript library for converting Office documents (DOCX) to Markdown format, optimized fo...
3 versions - Latest release: 9 months ago - 18 downloads last month - 6 stars on GitHub - 1 maintainer
@easyrag/sdk 0.1.1
Official JavaScript SDK for EasyRAG.com API
2 versions - Latest release: 5 months ago - 7 downloads last month - 0 stars on GitHub - 1 maintainer
n8n-nodes-doctr 0.1.7
Extract text from images using docTR OCR in n8n workflows
8 versions - Latest release: 6 months ago - 176 downloads last month - 5,497 stars on GitHub - 1 maintainer
n8n-nodes-graphorlm 0.1.25
n8n community nodes for Graphor - Intelligent document processing, RAG pipelines, and document ch...
21 versions - Latest release: 21 days ago - 512 downloads last month - 0 stars on GitHub - 1 maintainer
pdf-tax-reader-cl 1.0.0
PDF scraping library for Chilean tax documents. Extract emitter name, economic activities, and ad...
1 version - Latest release: 10 months ago - 2 downloads last month - 0 stars on GitHub - 1 maintainer
@aismarttalk/anondocs-sdk 1.2.1
TypeScript SDK for AnonDocs API - Privacy-first text and document anonymization
5 versions - Latest release: 6 months ago - 48 downloads last month - 2 maintainers
@project-lakechain/ecs-cluster 0.10.0
A managed ECS cluster construct with quick autoscaling and EFS storage.
7 versions - Latest release: over 1 year ago - 17 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/firehose-storage-connector 0.10.0
Forwards document metadata to a Firehose delivery stream.
7 versions - Latest release: over 1 year ago - 8 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-google-gemini-embeddings-extended 0.2.3
n8n community sub-node for Google Gemini Embeddings with extended features like output dimensions...
7 versions - Latest release: 7 months ago - 251 downloads last month - 8 stars on GitHub - 2 maintainers
@pageindex/mcp 1.6.3
MCP server for PageIndex
2 versions - Latest release: 4 months ago - 1 maintainer
@docdigitizer/mcp-server 0.2.0
MCP server for DocDigitizer document processing API
2 versions - Latest release: 3 months ago - 1 maintainer
pdf-oxide 0.3.47 💰
High-performance PDF parsing and text extraction library - prebuilt native bindings, no build too...
20 versions - Latest release: 6 days ago - 2.96 thousand downloads last month - 696 stars on GitHub - 1 maintainer
pdf-oxide-wasm 0.3.47 💰
Fast, zero-dependency PDF toolkit for Node.js, browsers, and edge runtimes - text extraction, mar...
35 versions - Latest release: 6 days ago - 6.08 thousand downloads last month - 731 stars on GitHub - 1 maintainer
pdf_oxide-linux-x64-gnu 0.3.27 unpublished 💰
Prebuilt native binding for pdf-oxide on linux-x64 (glibc)
1 version - Latest release: about 1 month ago - 154 downloads last month - 696 stars on GitHub - 1 maintainer
@portablecore/document-processing 0.1.1
Shared document processing types, category detection, status machine, provider definitions, and c...
2 versions - Latest release: 2 months ago - 53 downloads last month - 1 maintainer
@project-lakechain/bedrock-text-processors 0.10.0
Generative text processing using Amazon Bedrock models.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
@egintegrations/document-services 0.1.0
Document processing library with Google Cloud Vision OCR, text extraction from PDFs and images, a...
1 version - Latest release: 4 months ago - 10 downloads last month - 1 maintainer
@sea-dev/widget-ng 0.1.25
Native Angular widget for Sea.dev document processing
15 versions - Latest release: 2 months ago - 74 downloads last month - 2 maintainers
docx-edit 0.1.0
A JS library that parses DOCX into a virtual component tree and writes paragraph-level text chang...
1 version - Latest release: about 2 months ago - 1 maintainer
docuprox-mcp 1.0.0
MCP server for DocuProx document processing API
1 version - Latest release: about 1 month ago - 1 maintainer
uns-mcp-server 2.0.2
Pure JavaScript MCP server for Unstructured.io - No Python required!
4 versions - Latest release: 9 months ago - 13 downloads last month - 1 maintainer
@llamaindex/liteparse 1.5.3
Open-source PDF parsing with spatial text extraction and OCR processing
18 versions - Latest release: 19 days ago - 28.4 thousand downloads last month - 3,534 stars on GitHub - 8 maintainers
@sylphx/pdf-reader-mcp 2.4.0
An MCP server providing tools to read PDF files.
17 versions - Latest release: 15 days ago - 8.18 thousand downloads last month - 331 stars on GitHub - 2 maintainers
@project-lakechain/opensearch-vector-storage-connector 0.10.0
Stores document embeddings in an OpenSearch vector index.
7 versions - Latest release: over 1 year ago - 9 downloads last month - 186 stars on GitHub - 1 maintainer
rq-scan-mcp 1.2.1
MCP Server for RQ-SCAN - AI-powered document data extraction platform
4 versions - Latest release: 4 months ago - 1 maintainer
lekana-gemini 1.0.0
A shared TypeScript library for Lekana microservices that provides AI-powered document processing...
1 version - Latest release: 5 months ago - 1 maintainer
@agenson-horrowitz/document-parser-mcp 1.0.8
Multi-format document parser MCP server - extract text, tables, and metadata from PDFs, images, H...
9 versions - Latest release: about 2 months ago - 1 maintainer
n8n-nodes-pdf-split-merge 2.0.3
n8n community node to merge and split PDFs using the PDF API Hub
9 versions - Latest release: 4 months ago - 198 downloads last month - 1 maintainer
@project-lakechain/s3-storage-connector 0.10.0
Stores documents and their metadata in an S3 Bucket.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@cdklabs/cdk-appmod-catalog-blueprints 1.17.0
Serverless infrastructure components organized by business use cases
29 versions - Latest release: 28 days ago - 678 downloads last month - 11 stars on GitHub - 3 maintainers
n8n-nodes-docuprox 1.1.0
An n8n community node for AI-powered document processing via the DocuProx API. Extract structured...
11 versions - Latest release: about 1 month ago - 211 downloads last month - 1 star on GitHub - 1 maintainer
doc-to-readable 1.5.3
Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.
22 versions - Latest release: 10 months ago - 127 downloads last month - 6 stars on GitHub - 1 maintainer
@project-lakechain/text-transform-processor 0.10.0
A middleware providing a way to transform text documents at scale.
7 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
ppu-paddle-ocr 5.4.0 💰
Blazing-fast and lightweight PaddleOCR library for Web, Node.js and Bun. Perform accurate text de...
41 versions - Latest release: 10 days ago - 641 downloads last month - 56 stars on GitHub - 1 maintainer
@project-lakechain/image-layer-processor 0.10.0
Applies layer operations on images.
7 versions - Latest release: over 1 year ago - 10 downloads last month - 186 stars on GitHub - 1 maintainer
devabase-sdk 0.5.6
Official Node.js SDK for Devabase - Backend for RAG/LLM Applications
11 versions - Latest release: 2 months ago - 70 downloads last month - 0 stars on GitHub - 1 maintainer
md-anything 0.2.1
Local-first Markdown conversion for files, webpages, and media - CLI and MCP
3 versions - Latest release: about 2 months ago - 336 downloads last month - 0 stars on GitHub - 1 maintainer
recursion-mcp 1.0.2
Recursive Language Model MCP server for unbounded document processing with local RAG
3 versions - Latest release: about 1 month ago - 1 maintainer
secure-redact 1.0.4
Client-side PII detection and redaction React component. Upload documents, automatically detect s...
5 versions - Latest release: 3 months ago - 40 downloads last month - 0 stars on GitHub - 2 maintainers
docdigitizer-ai 0.2.0
Vercel AI SDK tools for the DocDigitizer document processing API
2 versions - Latest release: 3 months ago - 1 maintainer
@project-lakechain/rekognition-image-processor 0.10.0
Processes images using Amazon Rekognition.
7 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/condition 0.10.0
A middleware allowing to express complex conditions in pipelines.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 185 stars on GitHub - 1 maintainer
koncile-js 0.1.4
JavaScript SDK for the Koncile Intelligent Document Processing API
5 versions - Latest release: 10 months ago - 14 downloads last month - 1 maintainer
@docrouter/mcp 1.0.0
TypeScript MCP server for DocRouter API
8 versions - Latest release: 4 months ago - 232 downloads last month - 1 maintainer
mastra-browser-rag 0.0.9
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilit...
9 versions - Latest release: about 1 year ago - 95 downloads last month - 1 maintainer
@project-lakechain/layers 0.10.0
Lambda layer library used by Project Lakechain.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/jmespath-processor 0.10.0
Applies JMESPath expressions to JSON documents.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@paul_sizon/expo-pdf-text-extract 1.0.1
Native PDF text extraction for React Native and Expo. Extract text content from PDF files using p...
1 version - Latest release: 13 days ago - 1 maintainer
@jmndao/mongoose-ai 1.6.0
AI-powered Mongoose plugin for intelligent document processing with auto-summarization, semantic ...
12 versions - Latest release: 13 days ago - 358 downloads last month - 3 stars on GitHub - 1 maintainer
pdf2md-ai 1.0.2
AI-powered PDF to Markdown converter that preserves complete context: images, tables, and code bl...
3 versions - Latest release: 4 months ago - 297 downloads last month - 1 maintainer
n8n-nodes-deep-ocr 1.5.14
n8n community node for Deep-OCR document processing API
30 versions - Latest release: 11 days ago - 1.1 thousand downloads last month - 0 stars on GitHub - 1 maintainer
universal-documents-converter 1.0.1
Universal MCP Server for Multi-Rendering PDF Quality Assurance System with AI-powered optimization
1 version - Latest release: 8 months ago - 12 downloads last month - 1 maintainer
@project-lakechain/opensearch-saved-object 0.10.0
Uploads a saved object to OpenSearch using AWS CDK.
7 versions - Latest release: over 1 year ago - 20 downloads last month - 187 stars on GitHub - 1 maintainer
pdftotext-mcp 1.0.0
A reliable Model Context Protocol server for PDF text extraction using pdftotext from poppler-utils
1 version - Latest release: 10 months ago - 10 downloads last month - 0 stars on GitHub - 1 maintainer
Related Keywords
ai (94)
machine-learning (90)
retrieval-augmented-generation (87)
natural-language-processing (84)
pdf (82)
computer-vision (80)
generative-ai (80)
serverless (79)
aws (78)
hacktoberfest (76)
ocr (76)
aws-cdk (76)
typescript (56)
llm (53)
n8n-community-node-package (50)
rag (49)
mcp (43)
n8n (42)
embeddings (38)
text-extraction (32)
markdown (31)
model-context-protocol (31)
semantic-search (24)
docx (23)
claude (22)
sdk (21)
pdf-parser (20)
data-extraction (20)
openai (20)
nlp (19)
automation (19)
langchain (17)
cli (16)
vector-search (16)
nodejs (15)
workflow (15)
document (14)
chatbot (13)
text-splitting (13)
vector-database (13)
chunking (12)
knowledge-base (12)
javascript (11)
api (11)
multimodal (11)
pdf-processing (11)
monorepo (10)
web-scraping (10)
html (10)
cdk (10)
document-conversion (9)
pdf-to-markdown (9)
extraction (9)
react (9)
structured-data (9)
developer-tools (9)
text-processing (8)
image-processing (8)
pdf-extraction (8)
gemini (8)
ai-agent (8)
n8n-node (8)
anthropic (8)
document-analysis (7)
browser (7)
n8n-community-nodes (7)
lakechain (7)
ollama (7)
mcp-server (7)
pdf-to-text (7)
batch-processing (6)
word (6)
image (6)
chatgpt (6)
gpt (6)
graphor (6)
pdf-library (6)
ai-sdk (6)
text-splitter (6)
office-documents (6)
image-to-text (6)
document-ai (6)
agent (6)
ai-agents (6)
document-automation (6)
parser (6)
invoice (6)
image-extraction (5)
nextjs (5)
text-analysis (5)
latex (5)
vector-store (5)
ai-workflow (5)
embedding (5)
documents (5)
file-processing (5)
conversational-ai (5)
document-extraction (5)
document-parsing (5)
sub-node (5)