npmjs.org "document-processing" keyword
@project-lakechain/ecs-cluster 0.10.0
A managed ECS cluster construct with quick autoscaling and EFS storage.
7 versions - Latest release: over 1 year ago - 17 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/bedrock-image-generators 0.10.0
Image generation using Amazon Bedrock models.
7 versions - Latest release: over 1 year ago - 10 downloads last month - 186 stars on GitHub - 1 maintainer
rag-lite-ts 2.3.1
Local-first TypeScript retrieval engine with Chameleon Multimodal Architecture for semantic searc...
14 versions - Latest release: about 1 month ago - 333 downloads last month - 3 stars on GitHub - 1 maintainer
n8n-nodes-ninjadoc 1.0.1
Actions to access Ninjadoc AI API
2 versions - Latest release: 5 months ago - 1 maintainer
ppu-paddle-ocr 4.1.0 💰
Blazing-fast and lightweight PaddleOCR library for Node.js and Bun. Perform accurate text detecti...
33 versions - Latest release: about 5 hours ago - 641 downloads last month - 34 stars on GitHub - 1 maintainer
@songlingqing/mcp-batch-processing 0.0.1
MCP server for step-by-step batch review of documents such as feasibility study reports
1 version - Latest release: 9 months ago - 3 downloads last month - 1 maintainer
n8n-nodes-contextual-document-loader 0.4.0
⚠️ DEPRECATED: Use n8n-nodes-semantic-splitter-with-context instead. This package has known issue...
12 versions - Latest release: 9 months ago - 22 downloads last month - 8 stars on GitHub - 1 maintainer
n8n-nodes-adverant-nexus 0.2.0
n8n community nodes for Adverant Nexus AI platform - GraphRAG knowledge management, MageAgent orc...
1 version - Latest release: 4 days ago - 93 downloads last month - 1 maintainer
rq-scan-mcp 1.2.1
MCP Server for RQ-SCAN - AI-powered document data extraction platform
4 versions - Latest release: about 2 months ago - 1 maintainer
@paulmeller/docflow 0.0.28
A developer-friendly transformation engine for programmatic document manipulation
26 versions - Latest release: 5 months ago - 12 downloads last month - 1 maintainer
lcp-nodes 1.0.43
Node RED Custom Nodes for LCP
1 version - Latest release: 9 months ago - 14 downloads last month - 1 maintainer
doc-ops-mcp 0.3.8
MCP Document Converter Server — A Model Context Protocol server for seamless document format conv...
32 versions - Latest release: 7 months ago - 34 downloads last month - 31 stars on GitHub - 1 maintainer
@project-lakechain/hashing-image-processor 0.10.0
Computes the hashes of images using different algorithms.
3 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@sylphx/pdf-reader-mcp 2.3.0
An MCP server providing tools to read PDF files.
15 versions - Latest release: about 1 month ago - 8.18 thousand downloads last month - 331 stars on GitHub - 2 maintainers
pdf-oxide-wasm 0.3.15 💰
Fast, zero-dependency PDF toolkit for Node.js, browsers, and edge runtimes — text extraction, mar...
6 versions - Latest release: 1 day ago - 204 downloads last month - 154 stars on GitHub - 1 maintainer
devabase-sdk 0.5.6
Official Node.js SDK for Devabase - Backend for RAG/LLM Applications
11 versions - Latest release: 1 day ago - 925 downloads last month - 0 stars on GitHub - 1 maintainer
@morphik/ui 0.1.6 deprecated
Modern UI component for Morphik - A powerful document processing and querying system
6 versions - Latest release: 11 months ago - 288 downloads last month - 3,472 stars on GitHub - 1 maintainer
@mastra/rag 2.1.2
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilit...
673 versions - Latest release: 10 days ago - 239 thousand downloads last month - 21,320 stars on GitHub - 11 maintainers
@sin-gang-ya-rou-geng-hao-chih/pdf-mcp-server 1.0.5
Model Context Protocol (MCP) server for PDF processing with OCR support
6 versions - Latest release: 2 months ago - 271 downloads last month - 1 maintainer
@iflow-mcp/pdftotext-mcp 1.0.0
A reliable Model Context Protocol server for PDF text extraction using pdftotext from poppler-utils
1 version - Latest release: 4 months ago - 5 downloads last month - 0 stars on GitHub - 2 maintainers
stopword 3.1.5
Top 1.9% on npmjs.org
A module for node.js and the browser that takes in text and returns text that is stripped of stop...
76 versions - Latest release: 9 months ago - 96 dependent packages - 582 dependent repositories - 645 thousand downloads last month - 222 stars on GitHub - 2 maintainers
@project-lakechain/elevenlabs-synthesizer 0.10.0
Synthesizes text into speech using the Elevenlabs API.
3 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
pdf-tax-reader-cl 1.0.0
PDF scraping library for Chilean tax documents. Extract emitter name, economic activities, and ad...
1 version - Latest release: 7 months ago - 2 downloads last month - 0 stars on GitHub - 1 maintainer
@heripo/document-processor 0.1.12
Document processor with LLM-based analysis for heripo engine
13 versions - Latest release: 2 days ago - 1 thousand downloads last month - 4 stars on GitHub - 1 maintainer
@heripo/pdf-parser 0.1.12
PDF parsing library using Docling SDK with OCR support for macOS
13 versions - Latest release: 2 days ago - 1.01 thousand downloads last month - 4 stars on GitHub - 1 maintainer
@heripo/model 0.1.12
Document models and type definitions for heripo engine
13 versions - Latest release: 2 days ago - 1.04 thousand downloads last month - 4 stars on GitHub - 1 maintainer
@project-lakechain/bedrock-text-processors 0.10.0
Generative text processing using Amazon Bedrock models.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
@egintegrations/document-services 0.1.0
Document processing library with Google Cloud Vision OCR, text extraction from PDFs and images, a...
1 version - Latest release: about 1 month ago - 1 maintainer
@project-lakechain/opensearch-vector-storage-connector 0.10.0
Stores document embeddings in an OpenSearch vector index.
7 versions - Latest release: over 1 year ago - 9 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-query-retriever-rerank 0.4.1
Advanced n8n community node for intelligent document retrieval with multi-step reasoning, reranki...
2 versions - Latest release: 9 months ago - 700 downloads last month - 8 stars on GitHub - 2 maintainers
n8n-nodes-google-gemini-embeddings-extended 0.2.3
n8n community sub-node for Google Gemini Embeddings with extended features like output dimensions...
7 versions - Latest release: 5 months ago - 251 downloads last month - 8 stars on GitHub - 2 maintainers
@pageindex/mcp 1.6.3
MCP server for PageIndex
2 versions - Latest release: about 2 months ago - 1 maintainer
llm-gen 1.0.3
A CLI tool to extract text from a static Next.js export and generate llm.txt for LLM ingestion.
2 versions - Latest release: 7 months ago - 19 downloads last month - 3 stars on GitHub - 1 maintainer
n8n-nodes-graphorlm 0.1.11
n8n community nodes for Graphor - Intelligent document processing, RAG pipelines, and document ch...
9 versions - Latest release: about 1 month ago - 1 maintainer
@23blocks/block-rag 2.0.1
RAG block for 23blocks SDK - vector search, document processing, image search, product identifica...
2 versions - Latest release: 4 days ago - 3 stars on GitHub - 1 maintainer
n8n-nodes-docdigitizer 0.2.0
n8n community node for the DocDigitizer document processing API
2 versions - Latest release: 4 days ago - 1 maintainer
docdigitizer 0.2.0
Official Node.js/TypeScript SDK for the DocDigitizer document processing API
2 versions - Latest release: 4 days ago - 1 maintainer
docdigitizer-ai 0.2.0
Vercel AI SDK tools for the DocDigitizer document processing API
2 versions - Latest release: 4 days ago - 1 maintainer
@nomad-e/reader-pdf 0.0.2
MCP server that exposes tools for PDF processing via a configurable backend. Connects to your API...
2 versions - Latest release: 4 days ago - 1 maintainer
@docdigitizer/mcp-server 0.2.0
MCP server for DocDigitizer document processing API
2 versions - Latest release: 4 days ago - 1 maintainer
@project-lakechain/transcribe-audio-processor 0.10.0
Transcribes audio files asynchronously using Amazon Transcribe.
7 versions - Latest release: over 1 year ago - 5 downloads last month - 186 stars on GitHub - 1 maintainer
@ainoflow/n8n-nodes-ainoflow 1.0.7
n8n community nodes for Ainoflow API - Document conversion (OCR/transcription), file storage, and...
8 versions - Latest release: about 1 month ago - 768 downloads last month - 1 maintainer
@project-lakechain/s3-event-trigger 0.10.0
Triggers pipelines upon events being emitted from S3 buckets.
7 versions - Latest release: over 1 year ago - 17 downloads last month - 186 stars on GitHub - 1 maintainer
proofpudding 0.2.0
TypeScript SDK for the ProofPudding document processing API
2 versions - Latest release: 4 days ago - 1 maintainer
@project-lakechain/text-transform-processor 0.10.0
A middleware providing a way to transform text documents at scale.
7 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
@ruvector/scipix 0.1.1
OCR client for scientific documents - extract LaTeX, MathML from equations, research papers, and ...
2 versions - Latest release: about 1 month ago - 1 maintainer
@project-lakechain/opensearch-index 0.10.0
Creates an OpenSearch index using AWS CDK.
7 versions - Latest release: over 1 year ago - 5 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/image-layer-processor 0.10.0
Applies layer operations on images.
7 versions - Latest release: over 1 year ago - 10 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/subtitle-processor 0.10.0
Parses subtitle documents into text and structured data.
5 versions - Latest release: over 1 year ago - 28 downloads last month - 186 stars on GitHub - 1 maintainer
pdf2md-ai 1.0.2
AI-powered PDF to Markdown converter that preserves complete context: images, tables, and code bl...
3 versions - Latest release: about 2 months ago - 297 downloads last month - 1 maintainer
mastra-browser-rag 0.0.9
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilit...
9 versions - Latest release: 11 months ago - 95 downloads last month - 1 maintainer
@project-lakechain/pdf-text-converter 0.10.0
Converts PDF documents into different formats.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
linchat 1.3.1
An intelligent AI-powered command-line chat assistant with document processing, code review, and ...
9 versions - Latest release: 4 months ago - 21 downloads last month - 1 maintainer
@transloadit/mcp-server 0.3.7
Transloadit MCP server
15 versions - Latest release: 5 days ago - 1.5 thousand downloads last month - 68 stars on GitHub - 4 maintainers
@project-lakechain/firehose-storage-connector 0.10.0
Forwards document metadata to a Firehose delivery stream.
7 versions - Latest release: over 1 year ago - 8 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-adp 0.2.2
n8n community nodes for ADP (Agentic Document Processor) - AI-powered document extraction
5 versions - Latest release: 6 days ago - 0 stars on GitHub - 1 maintainer
@project-lakechain/recursive-character-text-splitter 0.10.0
Transforms text into chunks of tokens using Langchain's recursive character text splitter.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/lancedb-storage-connector 0.10.0
A data store connector for LanceDB.
3 versions - Latest release: over 1 year ago - 8 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/jmespath-processor 0.10.0
Applies JMESPath expressions to JSON documents.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@nlptools/splitter 0.0.2
Text splitting utilities - LangChain.js text splitters wrapper for NLPTools
2 versions - Latest release: 4 months ago - 13 downloads last month - 0 stars on GitHub - 1 maintainer
@buildel/ocr 0.1.1
Document processing application with CLI and API interfaces
11 versions - Latest release: 8 months ago - 41 downloads last month - 3 maintainers
n8n-nodes-extract-monster 1.0.3
AI-powered data extraction from PDFs, images, documents, audio, and video. Extract invoices, rece...
3 versions - Latest release: 4 months ago - 163 downloads last month - 1 maintainer
secure-redact 1.0.2
Client-side PII detection and redaction React component. Upload documents, automatically detect s...
3 versions - Latest release: 8 days ago - 0 stars on GitHub - 1 maintainer
doc-to-readable 1.5.3
Universal document-to-markdown and section splitter for HTML, URLs, and PDFs.
22 versions - Latest release: 8 months ago - 127 downloads last month - 6 stars on GitHub - 1 maintainer
@project-lakechain/translate-text-processor 0.10.0
Translates text documents asynchronously using Amazon Translate.
7 versions - Latest release: over 1 year ago - 15 downloads last month - 186 stars on GitHub - 1 maintainer
@caleblawson/rag 1.0.0
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilit...
1 version - Latest release: 9 months ago - 12 downloads last month - 1 maintainer
ignidor-idp-mcp 1.0.4
MCP server for Ignidor IDP B2B API integration - enables Claude to process documents through Igni...
5 versions - Latest release: 4 months ago - 19 downloads last month - 1 maintainer
markdoc-traverse 1.1.1
A simple and tiny traversal library for MarkDoc AST
4 versions - Latest release: over 1 year ago - 19 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/opensearch-saved-object 0.10.0
Uploads a saved object to OpenSearch using AWS CDK.
7 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@nyazkhan/react-pdf-viewer 1.1.1
A comprehensive React TypeScript component library for viewing and interacting with PDF files usi...
5 versions - Latest release: 7 months ago - 86 downloads last month - 0 stars on GitHub - 1 maintainer
n8n-nodes-pdf-api-hub 4.0.11
n8n community node to parse, extract, merge, convert, and lock PDFs using the PDF API Hub
20 versions - Latest release: 9 days ago - 996 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/rembg-image-processor 0.10.0
Automatically remove background from images using Rembg.
5 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@easyrag/sdk 0.1.1
Official JavaScript SDK for EasyRAG.com API
2 versions - Latest release: 3 months ago - 7 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/bedrock-embedding-processors 0.10.0
Creates embeddings from documents using Amazon Bedrock models.
7 versions - Latest release: over 1 year ago - 10 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/polly-synthesizer 0.10.0
Synthesizes text into speech using Amazon Polly.
7 versions - Latest release: over 1 year ago - 12 downloads last month - 186 stars on GitHub - 1 maintainer
@nlptools/nlptools 0.0.2
Main NLPTools package - Complete suite of NLP algorithms, text distance, similarity, splitting, a...
16 versions - Latest release: 4 months ago - 7 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/bert-extractive-summarizer 0.10.0
Provides text summarization using the Bert Extractive Summarizer model.
7 versions - Latest release: over 1 year ago - 19 downloads last month - 185 stars on GitHub - 1 maintainer
nanonets 2.0.1
Node.js SDK for the Nanonets API: OCR, document extraction, and workflow automation.
14 versions - Latest release: 9 months ago - 2 dependent packages - 41 downloads last month - 0 stars on GitHub - 3 maintainers
lekana-gemini 1.0.0
A shared TypeScript library for Lekana microservices that provides AI-powered document processing...
1 version - Latest release: 3 months ago - 1 maintainer
@project-lakechain/opensearch-domain 0.10.0
Creates an OpenSearch domain with Cognito authentication.
7 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/regexp-text-splitter 0.10.0
Transforms text into chunks of tokens based on regular expressions.
5 versions - Latest release: over 1 year ago - 5 downloads last month - 186 stars on GitHub - 1 maintainer
@pageindex/sdk 0.5.0
PageIndex SDK - Document processing for AI applications via REST API and MCP
3 versions - Latest release: 10 days ago - 1 maintainer
@instafill.ai/instafill 0.3.2
Instafill AI Node.js library for automating PDF form filling using AI-powered technology.
3 versions - Latest release: 12 months ago - 46 downloads last month - 1 maintainer
@project-lakechain/panns-embedding-processor 0.10.0
A processor generating embeddings for audio documents using Pretrained Audio Neural Networks.
7 versions - Latest release: over 1 year ago - 8 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/zip-processor 0.10.0
Inflates and deflates Zip documents from a source to a destination bucket.
5 versions - Latest release: over 1 year ago - 7 downloads last month - 186 stars on GitHub - 1 maintainer
@wdelhagen/textprep 0.3.1
Document text extraction with pluggable extractors. Supports PDF, DOCX, DOC, RTF, TXT, and image ...
6 versions - Latest release: 10 days ago - 301 downloads last month - 1 maintainer
treechunk 1.1.0
Hierarchical markdown chunking for RAG systems with AI-powered context summarization
5 versions - Latest release: 8 months ago - 8 downloads last month - 0 stars on GitHub - 1 maintainer
pdf-utils-rust 0.1.1
PDF and image processing utilities compiled to WebAssembly - Fast, secure, client-side file proce...
1 version - Latest release: 5 months ago - 22 downloads last month - 0 stars on GitHub - 1 maintainer
@project-lakechain/core 0.10.0
Core package for building middlewares with Project Lakechain.
7 versions - Latest release: over 1 year ago - 5 downloads last month - 184 stars on GitHub - 1 maintainer
@project-lakechain/character-text-splitter 0.10.0
Transforms text into chunks of tokens using Langchain's character text splitter.
7 versions - Latest release: over 1 year ago - 6 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/rekognition-image-processor 0.10.0
Processes images using Amazon Rekognition.
7 versions - Latest release: over 1 year ago - 2 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/opensearch-storage-connector 0.10.0
Stores document metadata in an OpenSearch index.
7 versions - Latest release: over 1 year ago - 4 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/ollama-embedding-processor 0.10.0
Creates embeddings from documents using Ollama models.
3 versions - Latest release: over 1 year ago - 11 downloads last month - 186 stars on GitHub - 1 maintainer
n8n-nodes-docx-genie-pro 0.1.2
n8n node package for DOCX document manipulation and processing
3 versions - Latest release: 9 months ago - 4 downloads last month - 0 stars on GitHub - 1 maintainer
universal-documents-converter 1.0.1
Universal MCP Server for Multi-Rendering PDF Quality Assurance System with AI-powered optimization
1 version - Latest release: 6 months ago - 12 downloads last month - 1 maintainer
@project-lakechain/sentence-transformers 0.10.0
Creates embeddings from text-oriented documents using Sentence Transformers models.
7 versions - Latest release: over 1 year ago - 12 downloads last month - 185 stars on GitHub - 1 maintainer
@project-lakechain/s3-storage-connector 0.10.0
Stores documents and their metadata in an S3 Bucket.
7 versions - Latest release: over 1 year ago - 3 downloads last month - 186 stars on GitHub - 1 maintainer
@project-lakechain/scheduler-event-trigger 0.10.0
Triggers pipelines upon scheduling events.
7 versions - Latest release: over 1 year ago - 5 downloads last month - 186 stars on GitHub - 1 maintainer
@equus-ai/sdk 1.1.0
Official JavaScript/TypeScript SDK for EQUUS AI Infrastructure Platform - Document Processing, RA...
2 versions - Latest release: about 2 months ago - 147 downloads last month - 1 maintainer
@base64ai/n8n-nodes-base64ai 2.0.0
Official Base64.ai community node for n8n
3 versions - Latest release: about 1 month ago - 130 downloads last month - 2 maintainers
Related Keywords
machine-learning (88)
retrieval-augmented-generation (86)
ai (85)
natural-language-processing (82)
generative-ai (79)
serverless (78)
computer-vision (78)
aws (77)
hacktoberfest (76)
aws-cdk (75)
pdf (66)
ocr (58)
typescript (53)
n8n-community-node-package (45)
llm (43)
rag (39)
n8n (37)
embeddings (36)
mcp (32)
model-context-protocol (26)
markdown (24)
text-extraction (24)
semantic-search (21)
docx (19)
sdk (18)
automation (18)
langchain (17)
claude (17)
nlp (16)
workflow (15)
openai (15)
vector-search (14)
text-splitting (13)
nodejs (13)
document (13)
pdf-parser (12)
data-extraction (12)
vector-database (12)
chunking (11)
pdf-processing (11)
knowledge-base (11)
cli (11)
api (11)
chatbot (11)
monorepo (10)
html (10)
react (9)
web-scraping (9)
cdk (9)
multimodal (8)
javascript (8)
text-processing (8)
extraction (8)
n8n-community-nodes (7)
document-conversion (7)
browser (7)
pdf-extraction (7)
n8n-node (7)
lakechain (7)
gemini (7)
image-processing (6)
text-splitter (6)
developer-tools (6)
office-documents (6)
ai-sdk (6)
graphor (6)
ai-workflow (5)
json (5)
agent (5)
tesseract (5)
document-analysis (5)
sub-node (5)
n8n-community-node (5)
documents (5)
vector-store (5)
batch-processing (5)
structured-data (5)
file-processing (5)
image (5)
mcp-server (5)
text-analysis (5)
latex (5)
invoice (5)
document-automation (4)
ai-agents (4)
playwright (4)
parser (4)
tool (4)
pdf-tools (4)
pdf-reader (4)
cross-platform (4)
markdown-to-pdf (4)
opensearch (4)
pdf-merge (4)
docdigitizer (4)
text-generation (4)
search (4)
pageindex (4)
embedding (4)
docling (4)