pypi.org "document processing" keyword
rainbow-pdf-processor 0.1.0
A powerful PDF processing tool with text extraction, table recognition, and image extraction capa...
1 version - Latest release: about 1 year ago - 10 downloads last month - 0 stars on GitHub - 1 maintainer
detect-row 2.0.5
Complete system for extracting tables, rows, and columns, with AI and GPU support
13 versions - Latest release: 10 months ago - 30 downloads last month - 1 maintainer
doctr-labeler 0.3.1
A Python package for labeling and annotating documents
12 versions - Latest release: about 2 months ago - 168 downloads last month - 15 stars on GitHub - 1 maintainer
docu-devs-api-client 1.9.0
A client library for accessing DocuDevs API
38 versions - Latest release: 19 days ago - 665 downloads last month - 1 maintainer
llmgraphtransformer 0.1.0
A powerful tool for transforming documents into graph-based structures using Large Language Model...
3 versions - Latest release: about 1 year ago - 670 downloads last month - 6 stars on GitHub - 1 maintainer
bbox-align 0.2.8
A Python library that reorders bounding boxes generated by OCR engines into the correct reading o...
11 versions - Latest release: 10 months ago - 578 downloads last month - 11 stars on GitHub - 1 maintainer
pydocai 0.1.0
Extract text from PDFs using pypdfium2 with OCR fallback via pytesseract
1 version - Latest release: 3 months ago - 24 downloads last month - 1 maintainer
docutray 0.2.0
Python library for the DocuTray API
2 versions - Latest release: 19 days ago - 1 maintainer
docling-agent 0.1.0
A Python library to simplify agentic operations on documents, such as writing, editing, summarizi...
1 version - Latest release: 16 days ago
intelisys 0.5.6
Intelligence/AI services for the Lifsys Enterprise with enhanced max_history_words, efficient his...
37 versions - Latest release: over 1 year ago - 143 downloads last month - 0 stars on GitHub - 1 maintainer
docling-ocr-onnxtr 0.2.1 💰
Onnx Text Recognition (OnnxTR) OCR plugin for docling
6 versions - Latest release: 3 months ago - 27 thousand downloads last month - 11 stars on GitHub - 1 maintainer
chunklet-py 2.2.0
High-fidelity context-aware chunking and interactive visualization for RAG. Advanced segmentation...
8 versions - Latest release: 2 months ago - 283 downloads last month - 62 stars on GitHub - 1 maintainer
onnxtr 0.8.1 💰
Onnx Text Recognition (OnnxTR): docTR Onnx-Wrapper for high-performance OCR on documents.
18 versions - Latest release: 3 months ago - 83.2 thousand downloads last month - 159 stars on GitHub - 1 maintainer
arkeo 0.2.6
markdown archiver betasaurus
8 versions - Latest release: 5 months ago - 53 downloads last month - 1 maintainer
docnav 1.0.1
AI-powered document querying with citations
2 versions - Latest release: 3 months ago - 53 downloads last month - 1 maintainer
pdfsmith 0.2.0
PDF to Markdown conversion with multiple backend support
1 version - Latest release: 5 months ago - 23 downloads last month - 1 star on GitHub - 1 maintainer
file-parse-by-bajirao 0.1.0
Universal Document Processor for LLM Processing - extracts text, tables, numeric data, and metada...
1 version - Latest release: 5 months ago - 15 downloads last month - 1 maintainer
textextraction 0.1.4
Extract and process text from images and PDFs
5 versions - Latest release: about 1 year ago - 43 downloads last month - 0 stars on GitHub - 1 maintainer
llama-index-readers-layoutir 0.1.1
llama-index readers LayoutIR integration
1 version - Latest release: 2 months ago - 135 downloads last month - 1 maintainer
lex-pdftotext 1.0.0
Extract and structure text from Brazilian legal PDF documents (PJe format)
1 version - Latest release: 4 months ago - 19 downloads last month - 1 maintainer
chunklet 1.4.0
A smart multilingual text chunker for LLMs, RAG, and beyond.
19 versions - Latest release: 8 months ago - 162 downloads last month - 23 stars on GitHub - 1 maintainer
doc2data 0.2.0
Integrated document processing with machine learning.
3 versions - Latest release: over 3 years ago - 1 dependent repository - 188 downloads last month - 10 stars on GitHub - 1 maintainer
py-document-chunker 0.3.0 removed
A state-of-the-art Python package for advanced text segmentation (chunking).
2 versions - Latest release: 7 months ago - 265 downloads last month - 0 stars on GitHub - 1 maintainer
Related Keywords
ocr (9)
pdf (6)
text extraction (6)
rag (5)
OCR (5)
natural language processing (4)
ai (4)
llm (4)
markdown (3)
computer vision (3)
chunking (3)
machine learning (3)
nlp (3)
deep learning (3)
docTR (3)
document analysis (2)
text recognition (2)
text detection (2)
onnx (2)
openai (2)
docling (2)
RAG (2)
document AI (2)
deep-learning (2)
onnxruntime (2)
text-detection (2)
text-recognition (2)
text-splitting (2)
multilingual (2)
text processing (2)
data processing (2)
information retrieval (2)
semantic search (2)
text-detection-recognition (2)
onnxtr (2)
table extraction (2)
image processing (2)
claude (1)
layout analysis (1)
layoutir (1)
chunks-processing (1)
legal documents (1)
chunks-algorithm (1)
brazilian law (1)
pje (1)
lex-intelligentia (1)
pdf parsing (1)
text chunking (1)
text splitting (1)
chunk-visualization (1)
Retrieval-Augmented Generation (1)
NLP (1)
langchain (1)
llamaindex (1)
document (1)
code structure (1)
programming languages (1)
source code analysis (1)
gemini (1)
citations (1)
text analysis (1)
document management (1)
retrieval augmented generation (1)
query (1)
search (1)
pdf-markdown-pdf-parser-ocr (1)
corpus (1)
docx (1)
xlsx (1)
indexing (1)
archiving (1)
optical-character-recognition (1)
document-recognition (1)
csv (1)
visualization (1)
table detection (1)
natural-language-processing (1)
document-chunking (1)
code-structure (1)
code-chunking (1)
IR (1)
PDF (1)
code-chunker (1)
lines (1)
bounding boxes (1)
document to graph (1)
text to graph (1)
graph transformation (1)
LLM (1)
document to data (1)
docudevs client (1)
ai document processing (1)
labeling-tool (1)
doctr (1)
automation (1)
OnnxTR (1)
annotation (1)
labeling (1)
automated extraction (1)
GPU (1)