pypi.org "document processing" keyword
chunklet-py 2.2.0
High-fidelity context-aware chunking and interactive visualization for RAG. Advanced segmentation...
8 versions - Latest release: 16 days ago - 494 downloads last month - 62 stars on GitHub - 1 maintainer
docnav 1.0.1
AI-powered document querying with citations
2 versions - Latest release: about 2 months ago - 88 downloads last month - 1 maintainer
onnxtr 0.8.1 💰
Onnx Text Recognition (OnnxTR): docTR Onnx-Wrapper for high-performance OCR on documents.
18 versions - Latest release: about 1 month ago - 56.4 thousand downloads last month - 159 stars on GitHub - 1 maintainer
chunklet 1.4.0
A smart multilingual text chunker for LLMs, RAG, and beyond.
19 versions - Latest release: 6 months ago - 162 downloads last month - 23 stars on GitHub - 1 maintainer
lex-pdftotext 1.0.0
Extract and structure text from Brazilian legal PDF documents (PJe format)
1 version - Latest release: 3 months ago - 19 downloads last month - 1 maintainer
arkeo 0.2.6
markdown archiver betasaurus
8 versions - Latest release: 3 months ago - 185 downloads last month - 1 maintainer
pdfsmith 0.2.0
PDF to Markdown conversion with multiple backend support
1 version - Latest release: 3 months ago - 23 downloads last month - 1 star on GitHub - 1 maintainer
textextraction 0.1.4
Extract and process text from images and PDFs
5 versions - Latest release: 11 months ago - 27 downloads last month - 0 stars on GitHub - 1 maintainer
doc2data 0.2.0
Integrated document processing with machine learning.
3 versions - Latest release: over 3 years ago - 1 dependent repository - 123 downloads last month - 10 stars on GitHub - 1 maintainer
llama-index-readers-layoutir 0.1.1
llama-index readers LayoutIR integration
1 version - Latest release: 20 days ago - 84 downloads last month - 1 maintainer
file-parse-by-bajirao 0.1.0
Universal Document Processor for LLM Processing - extracts text, tables, numeric data, and metada...
1 version - Latest release: 3 months ago - 42 downloads last month - 1 maintainer
docling-ocr-onnxtr 0.2.1 💰
Onnx Text Recognition (OnnxTR) OCR plugin for docling
6 versions - Latest release: about 1 month ago - 27 thousand downloads last month - 11 stars on GitHub - 1 maintainer
doctr-labeler 0.2.1
A Python package for labeling and annotating documents
9 versions - Latest release: 3 months ago - 168 downloads last month - 15 stars on GitHub - 1 maintainer
intelisys 0.5.6
Intelligence/AI services for the Lifsys Enterprise with enhanced max_history_words, efficient his...
37 versions - Latest release: over 1 year ago - 218 downloads last month - 0 stars on GitHub - 1 maintainer
docu-devs-api-client 1.6.1
A client library for accessing DocuDevs API
36 versions - Latest release: 29 days ago - 1.02 thousand downloads last month - 1 maintainer
detect-row 2.0.5
Complete table, row, and column extraction system with AI and GPU support
13 versions - Latest release: 9 months ago - 87 downloads last month - 1 maintainer
rainbow-pdf-processor 0.1.0
A powerful PDF processing tool with text extraction, table recognition, and image extraction capa...
1 version - Latest release: 11 months ago - 11 downloads last month - 0 stars on GitHub - 1 maintainer
llmgraphtransformer 0.1.0
A powerful tool for transforming documents into graph-based structures using Large Language Model...
3 versions - Latest release: about 1 year ago - 716 downloads last month - 6 stars on GitHub - 1 maintainer
bbox-align 0.2.8
A Python library that reorders bounding boxes generated by OCR engines into the correct reading o...
11 versions - Latest release: 8 months ago - 860 downloads last month - 2 stars on GitHub - 1 maintainer
pydocai 0.1.0
Extract text from PDFs using pypdfium2 with OCR fallback via pytesseract
1 version - Latest release: about 1 month ago
py-document-chunker 0.3.0 (removed)
A state-of-the-art Python package for advanced text segmentation (chunking).
2 versions - Latest release: 6 months ago - 265 downloads last month - 0 stars on GitHub - 1 maintainer
Related Keywords
ocr (9)
pdf (6)
text extraction (6)
OCR (5)
rag (5)
llm (4)
ai (4)
natural language processing (4)
deep learning (3)
computer vision (3)
docTR (3)
nlp (3)
chunking (3)
machine learning (3)
markdown (3)
text-detection-recognition (2)
text-detection (2)
onnxruntime (2)
text-recognition (2)
deep-learning (2)
document AI (2)
document analysis (2)
table extraction (2)
text recognition (2)
text detection (2)
onnx (2)
onnxtr (2)
image processing (2)
openai (2)
text-splitting (2)
data processing (2)
text processing (2)
information retrieval (2)
RAG (2)
semantic search (2)
multilingual (2)
bounding boxes (1)
AI (1)
vietnamese (1)
column extraction (1)
row detection (1)
document to data (1)
docudevs client (1)
ai document processing (1)
openrouter (1)
groq (1)
google (1)
anthropic (1)
intelligence (1)
document (1)
labeling-tool (1)
doctr (1)
automation (1)
OnnxTR (1)
brazilian law (1)
document to graph (1)
lines (1)
reorder (1)
pypdfium2 (1)
tesseract (1)
docutray (1)
data extraction (1)
api (1)
text chunking (1)
text to graph (1)
text splitting (1)
Retrieval-Augmented Generation (1)
graph transformation (1)
NLP (1)
langchain (1)
llamaindex (1)
LLM (1)
image extraction (1)
table recognition (1)
automated extraction (1)
GPU (1)
legal documents (1)
optical-character-recognition (1)
document-recognition (1)
claude (1)
gemini (1)
citations (1)
text analysis (1)
document management (1)
retrieval augmented generation (1)
query (1)
search (1)
chunk-visualization (1)
code structure (1)
programming languages (1)
source code analysis (1)
code-chunker (1)
code chunking (1)
document-chunker (1)
sentence-splitting (1)
annotation (1)
labeling (1)
document-processing (1)
plugin (1)
docling (1)