An open API service providing package, version, and dependency metadata for many open source software ecosystems and registries.

pypi.org "document processing" keyword

View the packages on the pypi.org package registry that are tagged with the "document processing" keyword.

py-document-chunker 0.3.0
A state-of-the-art Python package for advanced text segmentation (chunking).
2 versions - Latest release: 17 days ago - 265 downloads last month - 0 stars on GitHub - 1 maintainer
docling-ocr-onnxtr 0.2.0 💰
Onnx Text Recognition (OnnxTR) OCR plugin for docling
5 versions - Latest release: 3 months ago - 13 thousand downloads last month - 11 stars on GitHub - 1 maintainer
onnxtr 0.8.0 💰
Onnx Text Recognition (OnnxTR): docTR Onnx-Wrapper for high-performance OCR on documents.
17 versions - Latest release: about 2 months ago - 20.9 thousand downloads last month - 152 stars on GitHub - 1 maintainer
doctr-labeler 0.2.0
A Python package for labeling and annotating documents
8 versions - Latest release: 5 months ago - 103 downloads last month - 15 stars on GitHub - 1 maintainer
chunklet-py 1.4.0
A smart multilingual text chunker for LLMs, RAG, and beyond.
1 version - Latest release: about 1 month ago - 212 downloads last month - 34 stars on GitHub - 1 maintainer
arkeo 0.1.0
markdown archiver betasaurus
1 version - Latest release: 3 months ago - 26 downloads last month - 1 maintainer
textextraction 0.1.4
Extract and process text from images and PDFs
5 versions - Latest release: 6 months ago - 36 downloads last month - 0 stars on GitHub - 1 maintainer
docu-devs-api-client 1.1.0
A client library for accessing DocuDevs API
19 versions - Latest release: 12 days ago - 1.08 thousand downloads last month - 1 maintainer
chunklet 1.4.0
A smart multilingual text chunker for LLMs, RAG, and beyond.
19 versions - Latest release: about 1 month ago - 1.31 thousand downloads last month - 23 stars on GitHub - 1 maintainer
intelisys 0.5.6
Intelligence/AI services for the Lifsys Enterprise with enhanced max_history_words, efficient his...
37 versions - Latest release: about 1 year ago - 112 downloads last month - 0 stars on GitHub - 1 maintainer
llmgraphtransformer 0.1.0
A powerful tool for transforming documents into graph-based structures using Large Language Model...
3 versions - Latest release: 7 months ago - 65 downloads last month - 6 stars on GitHub - 1 maintainer
doc2data 0.2.0
Integrated document processing with machine learning.
3 versions - Latest release: almost 3 years ago - 1 dependent repository - 25 downloads last month - 10 stars on GitHub - 1 maintainer
bbox-align 0.2.8
A Python library that reorders bounding boxes generated by OCR engines into the correct reading o...
11 versions - Latest release: 3 months ago - 144 downloads last month - 2 stars on GitHub - 1 maintainer
detect-row 2.0.5
A complete table, row, and column extraction system with AI and GPU support
13 versions - Latest release: 3 months ago - 60 downloads last month - 1 maintainer
rainbow-pdf-processor 0.1.0
A powerful PDF processing tool with text extraction, table recognition, and image extraction capa...
1 version - Latest release: 6 months ago - 11 downloads last month - 0 stars on GitHub - 1 maintainer