An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "documents" keyword

View the packages on the pypi.org package registry that are tagged with the "documents" keyword.

Top 1.7% on pypi.org
svglib 1.6.0
A pure-Python library for reading and converting SVG
23 versions - Latest release: about 2 months ago - 93 dependent packages - 1,712 dependent repositories - 3.25 million downloads last month - 315 stars on GitHub - 2 maintainers
pyugt 1.0.10
Universal Game Translator from on-screen text in Python
32 versions - Latest release: almost 2 years ago - 1 dependent repository - 151 downloads last month - 32 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
pyocr 0.8.5
A Python wrapper for OCR engines (Tesseract, Cuneiform, etc)
30 versions - Latest release: about 2 years ago - 5 dependent packages - 255 dependent repositories - 29.1 thousand downloads last month - 7,831 stars on GitHub - 1 maintainer
docler 1.0.2 💰
Abstractions & Tools for OCR / document processing
30 versions - Latest release: about 1 month ago - 160 downloads last month - 2 stars on GitHub - 1 maintainer
Top 1.8% on pypi.org
jsonpath-ng 1.7.0
A final implementation of JSONPath for Python that aims to be standard compliant, including arith...
10 versions - Latest release: about 1 year ago - 214 dependent packages - 1,548 dependent repositories - 60 million downloads last month - 702 stars on GitHub - 2 maintainers
Top 1.0% on pypi.org
cerberus 1.3.8 💰
Lightweight, extensible schema and data validation tool for Python dictionaries.
30 versions - Latest release: 9 days ago - 140 dependent packages - 2,028 dependent repositories - 5.3 million downloads last month - 3,114 stars on GitHub - 2 maintainers
opendataloader-pdf 1.2.0
A Python wrapper for the opendataloader-pdf Java CLI.
25 versions - Latest release: about 23 hours ago - 1.37 thousand downloads last month - 702 stars on GitHub - 1 maintainer
llama-index-readers-docling 0.4.1
llama-index readers docling integration
8 versions - Latest release: 2 months ago - 13.9 thousand downloads last month - 27,013 stars on GitHub - 1 maintainer
llama-index-node-parser-docling 0.4.1
llama-index node_parser docling integration
7 versions - Latest release: 2 months ago - 10.2 thousand downloads last month - 28,777 stars on GitHub - 1 maintainer
unoserver-appx 1.0.1
A server for file conversions with Libre Office
2 versions - Latest release: over 1 year ago - 8 downloads last month - 778 stars on GitHub - 1 maintainer
docdocgo 0.1.0
Insert pandas DataFrames as tables into Microsoft Word documents
1 version - Latest release: about 2 months ago - 28 downloads last month - 1 maintainer
unstructured-platform 0.4.3
Python SDK for the Unstructured Platform API
4 versions - Latest release: 10 months ago - 13 downloads last month - 1 maintainer
backboard-sdk 1.4.3
Python SDK for the Backboard API - Build conversational AI applications with persistent memory an...
17 versions - Latest release: 5 days ago - 373 downloads last month - 1 maintainer
pr2md 1.0.10 💰
Pull Request Markdown Generator
9 versions - Latest release: 5 days ago - 747 downloads last month - 0 stars on GitHub - 1 maintainer
docsumm-ai 0.1.0
Audience-aware document summarizer for PDF/DOCX/TXT — optimized for context retention, not token ...
1 version - Latest release: about 1 month ago - 56 downloads last month - 0 stars on GitHub - 1 maintainer
whakerkit 2.0
The implementation of a WhakerKit project website.
4 versions - Latest release: about 1 month ago - 83 downloads last month - 1 maintainer
expose-text 0.1.6
A Python module that exposes text for modification in multiple file types.
4 versions - Latest release: about 5 years ago - 1 dependent repository - 24 downloads last month - 18 stars on GitHub - 1 maintainer
esg-classification 0.1.3
Pipeline for classifying sustainability reports with heuristics and LLM backends.
4 versions - Latest release: about 1 month ago - 73 downloads last month - 1 maintainer
nr-documents-app 1.0.8
Application package for NR documents site
9 versions - Latest release: over 3 years ago - 1 dependent repository - 16 downloads last month - 0 stars on GitHub - 1 maintainer
frat 2.0.7
Fast Rectangle Annotation Tool
6 versions - Latest release: over 2 years ago - 1 dependent repository - 21 downloads last month - 9 stars on GitHub - 1 maintainer
gdoc-down 0.0.10
Download Google documents to files
5 versions - Latest release: over 5 years ago - 1 dependent repository - 37 downloads last month - 15 stars on GitHub - 2 maintainers
docling-google-ocr 2.13.1
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
2 versions - Latest release: 10 months ago - 109 downloads last month - 38,808 stars on GitHub - 1 maintainer
gradio-test2 0.0.2
This is a test component
1 version - Latest release: about 2 years ago - 13 downloads last month - 1 maintainer
pwr 2.6
This program helps you write html pages like documents in word processors
3 versions - Latest release: about 12 years ago - 4 dependent repositories - 42 downloads last month - 0 stars on GitHub - 1 maintainer
akoma2md 2.0.8
Converter from Akoma Ntoso XML to Markdown format, with automatic download of the cited laws and ...
33 versions - Latest release: 10 days ago - 30.5 thousand downloads last month
dsw-tdk 4.24.0 💰
Data Stewardship Wizard Template Development Toolkit
121 versions - Latest release: 11 days ago - 1 dependent repository - 1.03 thousand downloads last month - 4 stars on GitHub - 1 maintainer
peslac 0.1.4
A Python package for the Peslac API
5 versions - Latest release: 10 months ago - 22 downloads last month - 0 stars on GitHub - 1 maintainer
scipreprocess 0.1.2
A modular pipeline for preprocessing scientific documents (PDF, DOCX, TEX, XML, TXT)
3 versions - Latest release: about 1 month ago - 180 downloads last month - 1 maintainer
llama-index-readers-preprocess 0.4.1
llama-index readers preprocess integration
10 versions - Latest release: 2 months ago - 40 downloads last month - 4 stars on GitHub - 1 maintainer
modoboa-pdfcredentials 1.5.0 💰
Generate PDF documents containing user credentials
15 versions - Latest release: over 3 years ago - 2 dependent repositories - 130 downloads last month - 8 stars on GitHub - 1 maintainer
scrapontologies 1.1.0 💰
Library for extracting schemas and building ontologies from documents using LLM
2 versions - Latest release: about 1 year ago - 11 downloads last month - 18,931 stars on GitHub - 2 maintainers
django-docmgr 0.6.2
A pluggable Django app to reference documents (files) in models.
2 versions - Latest release: about 2 months ago - 35 downloads last month - 1 maintainer
ffmulticonverter 1.7.1
GUI File Format Converter
13 versions - Latest release: about 2 years ago - 2 dependent repositories - 8 downloads last month - 1 maintainer
docman 0.0.6
Document Manager
6 versions - Latest release: over 1 year ago - 33 downloads last month - 0 stars on GitHub - 1 maintainer
adamsnrc 2.0.13
A robust Python client for the Nuclear Regulatory Commission's ADAMS API
1 version - Latest release: about 2 months ago - 1 maintainer
docsvault 0.1.4
Web app used to securely version your documents on git
6 versions - Latest release: about 5 years ago - 1 dependent repository - 31 downloads last month - 1 maintainer
snaketex 0.1.3
A LaTeX template system for large and multi-user projects.
5 versions - Latest release: over 9 years ago - 1 dependent repository - 14 downloads last month - 1 star on GitHub - 1 maintainer
cognify-sdk 0.1.0
Python SDK for Cognify AI platform
1 version - Latest release: 5 months ago - 22 downloads last month - 1 maintainer
docling 2.59.0
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
134 versions - Latest release: 16 days ago - 1.52 million downloads last month - 78 stars on GitHub - 1 maintainer
docling-enhanced 2.32.0
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
1 version - Latest release: 6 months ago - 34 downloads last month - 42,147 stars on GitHub - 1 maintainer
unoserver-fork 1.3.1
A server for file conversions with Libre Office. With function to update index
2 versions - Latest release: about 3 years ago - 9 downloads last month - 0 stars on GitHub - 1 maintainer
docling-sdg 0.4.0
Docling for Synthetic Data Generation (SDG) provides a set of tools to create artificial data fro...
6 versions - Latest release: 3 months ago - 183 downloads last month - 31 stars on GitHub - 1 maintainer
smartchunkllm 0.1.7
Advanced Legal Document Semantic Chunking System
8 versions - Latest release: 3 months ago - 39 downloads last month - 1 maintainer
doxx 0.9.4
A Simple, Flexible Text Templating, Build, & Project Distribution System
5 versions - Latest release: over 10 years ago - 2 dependent repositories - 50 downloads last month - 1 maintainer
ffconverter 2.4.6
File Format Converter with Qt GUI
22 versions - Latest release: about 1 year ago - 196 downloads last month - 5 stars on GitHub - 1 maintainer
pluggedinkit 1.0.1
Official Python SDK for Plugged.in Library API
2 versions - Latest release: about 2 months ago - 33 downloads last month - 0 stars on GitHub - 1 maintainer
dict-curation 0.0.3
A package for curating dictionaries (esp in babylon and stardict formats).
2 versions - Latest release: almost 5 years ago - 1 dependent repository - 16 downloads last month - 1 maintainer
sleepyconvert 1.0.5
Converts data files, images and documents to different formats
9 versions - Latest release: 7 months ago - 35 downloads last month - 0 stars on GitHub - 1 maintainer
contextf 0.0.1
Efficient context builder
1 version - Latest release: 18 days ago - 0 stars on GitHub - 1 maintainer
Top 8.1% on pypi.org
stencila 2.0.0b2
Python SDK for Stencila
12 versions - Latest release: over 1 year ago - 1 dependent repository - 76 downloads last month - 845 stars on GitHub - 1 maintainer
stencila_types 2.0.0b2
Python types for Stencila
7 versions - Latest release: over 1 year ago - 53 downloads last month - 845 stars on GitHub - 1 maintainer
stencila_plugin 2.0.0b3
Library for building Stencila Plugins
12 versions - Latest release: over 1 year ago - 57 downloads last month - 845 stars on GitHub - 1 maintainer
ashtadhyayi-data 0.0.3 💰
A package for curating doc file collections, with ability to sync with youtube and archive.org do...
2 versions - Latest release: about 3 years ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
papermerge-core 2.1.5
Open source document management system for digital archives
77 versions - Latest release: almost 3 years ago - 4 dependent repositories - 280 downloads last month - 388 stars on GitHub - 1 maintainer
mimeogram 1.5
Exchange of file collections with LLMs.
19 versions - Latest release: 4 months ago - 69 downloads last month - 4 stars on GitHub - 1 maintainer
svglibwheel 0.1
A pure-Python library for reading and converting SVG
1 version - Latest release: over 1 year ago - 9 downloads last month - 347 stars on GitHub - 1 maintainer
stencila-pyla 0.3.1
Python interpreter for executable documents
9 versions - Latest release: almost 5 years ago - 1 dependent repository - 28 downloads last month - 2 stars on GitHub - 1 maintainer
docowling 1.0.17
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
17 versions - Latest release: 10 months ago - 73 downloads last month - 3 stars on GitHub - 1 maintainer
wdoc 4.1.1
A perfect AI powered RAG for document query and summary. Supports ~all LLM and ~all filetypes (ur...
112 versions - Latest release: 19 days ago - 1.11 thousand downloads last month - 46 stars on GitHub - 1 maintainer
similar-documents 0.1.4
Generate similarity scores for documents from cli
6 versions - Latest release: over 4 years ago - 1 dependent repository - 22 downloads last month - 8 stars on GitHub - 1 maintainer
nr-documents-records-model-builder 1.0.0
OARepo model builder extension for NR document records
2 versions - Latest release: about 3 years ago - 15 downloads last month - 1 maintainer
libreserver 0.1.1
A server for file conversions with Libre Office
2 versions - Latest release: almost 2 years ago - 106 downloads last month - 0 stars on GitHub - 1 maintainer
doc-curation 0.1.20 💰
A package for curating doc file collections, with ability to sync with youtube and archive.org do...
28 versions - Latest release: almost 2 years ago - 1 dependent package - 2 dependent repositories - 967 downloads last month - 7 stars on GitHub - 1 maintainer
doctoolsllm 0.99.0
(Now winston_doc) A perfect AI powered RAG for document query and summary. Supports ~all LLM and ...
21 versions - Latest release: over 1 year ago - 128 downloads last month - 480 stars on GitHub - 1 maintainer
unstructured-expanded 0.17.2
Expansion to the unstructured package, adding support for image extraction.
10 versions - Latest release: 6 months ago - 147 downloads last month - 0 stars on GitHub - 1 maintainer
redacted-py 1.0.8
Redacting classified documents
4 versions - Latest release: 10 months ago - 32 downloads last month - 3 stars on GitHub - 1 maintainer
cdpdumpingutils 0.2.1
Download all your course files from cahier de prepa
2 versions - Latest release: over 2 years ago - 24 downloads last month - 18 stars on GitHub - 1 maintainer
docdump 1.0.4
A package to extract text from common document types.
5 versions - Latest release: almost 5 years ago - 1 dependent repository - 45 downloads last month - 0 stars on GitHub - 1 maintainer
tei-chunker 0.1.0
Hierarchical document chunking for TEI XML documents
1 version - Latest release: 9 months ago - 7 downloads last month - 1 maintainer
classify-bills 1.0.1
Automatically sort and archive PDF bills and statements
2 versions - Latest release: over 6 years ago - 1 dependent repository - 10 downloads last month - 3 stars on GitHub - 1 maintainer
docuver 0.1.0
A meta tool for version control of Office documents (docx, xlsx, pptx, odt, ods, odp)
1 version - Latest release: 3 months ago - 15 downloads last month
extended-docling 2.12.1
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
1 version - Latest release: 11 months ago - 22 downloads last month - 42,147 stars on GitHub - 1 maintainer
docx-footer-extractor 1.0.0
A Python library for extracting metadata from DOCX file footers using parallel processing
1 version - Latest release: 4 months ago - 83 downloads last month - 0 stars on GitHub - 1 maintainer
quickdocs 1.6.3
Creates HTML docs from a project's readme and sphinx-apidoc.
10 versions - Latest release: over 4 years ago - 1 dependent package - 1 dependent repository - 38 downloads last month - 1 star on GitHub - 1 maintainer
dedoc 2.3.2
Extract content and logical tree structure from textual documents
32 versions - Latest release: 11 months ago - 1.3 thousand downloads last month - 592 stars on GitHub - 1 maintainer
jsonpath-ig 1.5.3
A final implementation of JSONPath for Python that aims to be standard compliant, including arith...
2 versions - Latest release: over 4 years ago - 1 dependent repository - 20 downloads last month - 589 stars on GitHub - 1 maintainer
jsonpath-ng-i 1.0.3
A final implementation of JSONPath for Python that aims to be standard compliant, including arith...
4 versions - Latest release: over 4 years ago - 1 dependent repository - 25 downloads last month - 590 stars on GitHub - 1 maintainer
nr-documents-records 1.0.0
NR documents data model
2 versions - Latest release: about 3 years ago - 14 downloads last month - 1 maintainer
docoskin 0.1.0
"Onion-skin" visual differences between a reference document image and a scanned copy
1 version - Latest release: almost 6 years ago - 1 dependent repository - 14 downloads last month - 2 stars on GitHub - 1 maintainer
blazingdocs 1.0.1
BlazingDocs Python client
1 version - Latest release: about 4 years ago - 1 dependent repository - 24 downloads last month - 2 stars on GitHub - 1 maintainer
vecsync 0.7.0
A simple command-line utility for synchronizing documents to vector storage for LLM interaction.
8 versions - Latest release: 4 months ago - 41 downloads last month - 0 stars on GitHub - 1 maintainer
oldp 0.8.0
Open Legal Data Platform
2 versions - Latest release: over 6 years ago - 1 dependent repository - 16 downloads last month - 121 stars on GitHub - 1 maintainer
egnyte-langchain-connector 0.0.4
LangChain retriever integration for Egnyte document search and retrieval
4 versions - Latest release: 24 days ago - 269 downloads last month - 1 maintainer
edocuments 1.1.0
eDocuments - a simple and productive personal documents library
10 versions - Latest release: over 7 years ago - 2 dependent repositories - 58 downloads last month - 2 stars on GitHub - 1 maintainer
zen_document_parser 0.11
A library for parsing various government documents as well as general PDFs
1 version - Latest release: over 9 years ago - 2 dependent repositories - 10 downloads last month - 2 stars on GitHub - 1 maintainer
Top 6.3% on pypi.org
boxdetect 1.0.2 💰
boxdetect is a Python package based on OpenCV which allows you to easily detect rectangular shape...
5 versions - Latest release: almost 3 years ago - 1 dependent package - 3 dependent repositories - 8.27 thousand downloads last month - 111 stars on GitHub - 1 maintainer
invenio-documents 0.1.0.post1
Invenio module that adds filesystem abstraction.
3 versions - Latest release: about 2 years ago - 1 dependent package - 2 dependent repositories - 18 downloads last month - 1 star on GitHub - 4 maintainers
draftable-compare-api 1.4.3
Client library for the Draftable document comparison API
20 versions - Latest release: 3 months ago - 2 dependent repositories - 2.95 thousand downloads last month - 9 stars on GitHub - 2 maintainers
smog 0.0.4 removed
simple media organizer
4 versions - Latest release: about 3 years ago - 19 downloads last month - 1 star on GitHub - 1 maintainer
marshmallow-br 0.1.1
An unofficial extension to Marshmallow fields and validators for Brazilian documents
3 versions - Latest release: almost 3 years ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
mongoengine_fuel 1.0.3
Factory for MongoDB documents created with mongoengine
4 versions - Latest release: about 2 years ago - 2 dependent repositories - 4 downloads last month - 25 stars on GitHub - 1 maintainer
srsparser 1.4.9
A library that translates semi-structured documents into a structured form and contains natural l...
64 versions - Latest release: over 3 years ago - 1 dependent repository - 73 downloads last month - 2 stars on GitHub - 1 maintainer
rustpy-toolkit 0.1.1
High-performance Polars expressions for Brazilian document validation and text processing
2 versions - Latest release: 5 months ago - 32 downloads last month - 1 maintainer
ragdex 0.2.3
RAG-powered document indexing and search for MCP (Model Context Protocol)
8 versions - Latest release: 24 days ago - 804 downloads last month - 0 stars on GitHub
iddt 0.1.13
Internet Document Discovery Tool
12 versions - Latest release: almost 10 years ago - 2 dependent repositories - 15 downloads last month - 0 stars on GitHub - 1 maintainer
dotloop 1.2.0
Python wrapper for Dotloop API - Real estate transaction management and document handling
4 versions - Latest release: 3 months ago - 113 downloads last month - 1 maintainer
kindlepy 1.0.0
CLI tool for mailing your documents to your kindle device.
1 version - Latest release: almost 9 years ago - 1 dependent repository - 7 downloads last month - 2 stars on GitHub - 1 maintainer
epaper 0.0.0
A simple and productive personal documents library
1 version - Latest release: about 2 years ago - 2 dependent repositories - 2 stars on GitHub - 1 maintainer
word2html 0.3.0
A quick and dirty script to convert a Word (docx) document to html.
5 versions - Latest release: over 4 years ago - 1 dependent repository - 27 downloads last month - 53 stars on GitHub - 1 maintainer
bundestag-api 1.3.0
Python wrapper for the official Bundestag-API
10 versions - Latest release: 26 days ago - 82 downloads last month - 3 stars on GitHub - 1 maintainer