pypi.org "document" keyword
View the packages on the pypi.org package registry that are tagged with the "document" keyword.
ppmd 0.1.1
A tool to convert from PowerPoint OOXML files to Markdown Presentations
2 versions - Latest release: over 8 years ago - 1 dependent repository - 133 downloads last month - 4 stars on GitHub - 1 maintainer
stencila_types 2.0.0b2
Python types for Stencila
7 versions - Latest release: 9 months ago - 217 downloads last month - 820 stars on GitHub - 1 maintainer
stencila 2.0.0b2
Python SDK for Stencila
12 versions - Latest release: 9 months ago - 1 dependent repository - 2.3 thousand downloads last month - 820 stars on GitHub - 1 maintainer
Top 8.1% on pypi.org
stencila_plugin 2.0.0b3
Library for building Stencila Plugins
12 versions - Latest release: 9 months ago - 312 downloads last month - 820 stars on GitHub - 1 maintainer
papis 0.14.1
Powerful and highly extensible command-line based document and bibliography manager
42 versions - Latest release: about 2 months ago - 6 dependent repositories - 1.3 thousand downloads last month - 1,489 stars on GitHub - 1 maintainer
Top 6.3% on pypi.org
nested-lookup 0.2.25
Python functions for working with deeply nested documents (lists and dicts)
26 versions - Latest release: almost 3 years ago - 21 dependent packages - 189 dependent repositories - 3.12 million downloads last month - 1 maintainer
Top 0.8% on pypi.org
etherpump 0.0.20
Pumping text from etherpads into publications
20 versions - Latest release: about 4 years ago - 1 dependent repository - 518 downloads last month - 17,317 stars on GitHub - 2 maintainers
hycli 0.6.0
Hypatos cli tool to batch extract documents through the API and to compare the results.
40 versions - Latest release: over 3 years ago - 1 dependent repository - 400 downloads last month - 1 maintainer
rer.structured_content 1.9.3
A simple folderish page content for Plone
11 versions - Latest release: over 8 years ago - 2 dependent repositories - 174 downloads last month - 3 maintainers
doxhooks 0.6.0
Abstract away the content and maintenance of files in your project.
5 versions - Latest release: over 9 years ago - 1 dependent repository - 229 downloads last month - 1 star on GitHub - 1 maintainer
lds 0.0.1
A package written in Python that mimics some of the functionality provided by MongoDB.
1 version - Latest release: about 7 years ago - 1 dependent repository - 67 downloads last month - 0 stars on GitHub - 1 maintainer
yattag 1.16.1
Generate HTML or XML in a pythonic way. Pure python alternative to web template engines. Can fill ...
50 versions - Latest release: 6 months ago - 37 dependent packages - 337 dependent repositories - 615 thousand downloads last month - 323 stars on GitHub - 1 maintainer
Top 0.8% on pypi.org
doc2html 0.0.1
Convert documents between formats. For example docx to html or pdf to html
2 versions - Latest release: 9 months ago - 114 downloads last month - 0 stars on GitHub - 1 maintainer
sphinxjp.usaturn 0.2.0
a sphinx extension to add new admonition named `usaturn`
2 versions - Latest release: over 10 years ago - 3 dependent repositories - 91 downloads last month - 1 maintainer
docoskin 0.1.0
"Onion-skin" visual differences between a reference document image and a scanned copy
1 version - Latest release: about 5 years ago - 1 dependent repository - 28 downloads last month - 2 stars on GitHub - 1 maintainer
pydelorean 1.5.4
A package to convert between markup documents and a forest data structure for efficient processing.
12 versions - Latest release: over 1 year ago - 304 downloads last month - 1 star on GitHub - 1 maintainer
panpdf 0.7.1
Transform Jupyter Notebooks and Markdown into publication-quality PDF documents with high-fidelit...
40 versions - Latest release: 1 day ago - 1.91 thousand downloads last month - 0 stars on GitHub - 1 maintainer
fedfold 0.0.2
2 versions - Latest release: 11 months ago - 85 downloads last month - 1 maintainer
any-parser 0.0.22
Parser for all.
15 versions - Latest release: 2 days ago - 748 downloads last month - 120 stars on GitHub - 1 maintainer
smalldb 1.1
A lightweight (NoSQL) document oriented database, simple and easy to integrate into small apps
1 version - Latest release: over 4 years ago - 1 dependent repository - 27 downloads last month - 2 stars on GitHub - 1 maintainer
axa-fr-splitter 1.0.0
This package splits PDF and TIFF files into separate PNGs and extracts text from input files.
10 versions - Latest release: 6 months ago - 527 downloads last month - 3 stars on GitHub - 1 maintainer
autodocx 0.1.0
A tool for automating Microsoft Word Documents generation
1 version - Latest release: over 2 years ago - 62 downloads last month - 0 stars on GitHub - 1 maintainer
kodb 0.1.6
kmaasrud's opinionated document builder
7 versions - Latest release: about 4 years ago - 1 dependent repository - 254 downloads last month - 15 stars on GitHub - 1 maintainer
pixparse 0.1.0.dev0
1 version - Latest release: almost 2 years ago - 52 downloads last month - 1 maintainer
docext 0.1.12
Onprem information extraction from documents
10 versions - Latest release: 9 days ago - 753 downloads last month - 100 stars on GitHub - 1 maintainer
wia_scan 0.8.0
wia_scan 0.8.0
14 versions - Latest release: about 2 years ago - 509 downloads last month - 0 stars on GitHub - 1 maintainer
docfilter 0.1.0
The Python package docfilter is used to detect and remove inappropriate information from text.
1 version - Latest release: about 2 years ago - 62 downloads last month - 2 stars on GitHub - 1 maintainer
docnetdb 0.6.1
A pure Python document and graph database engine
7 versions - Latest release: almost 5 years ago - 1 dependent repository - 230 downloads last month - 2 stars on GitHub - 1 maintainer
rsv 1.5.3
A module for reading and writing an RSV document file.
7 versions - Latest release: 3 days ago - 750 downloads last month - 2 stars on GitHub - 1 maintainer
xyseg 0.0.2
Recursive Segmentation Algorithm
2 versions - Latest release: 6 months ago - 61 downloads last month - 1 star on GitHub - 1 maintainer
extractflow-parser 1.0.1
Parse PDF documents into markdown formatted content using Vision LLMs
5 versions - Latest release: 4 months ago - 190 downloads last month - 8 stars on GitHub - 1 maintainer
docuseal 1.0.4
DocuSeal Python API client
5 versions - Latest release: about 2 months ago - 1.81 thousand downloads last month - 1 star on GitHub - 1 maintainer
llama-cloud-services 0.6.12
Tailored SDK clients for LlamaCloud services.
13 versions - Latest release: 8 days ago - 2.54 million downloads last month - 3,878 stars on GitHub - 1 maintainer
craft-text-detector 0.4.3
Fast and accurate text detection library built on CRAFT implementation
13 versions - Latest release: almost 3 years ago - 1 dependent package - 8 dependent repositories - 5.35 thousand downloads last month - 12 stars on GitHub - 1 maintainer
blobstash 0.1.0
BlobStash client
1 version - Latest release: over 6 years ago - 1 dependent repository - 36 downloads last month - 2 stars on GitHub - 1 maintainer
docp 0.2.0
A basic document parsing and loading utility.
3 versions - Latest release: 2 months ago - 121 downloads last month - 0 stars on GitHub - 1 maintainer
pyjquery 2.3.4
The Write Less, Do More, JavaScript Library for DicksonUI
2 versions - Latest release: over 4 years ago - 1 dependent repository - 87 downloads last month - 1 maintainer
django-asset-convoy 0.1.3
Asset packager for Django applications
4 versions - Latest release: almost 11 years ago - 2 dependent repositories - 143 downloads last month - 36,832 stars on GitHub - 1 maintainer
aztex 0.0.3
A zippy markdown to latex compiler
3 versions - Latest release: almost 10 years ago - 2 dependent repositories - 106 downloads last month - 0 stars on GitHub - 1 maintainer
pdf-craft 0.0.19
PDF craft can convert PDF files into various other formats. This project will focus on processing...
15 versions - Latest release: 8 days ago - 5.09 thousand downloads last month - 1,983 stars on GitHub - 1 maintainer
relo 0.7-beta
Recursive Document Content Search in Python
4 versions - Latest release: over 13 years ago - 1 dependent repository - 186 downloads last month - 5 stars on GitHub - 1 maintainer
docts 0.2.3
python package docts
11 versions - Latest release: almost 2 years ago - 1 dependent repository - 401 downloads last month - 18 stars on GitHub - 1 maintainer
docling-ibm-models 3.4.1
This package contains the AI models used by the Docling PDF conversion package
38 versions - Latest release: about 2 months ago - 289 thousand downloads last month - 1,044 stars on GitHub - 1 maintainer
kooki 0.19.3
The ultimate document generator.
56 versions - Latest release: over 5 years ago - 1 dependent repository - 969 downloads last month - 1 star on gitlab.com - 1 maintainer
pxy 0.0.8
tools of python
8 versions - Latest release: almost 4 years ago - 2 dependent packages - 1 dependent repository - 273 downloads last month - 1 maintainer
docxer 0.0.1b0
The best way to work with DOCX in Python.
1 version - Latest release: about 2 years ago - 61 downloads last month - 4 stars on GitHub - 1 maintainer
contextgem 0.1.1
Easier and faster way to build LLM extraction workflows through powerful abstractions
3 versions - Latest release: 12 days ago - 338 downloads last month - 35 stars on GitHub - 1 maintainer
pyzerox-impacte 0.0.8
ocr documents using vision models from all popular providers like OpenAI, Azure OpenAI, Anthropic...
2 versions - Latest release: 4 months ago - 103 downloads last month - 1,701 stars on GitHub - 1 maintainer
mswordtree 0.1.1
Get the parsed microsoft word document in a hierarchical tree structure.
8 versions - Latest release: over 5 years ago - 1 dependent repository - 439 downloads last month - 6 stars on GitHub - 1 maintainer
llm-parse 0.1.4
Parse data from documents optimised for downstream llm tasks.
5 versions - Latest release: 7 months ago - 258 downloads last month - 3,859 stars on GitHub - 1 maintainer
amazon-textract-textractor 1.9.1
A package to use AWS Textract services.
68 versions - Latest release: 22 days ago - 3 dependent packages - 1 dependent repository - 188 thousand downloads last month - 438 stars on GitHub - 4 maintainers
Top 6.8% on pypi.org
docling 2.29.0
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
95 versions - Latest release: 9 days ago - 405 thousand downloads last month - 78 stars on GitHub - 1 maintainer
limberer 0.9.2
A flexible document generator based on weasyprint, mustache templates, and pandoc.
28 versions - Latest release: about 2 months ago - 569 downloads last month - 2 stars on GitHub - 1 maintainer
disseminate 2.3.9
A document processor and generation engine
9 versions - Latest release: almost 4 years ago - 1 dependent repository - 349 downloads last month - 5 stars on GitHub - 2 maintainers
fx-doc 0.12.0
Build reStructuredText to HTML, PDF and text
17 versions - Latest release: over 6 years ago - 1 dependent repository - 537 downloads last month - 1 star on GitHub - 1 maintainer
driveline-video 0.14.1
1533 Systems Driveline video decoder
2 versions - Latest release: about 6 years ago - 1 dependent repository - 54 downloads last month - 1 maintainer
beancount-docverif 1.0.1
Document verification plugin for Beancount
8 versions - Latest release: about 3 years ago - 1 dependent repository - 269 downloads last month - 3 stars on GitHub - 1 maintainer
beancount-payeeverif 1.0.2
Payee verification plugin for Beancount
2 versions - Latest release: about 3 years ago - 1 dependent repository - 83 downloads last month - 1 star on GitHub - 1 maintainer
docsilhouette 0.1.0
Document aesthetics and text extractor
1 version - Latest release: almost 3 years ago - 1 dependent repository - 22 downloads last month - 1 maintainer
quicksand 2.0.13
QuickSand is a module to scan streams inside documents with Yara
7 versions - Latest release: over 3 years ago - 2 dependent packages - 1 dependent repository - 1.66 thousand downloads last month - 116 stars on GitHub - 1 maintainer
docbucket 0.1
A simple document manager for individuals written with Django
1 version - Latest release: over 14 years ago - 2 dependent repositories - 59 downloads last month - 1 maintainer
groupdocs-annotation-cloud 23.12
GroupDocs.Annotation Cloud Python SDK
7 versions - Latest release: over 1 year ago - 1 dependent repository - 294 downloads last month - 0 stars on GitHub - 1 maintainer
extended-docling 2.12.1
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
1 version - Latest release: 4 months ago - 74 downloads last month - 26,056 stars on GitHub - 1 maintainer
imio.zamqp.dms 0.21.0
zamqp consumer for imio.dms.mail
14 versions - Latest release: about 2 months ago - 372 downloads last month - 0 stars on GitHub - 2 maintainers
taguette 1.4.1
Free and open source qualitative research tool
27 versions - Latest release: about 2 years ago - 1 dependent repository - 1.15 thousand downloads last month - 70 stars on gitlab.com - 2 maintainers
aspose-words 25.4.0
Aspose.Words for Python is a Document Processing library that allows developers to work with docu...
40 versions - Latest release: 8 days ago - 6 dependent packages - 51 dependent repositories - 138 thousand downloads last month - 118 stars on GitHub - 1 maintainer
Top 2.3% on pypi.org
dink 0.0.5
The DInk Python library provides a pythonic interface to the DInk API.
6 versions - Latest release: over 3 years ago - 1 dependent repository - 146 downloads last month - 0 stars on GitHub - 1 maintainer
llm-document-analysis 0.1.1
A Python library for LLM-powered document analysis and processing
2 versions - Latest release: 2 months ago - 98 downloads last month - 1 maintainer
koncile 0.1.0
Official Python SDK for Koncile API
1 version - Latest release: about 1 month ago - 97 downloads last month
page2struct 1.0.0
Repo to extract structure from document page.
1 version - Latest release: over 1 year ago - 57 downloads last month - 1 maintainer
repozitory 0.2.1
Simple document versioning for web apps, especially Pyramid apps.
7 versions - Latest release: over 13 years ago - 2 dependent repositories - 146 downloads last month - 3 stars on GitHub - 1 maintainer
sortstream 1.2.2
NLP Document Classification Tool.
3 versions - Latest release: over 4 years ago - 1 dependent repository - 71 downloads last month - 0 stars on GitHub - 1 maintainer
imongo-orm 0.0.4
MongoDB ORM interface for Python
4 versions - Latest release: over 2 years ago - 190 downloads last month - 1 maintainer
docsig 0.69.3
Check signature params for proper documentation
104 versions - Latest release: about 1 month ago - 2 dependent packages - 9 dependent repositories - 15.9 thousand downloads last month - 25 stars on GitHub - 1 maintainer
Top 8.7% on pypi.org
pdf-to-markdown-cli 0.2.0
CLI tool to convert PDF files (and other documents) to markdown using the Marker API.
1 version - Latest release: 8 days ago - 5 stars on GitHub - 1 maintainer
dittostore 0.0.8
Simple and easy to use ODM for Google Datastore
8 versions - Latest release: over 6 years ago - 1 dependent repository - 275 downloads last month - 2 stars on GitHub - 1 maintainer
wikit-chatintents 0.0.5
ChatIntents automatically clusters and labels short text intent messages. Spacy model changed fro...
1 version - Latest release: almost 2 years ago - 52 downloads last month - 172 stars on GitHub - 1 maintainer
chatintents 0.0.1
ChatIntents automatically clusters and labels short text intent messages.
1 version - Latest release: over 3 years ago - 1 dependent repository - 115 downloads last month - 172 stars on GitHub - 1 maintainer
driveline 0.14.1
Driveline client
2 versions - Latest release: about 6 years ago - 1 dependent repository - 92 downloads last month - 1 maintainer
top2vec 1.0.36
Top2Vec learns jointly embedded topic, document and word vectors.
37 versions - Latest release: 5 months ago - 2 dependent packages - 28 dependent repositories - 11.7 thousand downloads last month - 3,021 stars on GitHub - 1 maintainer
Top 2.6% on pypi.org
indoxrag 0.1.1
Indox Retrieval Augmentation
5 versions - Latest release: 3 months ago - 189 downloads last month - 18 stars on GitHub - 1 maintainer
indox 0.1.31
Indox Retrieval Augmentation
29 versions - Latest release: 7 months ago - 839 downloads last month - 18 stars on GitHub - 2 maintainers
indoxminer 0.1.5
Indox Data Extraction
19 versions - Latest release: 2 months ago - 632 downloads last month - 18 stars on GitHub - 2 maintainers
indoxgen 0.2.0
Indox Synthetic Data Generation
14 versions - Latest release: 4 months ago - 504 downloads last month - 18 stars on GitHub - 2 maintainers
indoxarcg 0.0.14
Indox Retrieval Augmentation
13 versions - Latest release: 21 days ago - 1.06 thousand downloads last month - 18 stars on GitHub - 2 maintainers
indoxraghelper 0.0.3
Indox Retrieval Augmentation
3 versions - Latest release: 2 months ago - 156 downloads last month - 18 stars on GitHub - 1 maintainer
trytond-edocument-uncefact 7.4.0
Tryton module for electronic document UN/CEFACT
29 versions - Latest release: 6 months ago - 1 dependent package - 1 dependent repository - 828 downloads last month - 2 maintainers
documentdownloader 1.1.0
book118 document downloader
3 versions - Latest release: over 3 years ago - 1 dependent repository - 98 downloads last month - 79 stars on GitHub - 1 maintainer
pyeurovoc 1.3.0
Python API for multilingual legal document classification with EuroVoc descriptors using BERT mod...
18 versions - Latest release: over 2 years ago - 1 dependent repository - 419 downloads last month - 24 stars on GitHub - 1 maintainer
parsr-client 3.2.3
Python client for Parsr - Transforms PDF, Documents and Images into Enriched Structured Data
10 versions - Latest release: over 4 years ago - 1 dependent package - 11 dependent repositories - 406 downloads last month - 5,946 stars on GitHub - 1 maintainer
Top 6.6% on pypi.org
htmlfactory 0.0.1
A simple way to produce HTML with Python.
1 version - Latest release: about 4 years ago - 1 dependent repository - 59 downloads last month - 1 star on GitHub - 1 maintainer
docling-google-ocr 2.13.1
SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for power...
2 versions - Latest release: 3 months ago - 117 downloads last month - 26,056 stars on GitHub - 1 maintainer
llama-index-packs-multi-document-agents 0.4.0
llama-index packs multi_document_agents integration
7 versions - Latest release: 5 months ago - 213 downloads last month - 3,442 stars on GitHub - 1 maintainer
pyjondb 2.0
A lightweight, encrypted JSON-based database with support for collections, document operations, a...
11 versions - Latest release: 11 months ago - 177 downloads last month - 0 stars on GitHub - 1 maintainer
shegox-oidc-test 0.3.4
Python client library for convenient usage of SAP Business Document Processing services
1 version - Latest release: over 1 year ago - 49 downloads last month - 22 stars on GitHub - 1 maintainer
sap-business-document-processing 0.4.1
Python client library for convenient usage of SAP Business Document Processing services
9 versions - Latest release: 12 months ago - 1 dependent repository - 2.16 thousand downloads last month - 22 stars on GitHub - 3 maintainers
pylatexparse 2019.2
A parser and document tree for LaTeX documents
2 versions - Latest release: over 5 years ago - 1 dependent repository - 49 downloads last month - 0 stars on GitHub - 1 maintainer
palimpzest 0.7.1
Palimpzest is a system which enables anyone to process AI-powered analytical queries simply by de...
34 versions - Latest release: 25 days ago - 1.38 thousand downloads last month - 96 stars on GitHub - 1 maintainer
imprint 0.1.0a3
Program for creating MS Word reports from data and content templates
4 versions - Latest release: about 2 years ago - 1 dependent repository - 206 downloads last month - 0 stars on gitlab.com - 1 maintainer
django-documents 0.0.3
Attach documents to django models
3 versions - Latest release: almost 13 years ago - 2 dependent repositories - 113 downloads last month - 17 stars on GitHub - 1 maintainer
Related Keywords
pdf
50
python
38
html
19
docx
18
ai
18
markdown
17
nlp
15
database
14
documents
14
llm
14
word
13
api
12
text
12
structured-data
12
ocr
12
ml
11
convert
11
unstructured-data
11
pptx
10
parser
10
template
10
doc
9
django
8
rag
8
machine learning
8
latex
8
extraction
8
json
7
tables
7
PDF
7
processing
7
segmentation
7
file
7
python3
7
classification
7
index
7
nosql
6
generator
6
mongodb
6
machine-learning
6
document-parsing
6
pandoc
6
xml
6
search
6
excel
6
powerpoint
6
management
6
RAG
6
layout model
5
system
5
crawler
5
content
5
office
5
LLM
5
data
5
NLP
5
AI
5
deep learning
5
language models
5
tryton
5
learning
5
information
5
converter
5
pip
5
table structure
5
docling
5
table former
5
deep-learning
5
pdf-to-text
5
pdf-to-json
5
pdf-document-processor
5
parsing
5
document-parser
5
dms
5
images
5
xlsx
5
compare
4
opencv
4
reader
4
split
4
markup
4
query
4
legal
4
windows
4
parse
4
report
4
epub
4
odt
4
natural language processing
4
image
4
retrieval-augmented generation
4
orm
4
爬虫 (web crawler)
4
OCR
4
sdk
4
wiki
4
知识库 (knowledge base)
4
文档 (document)
4
microsoft
4
research
4