pypi.org "table-extraction" keyword
View the packages on the pypi.org package registry that are tagged with the "table-extraction" keyword.
pymupdf 1.26.4
A high performance Python library for data extraction, analysis, conversion & manipulation of PDF...
135 versions - Latest release: 13 days ago - 206 dependent packages - 1,798 dependent repositories - 15.1 million downloads last month - 7,909 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
pdfplumber 0.11.7
Plumb a PDF for detailed information about each char, rectangle, and line.
73 versions - Latest release: 3 months ago - 118 dependent packages - 1,210 dependent repositories - 5.9 million downloads last month - 8,213 stars on GitHub - 1 maintainer
Top 0.8% on pypi.org
quipucamayoc 0.1.2
Tools to extract information from digitized historical documents
2 versions - Latest release: over 3 years ago - 1 dependent repository - 8 downloads last month - 30 stars on GitHub - 1 maintainer
kreuzberg 3.13.0
Document intelligence framework for Python - Extract text, metadata, and structured data from div...
46 versions - Latest release: 3 days ago - 179 thousand downloads last month - 2,329 stars on GitHub - 1 maintainer
tablecv 0.1.1
Table extraction from image.
2 versions - Latest release: almost 2 years ago - 175 downloads last month - 10 stars on GitHub - 1 maintainer
docstrange 1.1.5
Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, J...
16 versions - Latest release: 5 days ago - 1.93 thousand downloads last month - 493 stars on GitHub - 1 maintainer
pymupdfb 1.24.10
MuPDF shared libraries for PyMuPDF.
18 versions - Latest release: about 1 year ago - 4 dependent packages - 133 dependent repositories - 1.62 million downloads last month - 7,854 stars on GitHub - 1 maintainer
Top 1.8% on pypi.org
aqpymupdf 1.23.7
A high performance Python library for data extraction, analysis, conversion & manipulation of PDF...
1 version - Latest release: over 1 year ago - 44 downloads last month - 7,854 stars on GitHub - 1 maintainer
pdfplumber-aemc 0.11.3
Plumb a PDF for detailed information about each char, rectangle, and line.
16 versions - Latest release: over 1 year ago - 1 dependent repository - 133 downloads last month - 8,179 stars on GitHub - 1 maintainer
llm-data-converter 2.2.0
Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPo...
23 versions - Latest release: about 1 month ago - 297 downloads last month - 3 stars on GitHub - 1 maintainer
document-data-extractor 1.0.4
Best open-source document to markdown extractor for LLM training data. Convert PDF, Word, PowerPo...
5 versions - Latest release: about 1 month ago - 78 downloads last month - 3 stars on GitHub - 1 maintainer
img2table 1.4.2
img2table is a table identification and extraction Python Library for PDF and images, based on Op...
57 versions - Latest release: 28 days ago - 1 dependent package - 4 dependent repositories - 35.1 thousand downloads last month - 788 stars on GitHub - 1 maintainer
Top 7.1% on pypi.org
table-transformer 1.0.6
Table Transformer
5 versions - Latest release: 12 months ago - 499 downloads last month - 2,682 stars on GitHub - 1 maintainer
extractable 1.0.2
Extract tables from PDFs
124 versions - Latest release: over 1 year ago - 1 dependent repository - 1.3 thousand downloads last month - 30 stars on GitHub - 1 maintainer
pyany2json 0.1.3
Python binding to Any2Json
4 versions - Latest release: over 1 year ago - 12 downloads last month - 0 stars on GitHub - 1 maintainer
pdfmod 0.1.5
A tool for PDF file manipulation.
1 version - Latest release: 10 months ago - 20 downloads last month - 7,114 stars on GitHub - 1 maintainer
docext 0.1.14
Onprem information extraction from documents
12 versions - Latest release: 2 months ago - 914 downloads last month - 1,660 stars on GitHub - 1 maintainer
mseep-kreuzberg 3.8.2
Document intelligence framework for Python - Extract text, metadata, and structured data from div...
1 version - Latest release: about 2 months ago
depdf 0.2.2
PDF table & paragraph extractor
4 versions - Latest release: about 5 years ago - 1 dependent repository - 63 downloads last month - 11 stars on GitHub - 1 maintainer
extracttable 2.4.0
Extract table data from images and scanned PDFs. Easily convert image to Excel, convert PDF to table.
16 versions - Latest release: about 3 years ago - 1 dependent repository - 1.24 thousand downloads last month - 279 stars on GitHub - 1 maintainer
pdftablr 0.1.0
Python3 implementation of Kyle Cronan's pdftable module, with unit tests
1 version - Latest release: almost 8 years ago - 1 dependent repository - 47 downloads last month - 2 stars on GitHub - 1 maintainer
markdrop 3.5.0
A comprehensive PDF processing toolkit that converts PDFs to markdown with advanced AI-powered fe...
20 versions - Latest release: 2 months ago - 365 downloads last month - 116 stars on GitHub - 2 maintainers
krank 0.0.1
Fetch psychology datasets from remote sources.
2 versions - Latest release: almost 3 years ago - 1 dependent repository - 13 downloads last month - 0 stars on GitHub - 1 maintainer
Related Keywords
pdf (13)
ocr (12)
python (9)
rag (6)
document-processing (6)
tesseract (6)
text-extraction (5)
llm (4)
markdown (4)
image-processing (4)
pdf-to-markdown (4)
xps (4)
data-science (4)
epub (4)
extract-data (4)
font (4)
mupdf (4)
pdf-documents (4)
pymupdf (4)
text-shaping (4)
text-processing (4)
document-conversion (3)
batch-document-processing (3)
word-to-markdown (3)
powerpoint-to-markdown (3)
intelligent-document-processing (3)
document-understanding (3)
ai-training-data (3)
unstructured-alternative (3)
docling-alternative (3)
marker-alternative (3)
markitdown-alternative (3)
mineru-alternative (3)
paddleocr-alternative (3)
excel-to-markdown (3)
tesseract-alternative (3)
document-to-markdown (3)
html-to-markdown (3)
local-document-processing (3)
structured-data-extraction (3)
layout-detection (3)
llm-ready-data (3)
document-ai (3)
document-analysis (3)
pdf-parsing (3)
structured-data (3)
async (2)
document-intelligence (2)
extensible (2)
information-extraction (2)
mcp (2)
metadata-extraction (2)
model-context-protocol (2)
pandoc (2)
pdf-extraction (2)
pdfium (2)
plugin-architecture (2)
retrieval-augmented-generation (2)
opencv (2)
tables (2)
offline-document-extractor (2)
ppt-to-markdown (2)
ai (2)
offline-document-converter (2)
image-table-recognition (1)
extracttable (1)
pdftk (1)
pdf-table-extract (1)
tabular-data (1)
python3 (1)
csv (1)
liwc (1)
paragraph-extraction (1)
pdf-to-html (1)
mseep (1)
vlms (1)
unstructured-data (1)
onpremise (1)
onprem-vision (1)
onprem-ocr (1)
onprem (1)
ocr-onpremise (1)
ocr-benchmark (1)
nlp (1)
meta-analysis (1)
LIWC (1)
text (1)
datasets (1)
data (1)
table-to-text (1)
pypi-package (1)
pdf-to-text (1)
open-source (1)
markitdown (1)
marker (1)
markdrop (1)
image-to-text (1)
docling (1)
agents (1)
openai (1)