An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "ocr" keyword

View the packages on the pypi.org package registry that are tagged with the "ocr" keyword.

surya-ocr 0.13.1
OCR, layout, reading order, and table recognition in 90+ languages
64 versions - Latest release: 23 days ago - 1 dependent package - 79.2 thousand downloads last month - 17,097 stars on GitHub - 1 maintainer
regula-documentreader-webclient 7.7.313
Regula's Document Reader python client
306 versions - Latest release: 2 days ago - 17.1 thousand downloads last month - 19 stars on GitHub - 1 maintainer
aspose-ocr-python-java 25.2.0
Aspose.OCR for Python via .Java is a powerful yet easy-to-use optical character recognition (O...
5 versions - Latest release: 2 months ago - 1 dependent package - 135 downloads last month - 1 star on GitHub - 1 maintainer
Top 2.0% on pypi.org
paddle2onnx 2.0.1
Export PaddlePaddle to ONNX
54 versions - Latest release: about 19 hours ago - 9 dependent packages - 787 dependent repositories - 36 thousand downloads last month - 645 stars on GitHub - 2 maintainers
kfe 1.2.5
File Explorer and Search Engine for locally stored multimedia
19 versions - Latest release: 2 months ago - 581 downloads last month - 2 stars on GitHub - 1 maintainer
mc-pdf2txt 0.3.0 💰
Multi-column PDF to Text
3 versions - Latest release: almost 2 years ago - 1 dependent repository - 134 downloads last month - 5 stars on GitHub - 1 maintainer
mpxpy 0.0.1
Official Mathpix client for Python
1 version - Latest release: about 11 hours ago - 1 maintainer
Top 1.0% on pypi.org
ddddocr 1.5.6
带带弟弟OCR
24 versions - Latest release: 6 months ago - 27 dependent packages - 153 dependent repositories - 114 thousand downloads last month - 8,469 stars on GitHub - 1 maintainer
tesseractrapidfuzz 0.10
Performs OCR on a list of images using Tesseract and performs fuzzy string matching with a given ...
1 version - Latest release: over 1 year ago - 47 downloads last month - 1 star on GitHub - 1 maintainer
Top 1.6% on pypi.org
pyocr 0.8.5
A Python wrapper for OCR engines (Tesseract, Cuneiform, etc)
30 versions - Latest release: over 1 year ago - 5 dependent packages - 255 dependent repositories - 20.5 thousand downloads last month - 7,831 stars on GitHub - 1 maintainer
Top 1.0% on pypi.org
ocrmypdf 16.10.0 💰
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
243 versions - Latest release: about 2 months ago - 10 dependent packages - 108 dependent repositories - 230 thousand downloads last month - 13,986 stars on GitHub - 1 maintainer
pytesseract-cli 1.2.0
A pytesseract wrapper enabling OCR on images and directories.
1 version - Latest release: almost 4 years ago - 1 dependent repository - 117 downloads last month - 1 star on GitHub - 1 maintainer
elem-hasplib 1.0.0
bisheng-rt-enterprice authorize module
1 version - Latest release: over 1 year ago - 27 downloads last month - 8,069 stars on GitHub - 1 maintainer
ubii-processing-module-ocr 0.2.0
Ubi Interact Processing Module to perform OCR tasks via Tesseract
9 versions - Latest release: over 2 years ago - 1 dependent repository - 240 downloads last month - 0 stars on GitHub - 1 maintainer
Top 1.9% on pypi.org
layoutparser 0.3.4
A unified toolkit for Deep Learning Based Document Image Analysis
11 versions - Latest release: about 3 years ago - 5 dependent packages - 77 dependent repositories - 244 thousand downloads last month - 4,869 stars on GitHub - 1 maintainer
Top 9.0% on pypi.org
pix2text 1.1.3
An Open-Source Python3 tool for recognizing layouts, tables, math formulas, and text in images, c...
34 versions - Latest release: 3 days ago - 1 dependent repository - 7.55 thousand downloads last month - 2,325 stars on GitHub - 1 maintainer
tesseractmultiprocessing 0.10
Multiprocessing OCR with Tesseract
1 version - Latest release: about 2 years ago - 1 dependent package - 84 downloads last month - 0 stars on GitHub - 1 maintainer
Top 0.6% on pypi.org
paddleocr 2.10.0
Awesome OCR toolkits based on PaddlePaddle(8.6M ultra-lightweight pre-trained model, support trai...
48 versions - Latest release: about 1 month ago - 43 dependent packages - 549 dependent repositories - 384 thousand downloads last month - 42,174 stars on GitHub - 1 maintainer
filemac 1.1.7
Open source Python CLI toolkit for conversion, manipulation, and analysis of files (all major file op...
14 versions - Latest release: 21 days ago - 502 downloads last month - 1 star on GitHub - 1 maintainer
yomitoku 0.9.0
Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese ...
15 versions - Latest release: about 17 hours ago - 2.83 thousand downloads last month - 581 stars on GitHub - 1 maintainer
hycli 0.6.0
Hypatos cli tool to batch extract documents through the API and to compare the results.
40 versions - Latest release: over 3 years ago - 1 dependent repository - 400 downloads last month - 1 maintainer
papermerge-core 2.1.5
Open source document management system for digital archives
77 versions - Latest release: about 2 years ago - 4 dependent repositories - 1.55 thousand downloads last month - 347 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
tesserocr 2.8.0
A simple, Pillow-friendly, Python wrapper around tesseract-ocr API using Cython
22 versions - Latest release: 2 months ago - 9 dependent packages - 201 dependent repositories - 114 thousand downloads last month - 1,986 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
pymupdf 1.25.5
A high performance Python library for data extraction, analysis, conversion & manipulation of PDF...
131 versions - Latest release: 18 days ago - 206 dependent packages - 1,798 dependent repositories - 8.36 million downloads last month - 6,889 stars on GitHub - 1 maintainer
batukh 0.1.1
Document recognizer for multiple languages.
5 versions - Latest release: over 4 years ago - 231 downloads last month - 5 stars on GitHub - 3 maintainers
transkribus-to-prima 0.0.1
Convert Transkribus PAGE-XML to standard PAGE-XML
1 version - Latest release: over 3 years ago - 1 dependent repository - 39 downloads last month - 12 stars on GitHub - 1 maintainer
pdfautonup 1.11.0
Convert PDF files to 'n-up' PDF files, guessing the output layout.
23 versions - Latest release: 4 months ago - 1 dependent package - 1 dependent repository - 946 downloads last month - 6,889 stars on GitHub - 1 maintainer
aqpymupdf 1.23.7
A high performance Python library for data extraction, analysis, conversion & manipulation of PDF...
1 version - Latest release: about 1 year ago - 67 downloads last month - 6,889 stars on GitHub - 1 maintainer
pdfmod 0.1.5
A tool for PDF file manipulation.
1 version - Latest release: 5 months ago - 62 downloads last month - 6,368 stars on GitHub - 1 maintainer
paperap 0.0.11 💰
Python library for interacting with the Paperless NGX REST API
11 versions - Latest release: 23 days ago - 870 downloads last month - 26,357 stars on GitHub - 1 maintainer
magic-pdf 1.3.5
A practical tool for converting PDF to Markdown
41 versions - Latest release: 2 days ago - 44.5 thousand downloads last month - 30,848 stars on GitHub - 1 maintainer
latex-toolkit 0.1
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
1 version - Latest release: about 1 month ago - 48 downloads last month - 30,848 stars on GitHub - 1 maintainer
lazyllm-magic-pdf 0.9.0
A practical tool for converting PDF to Markdown
1 version - Latest release: about 1 month ago - 48 downloads last month - 30,848 stars on GitHub - 1 maintainer
xh-pdf-parser 1.3.1.2
A practical tool for converting PDF to Markdown
5 versions - Latest release: 8 days ago - 407 downloads last month - 30,848 stars on GitHub - 1 maintainer
mseep-pdf2md 0.1.0
PDF to Markdown MCP server
1 version - Latest release: 4 days ago - 30,172 stars on GitHub - 1 maintainer
unstructured-cpu 0.15.1
A library that prepares raw documents for downstream ML tasks.
13 versions - Latest release: 8 months ago - 368 downloads last month - 10,877 stars on GitHub - 1 maintainer
easy-paddle-ocr 0.0.5
This is a clean and easy-to-use implementation of Paddle OCR. Made with ❤️ by Theos AI.
5 versions - Latest release: about 2 years ago - 1 dependent repository - 208 downloads last month - 48,243 stars on GitHub - 1 maintainer
akaocr 2.1.4
akaOCR Package Tools
14 versions - Latest release: 7 months ago - 294 downloads last month - 48,243 stars on GitHub - 1 maintainer
je-paddleocr 2.9.1
Awesome OCR toolkits based on PaddlePaddle(8.6M ultra-lightweight pre-trained model, support trai...
1 version - Latest release: about 1 month ago - 199 downloads last month - 48,243 stars on GitHub - 1 maintainer
ppocrlabel-japan 0.0.2
PPOCRLabelv2 is a semi-automatic graphic annotation tool suitable for OCR field, with built-in PP...
2 versions - Latest release: almost 2 years ago - 106 downloads last month - 48,243 stars on GitHub - 1 maintainer
paddleocr-onnx 0.5.1
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, ...
9 versions - Latest release: 11 months ago - 385 downloads last month - 48,243 stars on GitHub - 1 maintainer
paddleocrwordleveldetection 2.6.1.0
Awesome OCR toolkits based on PaddlePaddle (8.6M ultra-lightweight pre-trained model, support tra...
1 version - Latest release: almost 2 years ago - 46 downloads last month - 44,705 stars on GitHub - 1 maintainer
omnidocs 0.1.1
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, ...
2 versions - Latest release: 5 months ago - 119 downloads last month - 48,243 stars on GitHub - 1 maintainer
anlab-paddleocr 0.0.1
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, ...
1 version - Latest release: 11 months ago - 39 downloads last month - 48,243 stars on GitHub - 1 maintainer
fadoudou2 2.8.1
Awesome OCR toolkits based on PaddlePaddle(8.6M ultra-lightweight pre-trained model, support trai...
16 versions - Latest release: about 1 month ago - 667 downloads last month - 42,174 stars on GitHub - 1 maintainer
numericvision 0.1.0
Detects numeric displays in images using OpenCV
1 version - Latest release: over 5 years ago - 1 dependent repository - 87 downloads last month - 1 star on GitHub - 1 maintainer
rapid-videocr 3.0.0 💰
Tool for extracting hard subtitles from videos.
48 versions - Latest release: 1 day ago - 897 downloads last month - 386 stars on GitHub - 1 maintainer
epam.imago 2.0.0rc1
Imago, chemical structures optical recognition tool
1 version - Latest release: over 2 years ago - 21 downloads last month - 8 stars on GitHub - 1 maintainer
Top 0.7% on pypi.org
easyocr 1.7.2
End-to-End Multi-Lingual Optical Character Recognition (OCR) Solution
33 versions - Latest release: 7 months ago - 90 dependent packages - 671 dependent repositories - 821 thousand downloads last month - 26,321 stars on GitHub - 1 maintainer
Top 0.8% on pypi.org
baidu-aip 4.16.13
Baidu AIP SDK
68 versions - Latest release: over 1 year ago - 9 dependent packages - 1,035 dependent repositories - 32.3 thousand downloads last month - 3 maintainers
r2r 3.5.13
SciPhi R2R
212 versions - Latest release: 1 day ago - 20.2 thousand downloads last month - 1,103 stars on GitHub - 1 maintainer
wrilytextaligner 0.10
Aligns text lines and characters within an image
1 version - Latest release: over 1 year ago - 41 downloads last month - 0 stars on GitHub - 1 maintainer
normcap 0.5.9
OCR-powered screen-capture tool to capture information instead of images.
57 versions - Latest release: 5 months ago - 1 dependent repository - 2.54 thousand downloads last month - 2,108 stars on GitHub - 1 maintainer
myoons-ocr 0.1
ocr functions for myoon
1 version - Latest release: about 4 years ago - 1 dependent repository - 27 downloads last month - 1 maintainer
dinglehopper 0.10.0
The OCR evaluation tool
9 versions - Latest release: 2 days ago - 296 downloads last month - 65 stars on GitHub - 1 maintainer
Top 9.7% on pypi.org
nocv2easyocr 0.1.1
This is a fork of the EasyOCR library without the opencv requirement
2 versions - Latest release: about 2 years ago - 105 downloads last month - 20,452 stars on GitHub - 1 maintainer
ybc-idcard-ocr 1.1.2
Recognize ID cards by OCR.
11 versions - Latest release: almost 6 years ago - 1 dependent repository - 253 downloads last month - 1 maintainer
pyimageocr 1.0.1
Python machine learning OCR.
2 versions - Latest release: almost 7 years ago - 1 dependent repository - 51 downloads last month - 0 stars on GitHub - 1 maintainer
ocrtextxy 0.0.2
Get constellation
1 version - Latest release: over 7 years ago - 1 dependent repository - 28 downloads last month - 1 maintainer
eynollah 0.4.0
Document Layout Analysis
15 versions - Latest release: 2 days ago - 1 dependent repository - 531 downloads last month - 364 stars on GitHub - 1 maintainer
img2otxt 0.2
A package to convert images to text using OCR.
2 versions - Latest release: 11 months ago - 101 downloads last month - 0 stars on GitHub - 1 maintainer
mindee 4.21.0
Mindee API helper library for Python
76 versions - Latest release: 2 days ago - 2 dependent repositories - 18.4 thousand downloads last month - 40 stars on GitHub - 2 maintainers
tabularocr 0.1.0
TabularOCR is a Python library that provides an easy-to-use Optical Character Recognition (OCR) s...
1 version - Latest release: about 1 year ago - 207 downloads last month - 6 stars on GitHub - 1 maintainer
pixparse 0.1.0.dev0
1 version - Latest release: almost 2 years ago - 52 downloads last month - 1 maintainer
movatalk 0.1.40
libs for cameramonit, ocr, fin-officer, cfo, and other projects
1 version - Latest release: 3 days ago - 0 stars on GitHub - 1 maintainer
pypxml 3.1.0
A python library for parsing, converting and modifying PageXML files.
5 versions - Latest release: 3 days ago - 135 downloads last month - 1 star on GitHub - 1 maintainer
docext 0.1.12
Onprem information extraction from documents
10 versions - Latest release: 9 days ago - 753 downloads last month - 100 stars on GitHub - 1 maintainer
imgtotxt 0.1.2
OCR app running locally with native UI
3 versions - Latest release: about 2 years ago - 142 downloads last month - 0 stars on GitHub - 1 maintainer
p8hub 0.0.4
Private AI Hub - Host your own Generative AI Services
4 versions - Latest release: over 1 year ago - 160 downloads last month - 27 stars on GitHub - 1 maintainer
easypaddleocr 0.2.1
A simple, optional tool for PaddleOCR Detection, direction classification and recognition on CPU ...
7 versions - Latest release: 12 months ago - 412 downloads last month - 11 stars on GitHub - 1 maintainer
gptparse 0.3.0
A tool for converting PDF documents to Markdown using OCR and vision language models
5 versions - Latest release: 5 months ago - 275 downloads last month - 12 stars on GitHub - 1 maintainer
ocrodjvu 0.13.2
OCR for DjVu (Python 3 fork)
4 versions - Latest release: 5 months ago - 3 dependent repositories - 178 downloads last month - 5 stars on GitHub - 1 maintainer
cua-som 0.1.2
Computer Vision and OCR library for detecting and analyzing UI elements
3 versions - Latest release: 3 days ago - 600 downloads last month - 3,979 stars on GitHub - 1 maintainer
pyfunc2 0.1.24
libs for cameramonit, ocr, fin-officer, cfo, and other projects
6 versions - Latest release: 3 days ago - 228 downloads last month - 0 stars on GitHub - 1 maintainer
craft-text-detector 0.4.3 💰
Fast and accurate text detection library built on CRAFT implementation
13 versions - Latest release: almost 3 years ago - 1 dependent package - 8 dependent repositories - 5.35 thousand downloads last month - 12 stars on GitHub - 1 maintainer
social-arsenal 0.3.0
Sort screenshots based on rules or through individual review.
5 versions - Latest release: about 2 years ago - 204 downloads last month - 30 stars on GitHub - 1 maintainer
pdfplucker 0.3.5
Docling wrapper for PDF parsing
9 versions - Latest release: 4 days ago - 334 downloads last month - 0 stars on GitHub - 1 maintainer
parsee-pdf-reader 0.1.3
Tesseract Open Source OCR Engine (main repository)
17 versions - Latest release: about 1 year ago - 1 dependent package - 587 downloads last month - 60,749 stars on GitHub - 1 maintainer
usseg 0.7.1
Tools to segment doppler ultrasound signals from scan images.
9 versions - Latest release: over 1 year ago - 300 downloads last month - 60,749 stars on GitHub - 2 maintainers
pdf-language-detector 0.0.11
A python script to iterate over a list of PDF in a directory and try to guess their language with...
12 versions - Latest release: almost 2 years ago - 517 downloads last month - 60,749 stars on GitHub - 2 maintainers
hocr-tools-lib 1.1.0
Advanced tools for hOCR integration (library version)
5 versions - Latest release: 9 months ago - 239 downloads last month - 2 stars on GitHub - 1 maintainer
find-keyword-xtvu 5.7.3
A package to find keywords in .pdf, .docx, .odt, and .rtf files, with support for multiple langua...
53 versions - Latest release: 7 months ago - 758 downloads last month - 65,960 stars on GitHub - 2 maintainers
form-tools 0.2.0
Tesseract Open Source OCR Engine (main repository)
6 versions - Latest release: 9 months ago - 1 dependent repository - 597 downloads last month - 65,960 stars on GitHub - 1 maintainer
filecabinet 2.1.0
A local, offline document archive
3 versions - Latest release: almost 2 years ago - 116 downloads last month - 65,960 stars on GitHub - 1 maintainer
textmater 0.1
Extract Structured Data from text
1 version - Latest release: 11 months ago - 56 downloads last month - 1 maintainer
pdftoprompt 0.1.2
Python library to abbreviate a PDF file to GPT 8k prompt length
3 versions - Latest release: about 2 years ago - 132 downloads last month - 60,749 stars on GitHub - 1 maintainer
huixiangdou 0.1.0
Overcoming Group Chat Scenarios with LLM-based Technical Assistance
3 versions - Latest release: over 1 year ago - 85 downloads last month - 826 stars on GitHub - 1 maintainer
ocr-gls-g6 0.1.1
OCR_GLS_G6 - Optical character recognition and QR codes. Support Multiple PDF page file.
14 versions - Latest release: 8 months ago - 355 downloads last month - 1 maintainer
power5p-ocr-gpt 0.0.1
PYPI tutorial package by power5p
1 version - Latest release: over 1 year ago - 63 downloads last month - 1 maintainer
Top 8.4% on pypi.org
cleanit 0.4.8 💰
Subtitles extremely clean
14 versions - Latest release: 10 months ago - 1 dependent package - 5 dependent repositories - 33 thousand downloads last month - 21 stars on GitHub - 1 maintainer
Top 3.5% on pypi.org
pix2tex 0.1.4
pix2tex: Using a ViT to convert images of equations into LaTeX code.
33 versions - Latest release: 3 months ago - 2 dependent packages - 5 dependent repositories - 5.98 thousand downloads last month - 13,990 stars on GitHub - 1 maintainer
Top 9.4% on pypi.org
axcelocr 1.6.3
End-to-End Multi-Lingual Optical Character Recognition (OCR) Solution
1 version - Latest release: about 2 years ago - 29 downloads last month - 26,321 stars on GitHub - 1 maintainer
easyocr-itgn 1.2.3
Modified EasyOCR by IntoThatGoodNight
3 versions - Latest release: over 1 year ago - 176 downloads last month - 20,429 stars on GitHub - 1 maintainer
asoen-ocr 1.0.0
End-to-End Multi-Lingual Optical Character Recognition (OCR) Solution
1 version - Latest release: about 2 years ago - 29 downloads last month - 22,979 stars on GitHub - 1 maintainer
Top 7.4% on pypi.org
asone-ocr 1.6.2
End-to-End Multi-Lingual Optical Character Recognition (OCR) Solution
2 versions - Latest release: about 2 years ago - 2 dependent packages - 1 dependent repository - 130 downloads last month - 20,452 stars on GitHub - 1 maintainer
myeasyocr 1.2.3
End-to-End Multi-Lingual Optical Character Recognition (OCR) Solution
1 version - Latest release: about 4 years ago - 1 dependent repository - 76 downloads last month - 20,452 stars on GitHub - 1 maintainer
tahweel 0.1.0
Convert PDF files to Word and TXT
14 versions - Latest release: 6 months ago - 1.6 thousand downloads last month - 7 stars on GitHub - 1 maintainer
ealocr 1.4.7
EasyAimLock forked edition / End-to-End Multi-Lingual Optical Character Recognition (OCR) Solution
6 versions - Latest release: about 3 years ago - 1 dependent repository - 262 downloads last month - 0 stars on GitHub - 1 maintainer
receipt-parser-core 0.2.5 💰
A supermarket receipt parser written in Python using tesseract OCR
13 versions - Latest release: almost 4 years ago - 1 dependent repository - 480 downloads last month - 790 stars on GitHub - 1 maintainer
vnhtr 0.1.8
Encoder-Decoder base for Vietnamese handwriting recognition
10 versions - Latest release: over 1 year ago - 373 downloads last month - 10 stars on GitHub - 1 maintainer