An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "tesseract" keyword

View the packages on the pypi.org package registry that are tagged with the "tesseract" keyword.

imagetocsv 1.0.0
Converts An Image to a CSV. This exists because Chorus 3.0 are bat-shit and only show images for ...
4 versions - Latest release: almost 2 years ago - 122 downloads last month - 5 stars on GitHub - 1 maintainer
cyandroemu 1.0.1
Very fast Android automation framework with 7 backends (uiautomator 1+2, fragment parser, window ...
2 versions - Latest release: about 1 month ago - 120 downloads last month - 77 stars on GitHub - 1 maintainer
cropyble 1.2.0
Cropyble is a module that allows a user to easily perform crops on an image containing recognizab...
4 versions - Latest release: over 5 years ago - 1 dependent repository - 113 downloads last month - 0 stars on GitHub - 1 maintainer
readmrz 0.0.2
Machine readable zone reader on ID cards
2 versions - Latest release: over 2 years ago - 1 dependent repository - 542 downloads last month - 16 stars on GitHub - 1 maintainer
mc-pdf2txt 0.3.0 💰
Multi-column PDF to Text
3 versions - Latest release: almost 2 years ago - 1 dependent repository - 134 downloads last month - 5 stars on GitHub - 1 maintainer
tesseractrapidfuzz 0.10
Performs OCR on a list of images using Tesseract and performs fuzzy string matching with a given ...
1 version - Latest release: over 1 year ago - 47 downloads last month - 1 star on GitHub - 1 maintainer
Top 1.0% on pypi.org
ocrmypdf 16.10.0 💰
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
243 versions - Latest release: about 2 months ago - 10 dependent packages - 108 dependent repositories - 230 thousand downloads last month - 13,986 stars on GitHub - 1 maintainer
pytesseract-cli 1.2.0
A pytesseract wrapper enabling OCR on images and directories.
1 version - Latest release: almost 4 years ago - 1 dependent repository - 117 downloads last month - 1 star on GitHub - 1 maintainer
ubii-processing-module-ocr 0.2.0
"Ubi Interact Processing Module to perform OCR tasks via Tesseract"
9 versions - Latest release: over 2 years ago - 1 dependent repository - 240 downloads last month - 0 stars on GitHub - 1 maintainer
tesseractmultiprocessing 0.10
Multiprocessing OCR with Tesseract
1 version - Latest release: about 2 years ago - 1 dependent package - 84 downloads last month - 0 stars on GitHub - 1 maintainer
Top 1.6% on pypi.org
tesserocr 2.8.0
A simple, Pillow-friendly, Python wrapper around tesseract-ocr API using Cython
22 versions - Latest release: 2 months ago - 9 dependent packages - 201 dependent repositories - 114 thousand downloads last month - 1,986 stars on GitHub - 1 maintainer
Top 1.4% on pypi.org
pymupdf 1.25.5
A high performance Python library for data extraction, analysis, conversion & manipulation of PDF...
131 versions - Latest release: 18 days ago - 206 dependent packages - 1,798 dependent repositories - 8.36 million downloads last month - 6,889 stars on GitHub - 1 maintainer
pdfautonup 1.11.0
Convert PDF files to 'n-up' PDF files, guessing the output layout.
23 versions - Latest release: 4 months ago - 1 dependent package - 1 dependent repository - 946 downloads last month - 6,889 stars on GitHub - 1 maintainer
aqpymupdf 1.23.7
A high performance Python library for data extraction, analysis, conversion & manipulation of PDF...
1 version - Latest release: about 1 year ago - 67 downloads last month - 6,889 stars on GitHub - 1 maintainer
pdfmod 0.1.5
A tool for PDF file manipulation.
1 version - Latest release: 5 months ago - 62 downloads last month - 6,368 stars on GitHub - 1 maintainer
parsee-pdf-reader 0.1.3
Tesseract Open Source OCR Engine (main repository)
17 versions - Latest release: about 1 year ago - 1 dependent package - 587 downloads last month - 60,749 stars on GitHub - 1 maintainer
usseg 0.7.1
Tools to segment doppler ultrasound signals from scan images.
9 versions - Latest release: over 1 year ago - 300 downloads last month - 60,749 stars on GitHub - 2 maintainers
pdf-language-detector 0.0.11
A Python script to iterate over a list of PDFs in a directory and try to guess their language with...
12 versions - Latest release: almost 2 years ago - 517 downloads last month - 60,749 stars on GitHub - 2 maintainers
find-keyword-xtvu 5.7.3
A package to find keywords in .pdf, .docx, .odt, and .rtf files, with support for multiple langua...
53 versions - Latest release: 7 months ago - 758 downloads last month - 65,960 stars on GitHub - 2 maintainers
form-tools 0.2.0
Tesseract Open Source OCR Engine (main repository)
6 versions - Latest release: 9 months ago - 1 dependent repository - 597 downloads last month - 65,960 stars on GitHub - 1 maintainer
filecabinet 2.1.0
A local, offline document archive
3 versions - Latest release: almost 2 years ago - 116 downloads last month - 65,960 stars on GitHub - 1 maintainer
pdftoprompt 0.1.2
Python library to abbreviate a PDF file to GPT 8k prompt length
3 versions - Latest release: about 2 years ago - 132 downloads last month - 60,749 stars on GitHub - 1 maintainer
tesserhocr2df 0.10
tesseract hocr to pandas DataFrame
1 version - Latest release: about 1 year ago - 45 downloads last month - 1 star on GitHub - 1 maintainer
kreuzberg 3.1.3
A text extraction library supporting PDFs, images, office documents and more
19 versions - Latest release: 9 days ago - 6.53 thousand downloads last month - 1,736 stars on GitHub - 1 maintainer
wagtail-textract 1.2
Allow searching for text in Documents in the Wagtail content management system
8 versions - Latest release: over 5 years ago - 1 dependent repository - 253 downloads last month - 31 stars on GitHub - 2 maintainers
bangla-pdf-ocr 0.1.1
A package to extract Bengali text from PDFs using OCR
2 versions - Latest release: 6 months ago - 137 downloads last month - 8 stars on GitHub - 1 maintainer
pmworker 1.2.0 💰
Papermerge worker - extract OCR text documents
4 versions - Latest release: about 5 years ago - 1 dependent repository - 113 downloads last month - 3 stars on GitHub - 1 maintainer
pdf2dataset 0.5.3
Easily convert a subdirectory with a big volume of PDF documents into a dataset, supports extractin...
15 versions - Latest release: over 4 years ago - 1 dependent repository - 645 downloads last month - 19 stars on GitHub - 1 maintainer
motionpdf 0.0.1
A script built on Tesseract-OCR for converting .pdf to .txt
1 version - Latest release: about 3 years ago - 1 dependent repository - 59 downloads last month - 0 stars on GitHub - 1 maintainer
textshot 0.1.2
Python tool for grabbing text via screenshot
3 versions - Latest release: 4 months ago - 259 downloads last month - 1,757 stars on GitHub - 1 maintainer
tesserwrap 0.1.6
Basic python bindings to the Tesseract C++ API
11 versions - Latest release: over 10 years ago - 5 dependent repositories - 143 downloads last month - 66 stars on GitHub - 1 maintainer
hocr-utils 0.0.3
Package containing utility function for hOCR and tesseract
2 versions - Latest release: about 2 years ago - 1 dependent repository - 99 downloads last month - 2 stars on GitHub - 1 maintainer
tesstrain 0.2.0
Training utils for Tesseract
6 versions - Latest release: about 1 month ago - 478 downloads last month - 1 star on GitHub - 1 maintainer
a-pandas-ex-tesseract-multirow-regex-fuzz 0.11
Regex/Fuzz search across multiple rows/Tesseract to pandas.DataFrame
2 versions - Latest release: over 2 years ago - 2 dependent packages - 178 downloads last month - 0 stars on GitHub - 1 maintainer
autoocr 0.0.3
A Python wrapper for cross platform tesseract OCR engine with multiple languages (e.g. Bangla)
1 version - Latest release: almost 6 years ago - 1 dependent repository - 63 downloads last month - 17 stars on GitHub - 1 maintainer
gpyocr 1.6
Python wrapper for Tesseract OCR and Google Vision OCR
11 versions - Latest release: almost 3 years ago - 1 dependent repository - 397 downloads last month - 12 stars on GitHub - 1 maintainer
ocyara 1.0.1
A Yara rule engine that scans images for matches using Optical Character Recognition (OCR). See t...
5 versions - Latest release: about 8 years ago - 1 dependent repository - 95 downloads last month - 41 stars on GitHub - 2 maintainers
spark-pdf-python 0.1.1
PDF DataSource for Apache Spark in Python
3 versions - Latest release: 2 months ago - 93 downloads last month - 45 stars on GitHub - 1 maintainer
pyspark-pdf 0.1.0rc9
Spark-Pdf is a library for processing documents using Apache Spark
8 versions - Latest release: 5 months ago - 254 downloads last month - 45 stars on GitHub - 1 maintainer
pytessy 0.1.0
Tesseract-OCR, faster
1 version - Latest release: about 5 years ago - 1 dependent repository - 95 downloads last month - 13 stars on GitHub - 1 maintainer
pyslibtesseract 0.0.15
Integration of Tesseract for Python using a shared library
12 versions - Latest release: about 9 years ago - 2 dependent repositories - 225 downloads last month - 12 stars on GitHub - 1 maintainer
ocrd-fork-tesserocr 3.0.0rc2
A simple, Pillow-friendly, Python wrapper around tesseract-ocr API using Cython
2 versions - Latest release: almost 6 years ago - 1 dependent package - 1 dependent repository - 41 downloads last month - 1,986 stars on GitHub - 1 maintainer
Top 4.6% on pypi.org
rpa 1.50.0
RPA for Python is a Python package for RPA (robotic process automation)
48 versions - Latest release: almost 2 years ago - 1 dependent package - 10 dependent repositories - 2.85 thousand downloads last month - 4,857 stars on GitHub - 2 maintainers
Top 4.3% on pypi.org
tagui 1.50.0
RPA for Python is a Python package for RPA (robotic process automation)
88 versions - Latest release: almost 2 years ago - 22 dependent repositories - 3.83 thousand downloads last month - 4,279 stars on GitHub - 2 maintainers
tesserpy 1.1.2
Python interface to the Tesseract library
3 versions - Latest release: over 10 years ago - 2 dependent repositories - 46 downloads last month - 20 stars on GitHub - 2 maintainers
adf2pdf 0.8.3
Automate the workflow around ADF scanning, OCR and PDF creation
4 versions - Latest release: over 1 year ago - 1 dependent repository - 139 downloads last month - 6 stars on GitHub - 1 maintainer
nkocr 2.5.1 💰
This is a module for running specialized OCR on food products and nutritional tables.
16 versions - Latest release: 6 months ago - 1 dependent repository - 497 downloads last month - 34 stars on GitHub - 2 maintainers
axa-fr-ocr 0.0.8
AXA France OCR library
6 versions - Latest release: over 1 year ago - 249 downloads last month - 2 stars on GitHub - 1 maintainer
verifytweet 0.6.0
A tool to verify Tweet screenshots
2 versions - Latest release: about 5 years ago - 1 dependent repository - 109 downloads last month - 20 stars on GitHub - 1 maintainer
djtesseract 0.0.6
A small app providing a tesseract field for django 3.1.2
4 versions - Latest release: over 4 years ago - 1 dependent repository - 165 downloads last month - 0 stars on GitHub - 1 maintainer
fastmrz 2.1.1
Extracts the Machine Readable Zone (MRZ) data from document images
9 versions - Latest release: about 2 months ago - 808 downloads last month - 19 stars on GitHub - 1 maintainer
python-ocr 0.1.5
Input Adaptor to verify file extension
6 versions - Latest release: over 2 years ago - 320 downloads last month - 1 maintainer
screen-ocr 0.5.0
Library for processing screen contents using OCR
7 versions - Latest release: about 2 years ago - 1 dependent package - 4 dependent repositories - 300 downloads last month - 43 stars on GitHub - 1 maintainer
ocr-with-format 0.13
Wrapper to pytesseract to preserve space and formatting
10 versions - Latest release: 19 days ago - 266 downloads last month - 0 stars on GitHub - 1 maintainer
betterocr 1.2.0 💰
Better text detection by combining OCR engines with LLM.
6 versions - Latest release: over 1 year ago - 283 downloads last month - 527 stars on GitHub - 1 maintainer
django-tesseractfield 0.0.2
A small app providing a tesseract field for django
2 versions - Latest release: about 6 years ago - 1 dependent repository - 67 downloads last month - 1 star on GitHub - 1 maintainer
pysseract 1.3.1
Python binding to Tesseract API
16 versions - Latest release: over 5 years ago - 1 dependent repository - 1.66 thousand downloads last month - 1 star on GitHub - 1 maintainer
saram 1.0.2
A library to fetch images from a directory and get OCR and store in txt with orientation rotation...
9 versions - Latest release: about 7 years ago - 1 dependent repository - 231 downloads last month - 51 stars on GitHub - 1 maintainer
tesseract-python 3.5.1
Self-contained Python module to Tesseract.
1 version - Latest release: almost 7 years ago - 1 dependent repository - 49 downloads last month - 2 stars on GitHub - 1 maintainer
samagra-docparser 0.1.2
Document Parser built to extract information from pdfs.
3 versions - Latest release: over 1 year ago - 47 downloads last month - 1 maintainer
nlpknowledge 0.0.2
Package to make sense of images with text information
9 versions - Latest release: over 5 years ago - 1 dependent repository - 161 downloads last month - 6,176 stars on GitHub - 1 maintainer
rezolve-ai-ingestion 0.1.4
A private package for ingesting and processing SharePoint data with AI capabilities
4 versions - Latest release: 7 months ago - 104 downloads last month - 6,176 stars on GitHub - 1 maintainer
tessy 0.5.2
A Python wrapper for Tesseract-OCR.
2 versions - Latest release: about 1 year ago - 1 dependent repository - 79 downloads last month - 1 star on GitHub - 1 maintainer
mementor 1.0.4
A library to fetch images from directory to fix orientation and pull OCR from the images along wi...
5 versions - Latest release: almost 7 years ago - 1 dependent repository - 121 downloads last month - 80 stars on GitHub - 1 maintainer
northern-lights-forecast 4.1.4
A simple web-scraping northern lights forecast that automatically sends a Telegram notification du...
14 versions - Latest release: over 2 years ago - 1 dependent repository - 325 downloads last month - 1 star on GitHub - 1 maintainer
tesseract-window-scanner 0.12
OCR on screenshots with tesseract - Windows only
3 versions - Latest release: over 2 years ago - 133 downloads last month - 1 star on GitHub - 1 maintainer
tesseracttrainer 0.1.1
A small framework taking over the manual tesseract training process described in the Tesseract Wiki
2 versions - Latest release: over 12 years ago - 1 dependent repository - 77 downloads last month - 130 stars on GitHub - 1 maintainer
winrtocr 0.10
Multiprocessing library for OCR with WinRT
1 version - Latest release: about 2 years ago - 59 downloads last month - 0 stars on GitHub - 1 maintainer
tesserparsing 0.10
Image Processing and Text Extraction with Tesseract - multiprocessing
1 version - Latest release: over 1 year ago - 48 downloads last month - 1 star on GitHub - 1 maintainer
aiopytesseract 0.14.0 💰
asyncio tesseract wrapper for Tesseract-OCR
15 versions - Latest release: about 1 year ago - 1 dependent repository - 2.01 thousand downloads last month - 17 stars on GitHub - 1 maintainer
multitessiocr 0.13
Performs a very fast OCR on a list of images (file path, url, base64, bytes, numpy, PIL ...) usin...
4 versions - Latest release: over 1 year ago - 1 dependent package - 171 downloads last month - 0 stars on GitHub - 1 maintainer
easyocr-window-scanner 0.10
OCR on screenshots with EasyOCR - Windows only
1 version - Latest release: over 2 years ago - 82 downloads last month - 1 star on GitHub - 1 maintainer
tesseract-sdk 0.8.4
Python SDK for Tesseract Models
17 versions - Latest release: over 1 year ago - 234 downloads last month - 2 maintainers
tightocr 0.4.4
Thin and pleasant wrapper for Tesseract OCR.
5 versions - Latest release: almost 11 years ago - 2 dependent repositories - 124 downloads last month - 24 stars on GitHub - 1 maintainer
polybiblioglot 0.2.0 💰
A tool to translate scanned books
2 versions - Latest release: about 4 years ago - 1 dependent repository - 95 downloads last month - 3 stars on GitHub - 1 maintainer
stb-automator 0.1.0
A library for automated control & testing of set-top boxes
1 version - Latest release: about 4 years ago - 1 dependent repository - 47 downloads last month - 3 stars on GitHub - 1 maintainer
tesseract-positional 0.1.2
Tool to save positional OCR data to a text file
3 versions - Latest release: over 1 year ago - 1 dependent repository - 114 downloads last month - 0 stars on GitHub - 1 maintainer
chronotva 1.0.1
ChronoTVA (The Chronomancer's Tesseract Visualization Aid) is a Python 3.9+ command-line tool des...
1 version - Latest release: over 1 year ago - 72 downloads last month - 1 star on GitHub - 1 maintainer
targimo 0.0.1
Targimo: An Artificial Intelligence Model that Revolutionizes Sentiment Analysis
1 version - Latest release: over 1 year ago - 144 downloads last month - 54,235 stars on GitHub - 1 maintainer
Top 1.8% on pypi.org
pymupdfb 1.24.10
MuPDF shared libraries for PyMuPDF.
18 versions - Latest release: 8 months ago - 4 dependent packages - 133 dependent repositories - 1.62 million downloads last month - 6,195 stars on GitHub - 1 maintainer