pypi.org : spark-pdf-python
PDF DataSource for Apache Spark in Python
purl: pkg:pypi/spark-pdf-python
Keywords: big-data, data-engineering, data-extraction, data-science, ocr, ocr-recognition, pdf, pdf-document, pdf-document-processor, spark, spark-datasource, tesseract, tesseract-ocr
License: AGPL-3.0
Latest release: 2 months ago
First release: 2 months ago
Downloads: 93 last month
Stars: 45 on GitHub
Forks: 4 on GitHub
See more repository details: repos.ecosyste.ms
Last synced: 13 days ago
pyspark-pdf 0.1.0rc9
Spark-Pdf is a library for processing documents using Apache Spark.
8 versions - Latest release: 6 months ago - 254 downloads last month - 45 stars on GitHub - 1 maintainer