pypi.org "format-detection" keyword
View the packages on the pypi.org package registry that are tagged with the "format-detection" keyword.
magicconvert 0.1.3
MagicConvert is a Python library that converts various document formats (PDF, DOCX, XLSX, PPTX, H...
3 versions - Latest release: 3 months ago - 43 downloads last month - 1 star on GitHub - 1 maintainer
tikara 0.1.6
The metadata and text content extractor for almost every file type.
6 versions - Latest release: 7 months ago - 22 downloads last month - 4 stars on GitHub - 1 maintainer
Related Keywords
ocr (2)
image-to-text (2)
text-extraction (2)
document-processing (2)
file-conversion (2)
text-processing (2)
retrieval-augmented-generation (1)
office-documents (1)
mime-type (1)
metadata (1)
language-detection (1)
information-extraction (1)
image-extraction (1)
format-identification (1)
file-type (1)
file-reader (1)
file-processing (1)
file-parsing (1)
file-identification (1)
file-format (1)
file-analysis (1)
excel (1)
docx (1)
document-understanding (1)
pdf-to-text (1)
natural-language-processing (1)
ml (1)
metadata-extraction (1)
llm (1)
java (1)
word-documents (1)
unstructured-data (1)
tika (1)
text-recognition (1)
text-parsing (1)
text-mining (1)
text-analytics (1)
structured-data (1)
powerpoint (1)
pdf-parsing (1)
pdf (1)
document-text (1)
content-processing (1)
content-parsing (1)
content-management (1)
content-intelligence (1)
content-indexing (1)
content-extraction (1)
content-detection (1)
apache-tika (1)
tesseract-ocr (1)
python-library (1)
html-to-markdown (1)
pptx-to-markdown (1)
xlsx-to-markdown (1)
docx-to-markdown (1)
pdf-to-markdown (1)
markdown (1)
document-conversion (1)
document-reader (1)
document-parsing (1)
document-ocr (1)
document-metadata (1)
document-management (1)
document-intelligence (1)
document-indexing (1)
document-extraction (1)
document-converter (1)
document-classification (1)
document-automation (1)
document-analysis (1)
document-ai (1)
data-processing (1)
data-parsing (1)
data-extraction (1)
content-type (1)