pypi.org "format-identification" keyword
View the packages on the pypi.org package registry that are tagged with the "format-identification" keyword.
tikara 0.1.6
The metadata and text content extractor for almost every file type.
6 versions - Latest release: 7 months ago - 22 downloads last month - 4 stars on GitHub - 1 maintainer
brunnhilde 1.9.6
A Siegfried-based digital archives reporting tool for directories and disk images
22 versions - Latest release: over 2 years ago - 1 dependent repository - 222 downloads last month - 82 stars on GitHub - 1 maintainer
formatscaper 0.1.0
Tool for building an overview of the file format landscape in a research data repository.
1 version - Latest release: about 1 year ago - 17 downloads last month - 223 stars on GitHub - 1 maintainer
jsonid 0.10.0
jsonid: a JSON identification tool
21 versions - Latest release: 20 days ago - 202 downloads last month - 8 stars on GitHub - 1 maintainer
pygfried 0.12.0
Siegfried as a Python extension
13 versions - Latest release: 3 months ago - 1 dependent package - 1 dependent repository - 1.53 thousand downloads last month - 7 stars on GitHub - 3 maintainers
sqlitefid 3.0.0rc2 💰
Library and executable for converting format identification reports such as DROID and Siegfried t...
2 versions - Latest release: almost 3 years ago - 1 dependent repository - 11 downloads last month - 4 stars on GitHub - 1 maintainer
Related Keywords
digital-preservation (5)
code4lib (5)
pronom (4)
siegfried (2)
file-analysis (2)
archives (2)
glam (2)
digipres (2)
llm (1)
java (1)
image-to-text (1)
word-documents (1)
unstructured-data (1)
tika (1)
text-recognition (1)
text-processing (1)
text-parsing (1)
text-mining (1)
text-extraction (1)
text-analytics (1)
structured-data (1)
powerpoint (1)
pdf-parsing (1)
pdf (1)
office-documents (1)
document-metadata (1)
siegfried-to-sqlite (1)
file-format-id (1)
droid-to-sqlite (1)
droid (1)
python (1)
yaml (1)
toml (1)
file-formats (1)
json (1)
disk-image (1)
diskimages (1)
identification (1)
characterization (1)
reporting (1)
retrieval-augmented-generation (1)
pdf-to-text (1)
natural-language-processing (1)
ml (1)
metadata-extraction (1)
document-management (1)
document-intelligence (1)
document-indexing (1)
document-extraction (1)
document-converter (1)
document-classification (1)
document-automation (1)
document-analysis (1)
document-ai (1)
data-processing (1)
data-parsing (1)
data-extraction (1)
content-type (1)
content-processing (1)
content-parsing (1)
content-management (1)
content-intelligence (1)
content-indexing (1)
content-extraction (1)
content-detection (1)
apache-tika (1)
ocr (1)
mime-type (1)
metadata (1)
language-detection (1)
information-extraction (1)
image-extraction (1)
format-detection (1)
file-type (1)
file-reader (1)
file-processing (1)
file-parsing (1)
file-identification (1)
file-format (1)
file-conversion (1)
excel (1)
docx (1)
document-understanding (1)
document-text (1)
document-reader (1)
document-processing (1)
document-parsing (1)
document-ocr (1)