pypi.org : tikara
The metadata and text content extractor for almost every file type.
purl: pkg:pypi/tikara
Keywords: apache-tika, content-detection, content-extraction, content-indexing, content-intelligence, content-management, content-parsing, content-processing, content-type, data-extraction, data-parsing, data-processing, document-ai, document-analysis, document-automation, document-classification, document-converter, document-extraction, document-indexing, document-intelligence, document-management, document-metadata, document-ocr, document-parsing, document-processing, document-reader, document-text, document-understanding, docx, excel, file-analysis, file-conversion, file-format, file-identification, file-parsing, file-processing, file-reader, file-type, format-detection, format-identification, image-extraction, information-extraction, language-detection, metadata, mime-type, ocr, office-documents, pdf, pdf-parsing, powerpoint, structured-data, text-analytics, text-extraction, text-mining, text-parsing, text-processing, text-recognition, tika, unstructured-data, word-documents, image-to-text, java, llm, metadata-extraction, ml, natural-language-processing, pdf-to-text, retrieval-augmented-generation
License: Apache-2.0
Latest release: 3 months ago
First release: 3 months ago
Downloads: 214 last month
Stars: 1 on GitHub
Forks: 0 on GitHub
See more repository details: repos.ecosyste.ms
Last synced: 13 days ago