An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "structured-data-extraction" keyword

View the packages on the pypi.org package registry that are tagged with the "structured-data-extraction" keyword.

contextgem 0.19.2
Effortless LLM extraction from documents
40 versions - Latest release: about 1 month ago - 2.93 thousand downloads last month - 1,511 stars on GitHub - 1 maintainer

docstrange 1.1.8
Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, J...
19 versions - Latest release: 8 days ago - 3.48 thousand downloads last month - 935 stars on GitHub - 1 maintainer

llm-data-converter 2.2.0
Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPo...
23 versions - Latest release: 4 months ago - 133 downloads last month - 5 stars on GitHub - 1 maintainer

document-data-extractor 1.0.4
Best open-source document to markdown extractor for LLM training data. Convert PDF, Word, PowerPo...
5 versions - Latest release: 3 months ago - 38 downloads last month - 3 stars on GitHub - 1 maintainer

validex 0.0.3
A Python package to extract data from unstructured into structured format
3 versions - Latest release: 11 months ago - 3 downloads last month - 146 stars on GitHub - 1 maintainer

documiner 0.8.2
Advanced tool designed for text analysis and data mining in documents
1 version - Latest release: 4 months ago - 1 maintainer

Related Keywords
llm (6), document-processing (5), document-understanding (5), batch-document-processing (3), mineru-alternative (3), markitdown-alternative (3), marker-alternative (3), docling-alternative (3), unstructured-alternative (3), ai-training-data (3), rag (3), ocr (3), intelligent-document-processing (3), image-processing (3), pdf (3), markdown (3), document-conversion (3), llm-extraction (3), structured-data (3), word-to-markdown (3), powerpoint-to-markdown (3), excel-to-markdown (3), html-to-markdown (3), text-extraction (3), document-ai (3), llm-ready-data (3), layout-detection (3), table-extraction (3), nlp (3), local-document-processing (3), pdf-to-markdown (3), document-to-markdown (3), tesseract-alternative (3), paddleocr-alternative (3), document-parsing (3), text-analysis (2), text-processing (2), semantic-analysis (2), reference-mapping (2), question-answering (2), topic-extraction (2), unstructured-data (2), zero-shot (2), ai (2), offline-document-extractor (2), offline-document-converter (2), ppt-to-markdown (2), prompt-free (2), extraction-justifications (2), entity-extraction (2), docx (2), document-qa (2), document-pipeline (2), document-intelligence (2), document-extraction (2), document-analysis (2), document (2), data-extraction (2), contract-review (2), contract-parsing (2), contract-management (2), contract-intelligence (2), contract-automation (2), contract-analysis (2), context-aware (2), content-extraction (2), concept-extraction (2), automated-prompting (2), aspect-extraction (2), artificial-intelligence (2), no-prompt-engineering (2), neural-segmentation (2), multimodal (2), multilingual (2), machine-learning (2), low-code (2), llm-reasoning (2), llm-library (2), llm-framework (2), fintech (2), extraction-pipeline (2), legaltech (2), large-language-models (2), generative-ai (2), knowledge-extraction (2), insights-extraction (2), information-extraction (2), pdf-to-json (1), structured-data-capture (1), tables (1), extraction (1), openai (1), structured output parsing (1), fastapi (1), structured-output (1), documiner (1), pdf-parser (1), image-to-markdown (1), document-parser (1), docstrange (1)