pypi.org "structured-data-extraction" keyword
View the packages on the pypi.org package registry that are tagged with the "structured-data-extraction" keyword.
contextgem 0.19.2
Effortless LLM extraction from documents
40 versions - Latest release: about 1 month ago - 2.93 thousand downloads last month - 1,511 stars on GitHub - 1 maintainer
docstrange 1.1.8
Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, J...
19 versions - Latest release: 8 days ago - 3.48 thousand downloads last month - 935 stars on GitHub - 1 maintainer
llm-data-converter 2.2.0
Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPo...
23 versions - Latest release: 4 months ago - 133 downloads last month - 5 stars on GitHub - 1 maintainer
document-data-extractor 1.0.4
Best open-source document to markdown extractor for LLM training data. Convert PDF, Word, PowerPo...
5 versions - Latest release: 3 months ago - 38 downloads last month - 3 stars on GitHub - 1 maintainer
validex 0.0.3
A Python package to extract data from unstructured into structured format
3 versions - Latest release: 11 months ago - 3 downloads last month - 146 stars on GitHub - 1 maintainer
documiner 0.8.2
Advanced tool designed for text analysis and data mining in documents
1 version - Latest release: 4 months ago - 1 maintainer
Related Keywords
llm (6)
document-processing (5)
document-understanding (5)
batch-document-processing (3)
mineru-alternative (3)
markitdown-alternative (3)
marker-alternative (3)
docling-alternative (3)
unstructured-alternative (3)
ai-training-data (3)
rag (3)
ocr (3)
intelligent-document-processing (3)
image-processing (3)
pdf (3)
markdown (3)
document-conversion (3)
llm-extraction (3)
structured-data (3)
word-to-markdown (3)
powerpoint-to-markdown (3)
excel-to-markdown (3)
html-to-markdown (3)
text-extraction (3)
document-ai (3)
llm-ready-data (3)
layout-detection (3)
table-extraction (3)
nlp (3)
local-document-processing (3)
pdf-to-markdown (3)
document-to-markdown (3)
tesseract-alternative (3)
paddleocr-alternative (3)
document-parsing (3)
text-analysis (2)
text-processing (2)
semantic-analysis (2)
reference-mapping (2)
question-answering (2)
topic-extraction (2)
unstructured-data (2)
zero-shot (2)
ai (2)
offline-document-extractor (2)
offline-document-converter (2)
ppt-to-markdown (2)
prompt-free (2)
extraction-justifications (2)
entity-extraction (2)
docx (2)
document-qa (2)
document-pipeline (2)
document-intelligence (2)
document-extraction (2)
document-analysis (2)
document (2)
data-extraction (2)
contract-review (2)
contract-parsing (2)
contract-management (2)
contract-intelligence (2)
contract-automation (2)
contract-analysis (2)
context-aware (2)
content-extraction (2)
concept-extraction (2)
automated-prompting (2)
aspect-extraction (2)
artificial-intelligence (2)
no-prompt-engineering (2)
neural-segmentation (2)
multimodal (2)
multilingual (2)
machine-learning (2)
low-code (2)
llm-reasoning (2)
llm-library (2)
llm-framework (2)
fintech (2)
extraction-pipeline (2)
legaltech (2)
large-language-models (2)
generative-ai (2)
knowledge-extraction (2)
insights-extraction (2)
information-extraction (2)
pdf-to-json (1)
structured-data-capture (1)
tables (1)
extraction (1)
openai (1)
structured output parsing (1)
fastapi (1)
structured-output (1)
documiner (1)
pdf-parser (1)
image-to-markdown (1)
document-parser (1)
docstrange (1)