pypi.org "structured-data" keyword
xlstruct 0.1.0
LLM-powered Excel parser - define a Pydantic schema, get structured data from any Excel file
1 version - Latest release: about 17 hours ago - 0 stars on GitHub - 1 maintainer
petey 0.1.2
Petey - The Easy PDF Extractor
3 versions - Latest release: 1 day ago - 153 downloads last month - 1 maintainer
architxt 0.6.0 💰
ArchiTXT is a tool for structuring textual data into a valid database model. It is guided by a me...
12 versions - Latest release: 28 days ago - 336 downloads last month - 5 stars on GitHub - 1 maintainer
jobextractor 0.1.0
Professional job description extraction using multiple LLM providers
1 version - Latest release: about 2 months ago - 27 downloads last month - 1 maintainer
indoxraghelper 0.0.3
Indox Retrieval Augmentation
3 versions - Latest release: about 1 year ago - 18 downloads last month - 20 stars on GitHub - 1 maintainer
langextract 1.1.1
LangExtract: A library for extracting structured data from language models
18 versions - Latest release: 3 months ago - 130 thousand downloads last month - 33,499 stars on GitHub - 1 maintainer
iflow-mcp_langextract 1.1.0
LangExtract: A library for extracting structured data from language models
1 version - Latest release: 3 months ago - 16 downloads last month - 33,499 stars on GitHub - 1 maintainer
langextract-azureopenai 0.1.7
LangExtract provider plugin for Azure OpenAI
3 versions - Latest release: 7 months ago - 1.69 thousand downloads last month - 33,499 stars on GitHub - 1 maintainer
hydration 4.0.0
A module used to define Python objects that can be converted to (and from) bytes.
15 versions - Latest release: about 5 years ago - 1 dependent repository - 385 downloads last month - 16 stars on GitHub - 2 maintainers
autogluon.multimodal 1.5.0
Fast and Accurate ML in 3 Lines of Code
1,232 versions - Latest release: 3 months ago - 3 dependent packages - 15 dependent repositories - 186 thousand downloads last month - 6,566 stars on GitHub - 1 maintainer - Top 1.8% on pypi.org
nlstruct 0.2.0
Natural language structuring library
8 versions - Latest release: about 2 years ago - 1 dependent package - 1 dependent repository - 107 downloads last month - 21 stars on GitHub - 1 maintainer
metaminer 0.3.6
Extract structured information from documents using AI
7 versions - Latest release: 9 months ago - 20 downloads last month - 0 stars on GitHub - 1 maintainer
extract-monster 0.1.0
Python SDK for Extract Monster - Extract structured data from files and text using AI
1 version - Latest release: 5 months ago - 15 downloads last month - 1 maintainer
wagtail-herald 0.6.0 💰
SEO toolkit for Wagtail CMS - meta tags, Open Graph, Twitter Cards, and Schema.org structured data
8 versions - Latest release: 7 days ago - 104 downloads last month - 1 star on GitHub - 1 maintainer
open-receipt-extractor 0.1.0
Modular Python pipeline that converts raw receipt documents (images or PDFs) into structured, ana...
1 version - Latest release: 8 days ago - 97 downloads last month - 1 maintainer
llama-index-readers-llama-parse 0.5.1
llama-index readers llama-parse integration
10 versions - Latest release: 6 months ago - 6 dependent packages - 2.59 million downloads last month - 3,956 stars on GitHub - 1 maintainer
pdf-structify 0.1.18
Extract structured data from PDFs using LLMs with sklearn-like API
19 versions - Latest release: about 2 months ago - 433 downloads last month - 1 maintainer - Top 7.3% on pypi.org
open-xtract 0.2.0
Extract structured data from documents, images, audio, and video using LLMs
6 versions - Latest release: about 2 months ago - 63 downloads last month - 13 stars on GitHub - 1 maintainer
llama-cloud-services 0.6.94
Tailored SDK clients for LlamaCloud services.
93 versions - Latest release: 25 days ago - 29.3 million downloads last month - 3,956 stars on GitHub - 1 maintainer
indoxrag 0.1.1
Indox Retrieval Augmentation
5 versions - Latest release: about 1 year ago - 23 downloads last month - 19 stars on GitHub - 1 maintainer
pandapy 2.2
Structured Numpy with Pandas a Click Away
22 versions - Latest release: about 6 years ago - 1 dependent repository - 40 downloads last month - 549 stars on GitHub - 1 maintainer
log-surgeon-ffi 0.1.0b10
Python FFI bindings for log-surgeon: high-performance parsing of unstructured logs into structure...
11 versions - Latest release: about 1 month ago - 1.19 thousand downloads last month - 0 stars on GitHub - 2 maintainers
contextgem 0.21.0
Effortless LLM extraction from documents
44 versions - Latest release: 16 days ago - 2.02 thousand downloads last month - 1,697 stars on GitHub - 1 maintainer
amphi-scheduler 0.9.7
Amphi Scheduler (JupyterLab extension + Python backend)
10 versions - Latest release: 22 days ago - 242 downloads last month - 1,278 stars on GitHub - 1 maintainer
autogluon.tabular 1.5.0
Fast and Accurate ML in 3 Lines of Code
1,848 versions - Latest release: 3 months ago - 7 dependent packages - 44 dependent repositories - 298 thousand downloads last month - 7,185 stars on GitHub - 3 maintainers - Top 1.2% on pypi.org
autogluon.common 1.5.0
Fast and Accurate ML in 3 Lines of Code
1,456 versions - Latest release: 3 months ago - 11 dependent packages - 23 dependent repositories - 242 thousand downloads last month - 6,566 stars on GitHub - 1 maintainer - Top 1.3% on pypi.org
lightfeed 0.1.6
Lightfeed API Client for Python
6 versions - Latest release: 9 months ago - 32 downloads last month - 5 stars on GitHub - 1 maintainer
autogluon.timeseries 1.5.0
Fast and Accurate ML in 3 Lines of Code
1,274 versions - Latest release: 3 months ago - 4 dependent repositories - 328 thousand downloads last month - 9,716 stars on GitHub - 1 maintainer - Top 4.0% on pypi.org
aglite-test.features 0.7.0b20230314
AutoML for Image, Text, and Tabular Data
8 versions - Latest release: almost 3 years ago - 2 dependent packages - 140 downloads last month - 9,716 stars on GitHub - 1 maintainer
aglite-test.common 0.7.0b20230314
AutoML for Image, Text, and Tabular Data
8 versions - Latest release: almost 3 years ago - 2 dependent packages - 165 downloads last month - 9,493 stars on GitHub - 1 maintainer
tableshot 0.1.0
Extract tables from PDFs into clean, structured data -- instantly. An MCP server for AI assistants.
1 version - Latest release: 18 days ago - 91 downloads last month - 1 maintainer
openllmindex 0.1.0
LLM-ready index generator for websites - spec, validator, and CLI tools
1 version - Latest release: 17 days ago - 1 maintainer
documiner 0.8.2
Advanced tool designed for text analysis and data mining in documents
1 version - Latest release: 8 months ago - 1 maintainer
openapi-client-generator 1.0.13 💰
OpenAPI Client Generator
15 versions - Latest release: about 5 years ago - 1 dependent repository - 142 downloads last month - 9 stars on GitHub - 1 maintainer
autogluon.core 1.1.1
Fast and Accurate ML in 3 Lines of Code
1,695 versions - Latest release: over 1 year ago - 11 dependent packages - 111 dependent repositories - 288 thousand downloads last month - 6,566 stars on GitHub - 3 maintainers - Top 0.9% on pypi.org
autogluon.features 1.1.1
Fast and Accurate ML in 3 Lines of Code
1,582 versions - Latest release: over 1 year ago - 7 dependent packages - 39 dependent repositories - 283 thousand downloads last month - 6,566 stars on GitHub - 3 maintainers - Top 1.3% on pypi.org
autogluon 1.1.1
Fast and Accurate ML in 3 Lines of Code
1,886 versions - Latest release: over 1 year ago - 8 dependent packages - 59 dependent repositories - 270 thousand downloads last month - 6,566 stars on GitHub - 2 maintainers - Top 1.7% on pypi.org
structcast 1.1.4
Elegantly orchestrating structured data via a flexible and serializable workflow.
6 versions - Latest release: 17 days ago - 277 downloads last month - 0 stars on GitHub - 1 maintainer
indox 0.1.31
Indox Retrieval Augmentation
29 versions - Latest release: over 1 year ago - 154 downloads last month - 20 stars on GitHub - 2 maintainers
indoxgen 0.2.0
Indox Synthetic Data Generation
14 versions - Latest release: about 1 year ago - 101 downloads last month - 20 stars on GitHub - 1 maintainer
structer 0.3.0
Structer is a structurer written in Python based on C language structs.
4 versions - Latest release: over 1 year ago - 12 downloads last month - 1 star on GitHub - 1 maintainer
aglite-test 0.7.0b20230314
AutoML for Image, Text, and Tabular Data
8 versions - Latest release: almost 3 years ago - 159 downloads last month - 9,493 stars on GitHub - 1 maintainer
dynamic-baml 0.2.0
A standalone library for dynamic BoundaryML schema generation and LLM response parsing
4 versions - Latest release: 9 months ago - 43 downloads last month - 1 maintainer
autogluon.text 0.6.2
AutoML for Image, Text, and Tabular Data
837 versions - Latest release: about 3 years ago - 1 dependent package - 17 dependent repositories - 49.8 thousand downloads last month - 9,493 stars on GitHub - 1 maintainer - Top 2.5% on pypi.org
target_benchmark 0.1.3
Table Retrieval for Generative Tasks Benchmark
4 versions - Latest release: 10 months ago - 48 downloads last month - 22 stars on GitHub - 1 maintainer
mseep-kreuzberg 3.13.5
Document intelligence framework for Python - Extract text, metadata, and structured data from div...
4 versions - Latest release: 6 months ago - 43 downloads last month - 2,454 stars on GitHub - 1 maintainer
autogluon-tonyhu-test 1.0.5b20240302
AutoML for Image, Text, and Tabular Data
4 versions - Latest release: about 2 years ago - 748 downloads last month - 9,493 stars on GitHub - 1 maintainer
tabular-ml-toolkit 0.0.35
A helper library to jumpstart your machine learning project based on tabular or structured data.
35 versions - Latest release: about 4 years ago - 1 dependent repository - 141 downloads last month - 1 star on GitHub - 1 maintainer
gittxt 1.7.7
Gittxt: Get Text from Git - Optimized for AI.
18 versions - Latest release: 11 months ago - 93 downloads last month - 0 stars on GitHub - 1 maintainer
langstruct 0.2.0
LLM-powered structured information extraction using DSPy optimization
6 versions - Latest release: 5 months ago - 519 downloads last month - 55 stars on GitHub - 1 maintainer
outformer 0.1.3
Structure Outputs from Language Models
4 versions - Latest release: 9 months ago - 40 downloads last month - 10 stars on GitHub - 1 maintainer
aglite-test.tabular 0.7.0b20230314
AutoML for Image, Text, and Tabular Data
8 versions - Latest release: almost 3 years ago - 1 dependent package - 141 downloads last month - 9,493 stars on GitHub - 1 maintainer
lightfeed-sdk 0.1.7
Lightfeed SDK for Python
1 version - Latest release: 9 months ago - 7 downloads last month - 5 stars on GitHub - 1 maintainer
autogluon-tonyhu-test.multimodal 1.0.5b20240302
AutoML for Image, Text, and Tabular Data
5 versions - Latest release: about 2 years ago - 1 dependent package - 788 downloads last month - 9,493 stars on GitHub - 1 maintainer
aglite-test.core 0.7.0b20230314
AutoML for Image, Text, and Tabular Data
8 versions - Latest release: almost 3 years ago - 2 dependent packages - 161 downloads last month - 9,493 stars on GitHub - 1 maintainer
autogluon.vision 0.6.2
AutoML for Image, Text, and Tabular Data
846 versions - Latest release: about 3 years ago - 1 dependent package - 18 dependent repositories - 53.7 thousand downloads last month - 9,493 stars on GitHub - 1 maintainer - Top 2.5% on pypi.org
autotabular 0.12.0
Automatic machine learning for tabular data.
1 version - Latest release: over 4 years ago - 1 dependent repository - 31 downloads last month - 70 stars on GitHub - 1 maintainer
autogluon.eda 0.8.3
AutoML for Image, Text, and Tabular Data
189 versions - Latest release: almost 2 years ago - 1.33 thousand downloads last month - 9,493 stars on GitHub - 1 maintainer - Top 8.8% on pypi.org
doc2json 0.1.0
Turn unstructured documents into clean JSON with auto-generated schemas
1 version - Latest release: 3 months ago - 61 downloads last month - 1 maintainer
jertl 0.1.3
A minimum viable package for processing structured data
4 versions - Latest release: over 3 years ago - 15 downloads last month - 0 stars on GitHub - 1 maintainer
open-parser 0.0.7
Open parser for all.
7 versions - Latest release: almost 2 years ago - 48 downloads last month - 130 stars on GitHub - 1 maintainer
indoxarcg 0.0.14
Indox Retrieval Augmentation
13 versions - Latest release: 12 months ago - 59 downloads last month - 20 stars on GitHub - 2 maintainers
superpipe-py 0.1.9
Build unstructured to structured data transformation pipelines
8 versions - Latest release: over 1 year ago - 38 downloads last month - 108 stars on GitHub - 1 maintainer
autogluon-tonyhu-test.core 1.0.5b20240302
AutoML for Image, Text, and Tabular Data
5 versions - Latest release: about 2 years ago - 1 dependent package - 781 downloads last month - 9,493 stars on GitHub - 1 maintainer
autogluon-tonyhu-test.common 1.0.5b20240302
AutoML for Image, Text, and Tabular Data
5 versions - Latest release: about 2 years ago - 4 dependent packages - 741 downloads last month - 9,493 stars on GitHub - 1 maintainer
sibila 0.4.5
Structured queries from local or online LLM models
15 versions - Latest release: over 1 year ago - 17 downloads last month - 42 stars on GitHub - 1 maintainer
deeptables 0.2.6
Deep-learning Toolkit for Tabular datasets
14 versions - Latest release: about 2 years ago - 3 dependent repositories - 249 downloads last month - 694 stars on GitHub - 1 maintainer - Top 9.4% on pypi.org
sentinelsearcher 0.2.0
AI-powered web search and structured data extraction using Anthropic Claude or OpenAI GPT
1 version - Latest release: 3 months ago - 151 downloads last month - 1 maintainer
bitbuffet 1.0.2
Python SDK for the bitbuffet API - BitBuffet
7 versions - Latest release: 6 months ago - 53 downloads last month - 1 star on GitHub - 1 maintainer
scrape-schema 0.6.3
A library for converting any text (xml, html, plain text, stdout, etc) to python datatypes
37 versions - Latest release: over 2 years ago - 2 dependent packages - 211 downloads last month - 4 stars on GitHub - 1 maintainer
llm-parse 0.1.5
Parse data from documents optimised for downstream llm tasks.
6 versions - Latest release: 9 months ago - 41 downloads last month - 3,859 stars on GitHub - 1 maintainer
data-analysis-framework 2.0.0
AI-powered analysis framework for structured data files and databases - part of the unified analy...
3 versions - Latest release: 4 months ago - 53 downloads last month - 0 stars on GitHub - 1 maintainer
llmextract 0.2.0
A library to extract structured information from unstructured text using LLMs, powered by LangChain.
2 versions - Latest release: 5 months ago - 19 downloads last month - 1 maintainer
thepipe-api 1.7.1
Get clean data from tricky documents, powered by VLMs.
59 versions - Latest release: 5 months ago - 7.55 thousand downloads last month - 1,317 stars on GitHub - 1 maintainer
jsonschema-default 1.8.1
Create default objects from a JSON schema
15 versions - Latest release: 12 months ago - 1 dependent package - 1 dependent repository - 9.96 thousand downloads last month - 10 stars on GitHub - 1 maintainer
autogluon-tonyhu-test.features 1.0.5b20240302
AutoML for Image, Text, and Tabular Data
5 versions - Latest release: about 2 years ago - 3 dependent packages - 779 downloads last month - 9,493 stars on GitHub - 1 maintainer
cleanlab-cli 0.1.14
Command line interface for all things Cleanlab Studio
16 versions - Latest release: over 3 years ago - 185 downloads last month - 21 stars on GitHub - 3 maintainers
lmnr-baml 0.40.1
LMNR BAML for Python
2 versions - Latest release: over 1 year ago - 345 downloads last month - 3,272 stars on GitHub - 1 maintainer
autogluon.mxnet 0.3.1
AutoML for Text, Image, and Tabular Data
420 versions - Latest release: over 4 years ago - 3 dependent packages - 9 dependent repositories - 10.3 thousand downloads last month - 9,341 stars on GitHub - 1 maintainer - Top 2.5% on pypi.org
autogluon-tonyhu-test.tabular 1.0.5b20240302
AutoML for Image, Text, and Tabular Data
5 versions - Latest release: about 2 years ago - 747 downloads last month - 9,493 stars on GitHub - 1 maintainer
tikara 0.1.6
The metadata and text content extractor for almost every file type.
6 versions - Latest release: about 1 year ago - 182 downloads last month - 4 stars on GitHub - 1 maintainer
autogluon-tonyhu-test.timeseries 1.0.5b20240302
AutoML for Image, Text, and Tabular Data
5 versions - Latest release: about 2 years ago - 743 downloads last month - 9,493 stars on GitHub - 1 maintainer
markdown-table-extractor 0.1.2
Robust structured data extraction from markdown text, built with literate programming using marim...
3 versions - Latest release: 3 months ago - 170 downloads last month - 1 maintainer
delta_stream 0.1.6
Efficient structured streaming for real-time LLM outputs
7 versions - Latest release: 10 months ago - 118 downloads last month - 3 stars on GitHub - 1 maintainer
structured-prompts 0.1.1
A modular package for managing structured prompts with any LLM API
2 versions - Latest release: 7 months ago - 16 downloads last month - 0 stars on GitHub - 1 maintainer
docstrange 1.1.8
Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, J...
19 versions - Latest release: 4 months ago - 3.9 thousand downloads last month - 935 stars on GitHub - 1 maintainer
jxon-schema 0.2.0
JSON with change tracking - A library for converting between JSON and schemas with change tracking
1 version - Latest release: about 1 year ago - 9 downloads last month - 1 star on GitHub - 1 maintainer
structured-output-cookbook 0.1.2
Extract structured data from text using LLMs with ready-to-use templates
1 version - Latest release: 8 months ago - 6 downloads last month - 1 star on GitHub - 1 maintainer
transfertab 0.1.0
A library to help transfer learn for structured data.
1 version - Latest release: over 4 years ago - 1 dependent repository - 10 downloads last month - 1 star on GitHub - 1 maintainer
autogluon.extra 0.3.1
AutoML for Text, Image, and Tabular Data
420 versions - Latest release: over 4 years ago - 1 dependent package - 9 dependent repositories - 20.7 thousand downloads last month - 9,860 stars on GitHub - 1 maintainer - Top 3.0% on pypi.org
leapocr 0.0.4
Official Python SDK for LeapOCR - Transform documents into structured data using AI-powered OCR
4 versions - Latest release: 4 months ago - 40 downloads last month - 1 maintainer
any-parser 0.0.26
Parser for all.
19 versions - Latest release: 6 months ago - 497 downloads last month - 129 stars on GitHub - 1 maintainer
ljson 0.5.4
A table dataformat based on json
16 versions - Latest release: almost 7 years ago - 1 dependent repository - 101 downloads last month - 2 stars on GitHub - 1 maintainer
openapi-type 0.2.0 💰
OpenAPI Type
23 versions - Latest release: over 3 years ago - 2 dependent repositories - 235 downloads last month - 11 stars on GitHub - 1 maintainer
exstruct 0.4.2
Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines
26 versions - Latest release: about 2 months ago - 1.91 thousand downloads last month - 2 stars on GitHub - 1 maintainer
indoxminer 0.1.5
Indox Data Extraction
19 versions - Latest release: about 1 year ago - 63 downloads last month - 20 stars on GitHub - 2 maintainers
jintra-aether 1.0.4
A lightweight, extensible framework for structured content authoring and validation with AI assis...
5 versions - Latest release: 5 months ago - 30 downloads last month - 0 stars on GitHub - 1 maintainer
cleanlab-studio 2.5.21
Client interface for all things Cleanlab Studio
128 versions - Latest release: about 1 year ago - 1 dependent repository - 2.42 thousand downloads last month - 21 stars on GitHub - 4 maintainers
typeit 0.27.2 💰
typeit brings typed data into your project
56 versions - Latest release: over 5 years ago - 4 dependent repositories - 502 downloads last month - 13 stars on GitHub - 1 maintainer
extrai-workflow 1.0.1
Structured data extraction with LLM majority vote
2 versions - Latest release: 3 months ago - 39 downloads last month - 2 stars on GitHub - 1 maintainer
Related Keywords
llm (39)
python (36)
machine-learning (33)
data-science (31)
natural-language-processing (29)
deep-learning (28)
tabular-data (27)
automl (27)
scikit-learn (26)
computer-vision (26)
transfer-learning (25)
time-series (24)
pytorch (24)
object-detection (24)
hyperparameter-optimization (24)
autogluon (24)
automated-machine-learning (24)
ensemble-learning (24)
forecasting (24)
gluon (24)
ai (20)
document (16)
pdf (15)
data-extraction (15)
unstructured-data (13)
json (13)
nlp (13)
openai (12)
rag (11)
extraction (10)
large-language-models (9)
document-processing (9)
pydantic (8)
ml (8)
document-parsing (7)
schema (7)
parsing (7)
gemini (7)
image-classification (7)
machine learning (7)
ocr (7)
index (6)
information-extraction (6)
LLM (6)
AI (6)
NLP (6)
serialization (5)
yaml (5)
text-extraction (5)
document-extraction (5)
text-processing (5)
language models (5)
deep learning (5)
python3 (4)
tables (4)
pdf-to-text (4)
pdf-to-markdown (4)
pdf-to-json (4)
document-parser (4)
validation (4)
document-analysis (4)
document-intelligence (4)
extract (4)
document-understanding (4)
RAG (4)
llm-extraction (4)
retrieval-augmented generation (4)
natural language processing (4)
anthropic (4)
parser (4)
excel (4)
deserialization (4)
data-labeling (4)
retrieval-augmented-generation (3)
multimodal (3)
data-analysis (3)
docx-to-markdown (3)
structured-generation (3)
pdf-document-processor (3)
pdf-to-excel (3)
web-scraping (3)
ppt-to-json (3)
ppt-to-markdown (3)
pptx (3)
automation (3)
database (3)
document-ai (3)
artificial-intelligence (3)
mcp (3)
mypy (3)
content-extraction (3)
table-extraction (3)
typing (3)
etl (3)
data (3)
information-extration (3)
gemini-pro (3)
gemini-flash (3)
gemini-api (3)
gemini-ai (3)