An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

pypi.org "data-ingestion" keyword

View the packages on the pypi.org package registry that are tagged with the "data-ingestion" keyword.

ingestr 0.14.102
ingestr is a command-line application that ingests data from various sources and stores them in a...
194 versions - Latest release: about 23 hours ago - 18.5 thousand downloads last month - 3,324 stars on GitHub - 1 maintainer
intabular 0.1.1
Intelligent Table Data Ingestion - AI-powered CSV mapping and schema alignment
2 versions - Latest release: 6 months ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
llm-markdownify 0.3.0
Convert PDFs, images to high-quality Markdown using Vision LLMs.
3 versions - Latest release: 3 months ago - 33 downloads last month - 14 stars on GitHub - 1 maintainer
dbsconnector 1.4
Python package for connecting and importing data from different DataBases
5 versions - Latest release: about 1 year ago - 33 downloads last month - 11 stars on GitHub - 1 maintainer
feedunify 0.3.3
A high-performance, unifying library for data ingestion pipelines from multiple sources.
5 versions - Latest release: 3 months ago - 30 downloads last month - 4 stars on GitHub - 1 maintainer
skeem 0.1.1
Infer SQL DDL statements from tabular data
2 versions - Latest release: about 1 year ago - 18 downloads last month - 3 stars on GitHub - 3 maintainers
tensorus 0.1.0
An agentic tensor database with unified SDK, agent orchestration, and intelligent workflows for M...
10 versions - Latest release: 12 days ago - 20 downloads last month - 1 star on GitHub - 1 maintainer
Top 6.0% on pypi.org
oneagent-sdk 1.2.1
Dynatrace OneAgent SDK for Python
8 versions - Latest release: over 6 years ago - 2 dependent packages - 7 dependent repositories - 90.1 thousand downloads last month - 25 stars on GitHub - 1 maintainer
conduit-core 1.0.0
The dbt of data ingestion - declarative, reliable, and testable data pipelines
1 version - Latest release: 15 days ago
tracebloc-ingestor 0.2.6
A flexible data ingestion library for various file formats
10 versions - Latest release: 16 days ago - 540 downloads last month - 2 stars on GitHub - 4 maintainers
config-driven-data-ingestion 1.0.2
A comprehensive, config-driven data ingestion library for Python
3 versions - Latest release: 4 months ago - 28 downloads last month - 0 stars on GitHub - 1 maintainer
bors 0.3.6
A highly flexible and extensible service integration framework for scraping the web or consuming ...
8 versions - Latest release: about 7 years ago - 1 dependent repository - 66 downloads last month - 2 stars on GitHub - 1 maintainer
scrapy-meili-pipeline 0.1.1
A Scrapy pipeline that batches and indexes items into Meilisearch, with task tracking and index c...
1 version - Latest release: 21 days ago
milvus-ingest 0.1.2
High-performance data ingestion tool for Milvus vector database with vectorized operations
3 versions - Latest release: 4 months ago - 26 downloads last month - 1 maintainer
rekonify-python-sdk 2.1.4
The Rekonify Python SDK is the official client for seamless, async-ready integration with the Rek...
6 versions - Latest release: 5 months ago - 28 downloads last month - 1 maintainer
nanostream 0.1.22
Small-scale stream processing for ETL
24 versions - Latest release: over 6 years ago - 1 dependent repository - 98 downloads last month - 1 star on GitHub - 1 maintainer
zeroetl 0.1.0
A Python package for ingesting data into Iceberg tables using PySpark.
1 version - Latest release: 4 months ago - 9 downloads last month - 0 stars on GitHub - 1 maintainer
estceque 0.5
Elasticsearch ingest pipeline validation
5 versions - Latest release: 5 months ago - 20 downloads last month - 1 star on gitlab.com - 1 maintainer
Top 9.5% on pypi.org
squirrel-core 0.20.2
Squirrel is a Python library that enables ML teams to share, load, and transform data in a collab...
114 versions - Latest release: about 1 year ago - 2 dependent repositories - 337 downloads last month - 281 stars on GitHub - 2 maintainers
metalpipe 0.1.15
Modules for ETL Pipelines
15 versions - Latest release: over 6 years ago - 1 dependent repository - 19 downloads last month - 1 star on GitHub - 1 maintainer
squirrel-datasets-core 0.3.1
Squirrel public datasets collection
62 versions - Latest release: over 2 years ago - 2 dependent repositories - 126 downloads last month - 42 stars on GitHub - 2 maintainers
mustash 0.2.2
Portable ingest pipelines for documents
5 versions - Latest release: about 1 year ago - 25 downloads last month - 0 stars on gitlab.com - 1 maintainer
snowmover 0.0.1a0
A lightweight Python toolkit for Snowflake data moves: upload local text files with PUT/COPY INTO...
1 version - Latest release: 25 days ago - 1 maintainer
estuary-airbyte-cdk 0.1.56
A framework for writing Airbyte Connectors. Estuary's fork.
2 versions - Latest release: over 3 years ago - 19 downloads last month - 12,512 stars on GitHub - 1 maintainer
paimon-python 0.1.0 (removed)
Apache Paimon Python API
3 versions - Latest release: about 1 year ago - 206 downloads last month - 1,987 stars on GitHub - 1 maintainer