pypi.org "data-ingestion" keyword
View the packages on the pypi.org package registry that are tagged with the "data-ingestion" keyword.
ingestr 0.14.102
ingestr is a command-line application that ingests data from various sources and stores them in a...
194 versions - Latest release: about 23 hours ago - 18.5 thousand downloads last month - 3,324 stars on GitHub - 1 maintainer
intabular 0.1.1
Intelligent Table Data Ingestion - AI-powered CSV mapping and schema alignment
2 versions - Latest release: 6 months ago - 17 downloads last month - 0 stars on GitHub - 1 maintainer
llm-markdownify 0.3.0
Convert PDFs, images to high-quality Markdown using Vision LLMs.
3 versions - Latest release: 3 months ago - 33 downloads last month - 14 stars on GitHub - 1 maintainer
dbsconnector 1.4
Python package for connecting and importing data from different DataBases
5 versions - Latest release: about 1 year ago - 33 downloads last month - 11 stars on GitHub - 1 maintainer
feedunify 0.3.3
A high-performance, unifying library for data ingestion pipelines from multiple sources.
5 versions - Latest release: 3 months ago - 30 downloads last month - 4 stars on GitHub - 1 maintainer
skeem 0.1.1
Infer SQL DDL statements from tabular data
2 versions - Latest release: about 1 year ago - 18 downloads last month - 3 stars on GitHub - 3 maintainers
tensorus 0.1.0
An agentic tensor database with unified SDK, agent orchestration, and intelligent workflows for M...
10 versions - Latest release: 12 days ago - 20 downloads last month - 1 star on GitHub - 1 maintainer
Top 6.0% on pypi.org
oneagent-sdk 1.2.1
Dynatrace OneAgent SDK for Python
8 versions - Latest release: over 6 years ago - 2 dependent packages - 7 dependent repositories - 90.1 thousand downloads last month - 25 stars on GitHub - 1 maintainer
conduit-core 1.0.0
The dbt of data ingestion - declarative, reliable, and testable data pipelines
1 version - Latest release: 15 days ago
tracebloc-ingestor 0.2.6
A flexible data ingestion library for various file formats
10 versions - Latest release: 16 days ago - 540 downloads last month - 2 stars on GitHub - 4 maintainers
config-driven-data-ingestion 1.0.2
A comprehensive, config-driven data ingestion library for Python
3 versions - Latest release: 4 months ago - 28 downloads last month - 0 stars on GitHub - 1 maintainer
bors 0.3.6
A highly flexible and extensible service integration framework for scraping the web or consuming ...
8 versions - Latest release: about 7 years ago - 1 dependent repository - 66 downloads last month - 2 stars on GitHub - 1 maintainer
scrapy-meili-pipeline 0.1.1
A Scrapy pipeline that batches and indexes items into Meilisearch, with task tracking and index c...
1 version - Latest release: 21 days ago
milvus-ingest 0.1.2
High-performance data ingestion tool for Milvus vector database with vectorized operations
3 versions - Latest release: 4 months ago - 26 downloads last month - 1 maintainer
rekonify-python-sdk 2.1.4
The Rekonify Python SDK is the official client for seamless, async-ready integration with the Rek...
6 versions - Latest release: 5 months ago - 28 downloads last month - 1 maintainer
nanostream 0.1.22
Small-scale stream processing for ETL
24 versions - Latest release: over 6 years ago - 1 dependent repository - 98 downloads last month - 1 star on GitHub - 1 maintainer
zeroetl 0.1.0
A Python package for ingesting data into Iceberg tables using PySpark.
1 version - Latest release: 4 months ago - 9 downloads last month - 0 stars on GitHub - 1 maintainer
estceque 0.5
Elasticsearch ingest pipeline validation
5 versions - Latest release: 5 months ago - 20 downloads last month - 1 star on gitlab.com - 1 maintainer
Top 9.5% on pypi.org
squirrel-core 0.20.2
Squirrel is a Python library that enables ML teams to share, load, and transform data in a collab...
114 versions - Latest release: about 1 year ago - 2 dependent repositories - 337 downloads last month - 281 stars on GitHub - 2 maintainers
metalpipe 0.1.15
Modules for ETL Pipelines
15 versions - Latest release: over 6 years ago - 1 dependent repository - 19 downloads last month - 1 star on GitHub - 1 maintainer
squirrel-datasets-core 0.3.1
Squirrel public datasets collection
62 versions - Latest release: over 2 years ago - 2 dependent repositories - 126 downloads last month - 42 stars on GitHub - 2 maintainers
mustash 0.2.2
Portable ingest pipelines for documents
5 versions - Latest release: about 1 year ago - 25 downloads last month - 0 stars on gitlab.com - 1 maintainer
snowmover 0.0.1a0
A lightweight Python toolkit for Snowflake data moves: upload local text files with PUT/COPY INTO...
1 version - Latest release: 25 days ago - 1 maintainer
estuary-airbyte-cdk 0.1.56
A framework for writing Airbyte Connectors. Estuary's fork.
2 versions - Latest release: over 3 years ago - 19 downloads last month - 12,512 stars on GitHub - 1 maintainer
paimon-python 0.1.0 removed
Apache Paimon Python API
3 versions - Latest release: about 1 year ago - 206 downloads last month - 1,987 stars on GitHub - 1 maintainer
Related Keywords
etl (8)
python (5)
data-pipeline (4)
ai (4)
data-integration (3)
database (3)
pytorch (3)
snowflake (3)
csv (3)
pipeline (3)
data-engineering (3)
ingest (2)
es-query (2)
es_query (2)
elasticsearch (2)
stream-processing (2)
sdk (2)
agent (2)
json (2)
open-source (2)
cloud-computing (2)
collaboration (2)
computer-vision (2)
cv (2)
data-mesh (2)
data-science (2)
dataops (2)
datasets (2)
deep-learning (2)
distributed (2)
machine-learning (2)
ml (2)
natural-language-processing (2)
tensorflow (2)
pandas (2)
llm (2)
bigquery (2)
excel-import (1)
integration (1)
pay-in (1)
payout (1)
erp (1)
banking (1)
png (1)
pyspark (1)
iceberg (1)
pdf (1)
ocr (1)
markdown (1)
litellm (1)
pydantic (1)
jpg (1)
jpeg (1)
image (1)
docx (1)
search (1)
milvus (1)
vector-database (1)
bulk-import (1)
data-generation (1)
high-performance (1)
vectorization (1)
rekonify (1)
reconciliation (1)
finance (1)
transactions (1)
payment (1)
automation (1)
api-client (1)
async (1)
event-tracking (1)
financial-data (1)
accounting (1)
python-sdk (1)
transaction-matching (1)
data-warehouse (1)
database-tools (1)
airbyte (1)
connector-development-kit (1)
cdk (1)
change-data-capture (1)
data (1)
data-analysis (1)
data-collection (1)
elt (1)
java (1)
redshift (1)
big-data (1)
flink (1)
paimon (1)
real-time-analytics (1)
spark (1)
streaming-datalake (1)
table-store (1)
schema-alignment (1)
data-mapping (1)
postgresql (1)
mssql (1)
internal (1)
ingestion-pipeline (1)