An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

proxy.golang.org "data-engineering" keyword

Top 3.0% on proxy.golang.org
github.com/argoproj/argo-workflows v2.5.2+incompatible
Workflow Engine for Kubernetes
125 versions - Latest release: about 6 years ago - 3 dependent repositories - 16,138 stars on GitHub
Top 0.5% on proxy.golang.org
github.com/argoproj/argo-workflows/v3 v3.7.10
Workflow Engine for Kubernetes
169 versions - Latest release: 18 days ago - 95 dependent packages - 102 dependent repositories - 16,215 stars on GitHub
Top 2.0% on proxy.golang.org
github.com/rudderlabs/rudder-server v1.69.0
Privacy and Security focused Segment-alternative, in Golang and React
432 versions - Latest release: 2 days ago - 7 dependent packages - 1 dependent repository - 4,310 stars on GitHub
Top 8.6% on proxy.golang.org
github.com/saucam/airflow-runner v0.1.0
1 version - Latest release: about 4 years ago - 2 stars on GitHub
Top 1.3% on proxy.golang.org
github.com/Jeffail/benthos/v3 v3.65.0 💰
Fancy stream processing made operationally mundane
99 versions - Latest release: almost 4 years ago - 13 dependent packages - 9 dependent repositories - 5,851 stars on GitHub
Top 1.4% on proxy.golang.org
github.com/feast-dev/feast/sdk/go v0.9.4
The Open Source Feature Store for AI/ML
18 versions - Latest release: almost 4 years ago - 6 dependent packages - 17 dependent repositories - 6,405 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/ConduitIO/conduit v0.14.0
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
816 versions - Latest release: 9 months ago - 561 stars on GitHub
Top 3.9% on proxy.golang.org
github.com/conduitio/conduit v0.14.0
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
1,017 versions - Latest release: 9 months ago - 1 dependent package - 1 dependent repository - 561 stars on GitHub
Top 5.9% on proxy.golang.org
github.com/Conduitio/conduit v0.14.0
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
305 versions - Latest release: 9 months ago - 561 stars on GitHub
Top 3.0% on proxy.golang.org
github.com/treeverse/lakefs v1.79.0
lakeFS - Data version control for your data lake | Git for data
263 versions - Latest release: 4 days ago - 1 dependent package - 1 dependent repository - 4,999 stars on GitHub
Top 1.4% on proxy.golang.org
github.com/benthosdev/benthos/v4 v4.82.0 💰
Fancy stream processing made operationally mundane
183 versions - Latest release: about 23 hours ago - 36 dependent packages - 9 dependent repositories - 5,853 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/apache/airflow/go-sdk v1.0.0-beta1
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
1 version - Latest release: 5 months ago - 42,884 stars on GitHub
github.com/jitsucom/bulker/bulkerlib v0.0.0-20240119080811-2fb2a0d61cc2
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
19 versions - Latest release: about 2 years ago - 198 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/benthosdev/benthos-captain v0.1.0
A Kubernetes Operator to orchestrate Benthos pipelines
2 versions - Latest release: about 2 years ago - 44 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/apache/incubator-airflow v1.8.2
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
1 version - Latest release: over 8 years ago - 33,577 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/mirpo/chopdoc v0.0.12
A tool to split documents into chunks for RAG and LLM applications
10 versions - Latest release: 3 months ago - 2 stars on GitHub
Top 6.6% on proxy.golang.org
github.com/benmizrahi/duckspark v0.0.0-20240613130600-5ab7537d9fd2
duckspark - A DuckDB based distributed data processing engine
1 version - Latest release: over 1 year ago - 0 stars on GitHub
Top 6.5% on proxy.golang.org
github.com/raptor-ml/raptor v0.3.3
Transform your pythonic research to an artifact that engineers can deploy easily.
9 versions - Latest release: almost 2 years ago - 1 dependent repository - 64 stars on GitHub
Top 6.6% on proxy.golang.org
github.com/gear5sh/gear5/drivers/google-sheets
high performance better alternative to Airbyte, Singer, Meltano
Latest release: 5 days ago - 16 stars on GitHub
github.com/n0rdy/pippin v0.0.4
Go library for creating and managing data pipelines
2 versions - Latest release: over 2 years ago - 0 stars on GitHub
Top 5.5% on proxy.golang.org
github.com/datazip-inc/olake-ui v0.3.2
Frontend & BFF (Backend for frontend) for Olake. This includes the UI code and backend code for s...
26 versions - Latest release: 6 days ago - 24 stars on GitHub
Top 9.6% on proxy.golang.org
github.com/dagster-io/dagster v0.2.8
An orchestration platform for the development, production, and observation of data assets.
7 versions - Latest release: over 7 years ago - 14,215 stars on GitHub
Top 3.6% on proxy.golang.org
github.com/apache/superset v2021.41.0+incompatible
Apache Superset is a Data Visualization and Data Exploration Platform
118 versions - Latest release: over 4 years ago - 1 dependent repository - 68,353 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/PureML-Inc/PureML/packages/purebackend v0.0.0-20230218144505-250e8001829b
Track, version, compare and review your data and models.
1 version - Latest release: about 3 years ago - 106 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/setl-framework/setl v0.4.0
A simple Spark-powered ETL framework that just works 🍺
1 version - Latest release: about 6 years ago - 182 stars on GitHub
Top 6.3% on proxy.golang.org
github.com/ab180/lrmr v0.5.3
Less-Resilient MapReduce framework for Go
139 versions - Latest release: over 2 years ago - 32 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/benthosdev/connect/v4 v4.81.0 💰
Fancy stream processing made operationally mundane
180 versions - Latest release: 16 days ago - 7,744 stars on GitHub
Top 9.5% on proxy.golang.org
github.com/jitsucom/bulker/ingest
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
Latest release: about 1 month ago - 198 stars on GitHub
Top 3.3% on proxy.golang.org
github.com/filecoin-project/bacalhau v1.8.0
Compute over Data framework for public, transparent, and optionally verifiable computation
255 versions - Latest release: 9 months ago - 3 dependent packages - 3 dependent repositories - 237 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/airbnb/airflow v1.8.2
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
1 version - Latest release: over 8 years ago - 33,542 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/argoproj/argo-workflows/v2 v2.12.13
Workflow Engine for Kubernetes
117 versions - Latest release: over 4 years ago - 16,110 stars on GitHub
Top 6.1% on proxy.golang.org
github.com/hupe1980/dagster-pipes-go v0.0.10
Package dagsterpipes provides tools and utilities for communication and interaction within Dagste...
10 versions - Latest release: about 1 year ago - 0 stars on GitHub
Top 9.7% on proxy.golang.org
github.com/bacalhau-project/bacalhau/apps/job-info-consumer/consumer v0.0.0-20240125091916-849dab55b6fd
Community-driven, simple, yet powerful framework for fast, cost-effective distributed Compute ove...
14 versions - Latest release: about 2 years ago - 836 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/feast-dev/feast v0.60.0
The Open Source Feature Store for AI/ML
156 versions - Latest release: 17 days ago - 6,405 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/natun-ai/natun v0.3.3
Transform your pythonic research to an artifact that engineers can deploy easily.
3 versions - Latest release: almost 2 years ago - 64 stars on GitHub
github.com/jitsucom/bulker/sync-sidecar v0.0.0-20240119080811-2fb2a0d61cc2
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
24 versions - Latest release: about 2 years ago - 198 stars on GitHub
Top 7.5% on proxy.golang.org
github.com/raptor-ml/raptor/api/proto/gen/go v0.0.0-20230629150952-f7dc91871913
Transform your pythonic research to an artifact that engineers can deploy easily.
1 version - Latest release: over 2 years ago - 1 dependent package - 135 stars on GitHub
Top 7.7% on proxy.golang.org
github.com/sudohainguyen/self
Latest release: about 1 month ago - 0 stars on GitHub
Top 6.1% on proxy.golang.org
github.com/realtimedatalake/rtdl v0.2.0
rtdl makes it easy to build and maintain a real-time data lake
6 versions - Latest release: almost 4 years ago - 41 stars on GitHub
Top 6.6% on proxy.golang.org
github.com/gear5sh/gear5/drivers/hubspot
high performance better alternative to Airbyte, Singer, Meltano
Latest release: 2 months ago - 14 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/great-expectations/great_expectations v0.7.10
Always know what to expect from your data.
26 versions - Latest release: over 6 years ago - 10,884 stars on GitHub
Top 7.6% on proxy.golang.org
github.com/scaleway/cq-source-scaleway v1.0.0
CloudQuery Provider for Scaleway
2 versions - Latest release: about 3 years ago - 5 stars on GitHub
Top 5.8% on proxy.golang.org
github.com/jitsucom/bulker/jitsubase v0.0.0-20240119080811-2fb2a0d61cc2
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
48 versions - Latest release: about 2 years ago - 1 dependent repository - 198 stars on GitHub
Top 4.4% on proxy.golang.org
github.com/benthosdev/benthos v1.20.4 💰
Fancy stream processing made operationally mundane
295 versions - Latest release: almost 7 years ago - 5,853 stars on GitHub
Top 9.5% on proxy.golang.org
github.com/piyushsingariya/kaku/drivers/google-sheets
Shift is a high performance better alternative to Airbyte, Singer, Meltano
Latest release: 7 months ago - 7 stars on GitHub
Top 4.2% on proxy.golang.org
github.com/apache/incubator-devlake v1.0.2
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragment...
293 versions - Latest release: 9 months ago - 1 dependent package - 2,837 stars on GitHub
Top 5.8% on proxy.golang.org
github.com/bacalhau-project/bacalhau/pkg/executor/wasm/funcs/http/client
Package client provides a TinyGo-compatible client for the HTTP host functions.
Latest release: 3 months ago - 836 stars on GitHub
Top 9.1% on proxy.golang.org
github.com/piyushsingariya/shift v0.0.0-20231015093324-0ecaae21e861
Shift is a high performance better alternative to Airbyte, Singer, Meltano
1 version - Latest release: over 2 years ago - 1 dependent repository - 5 stars on GitHub
Top 4.1% on proxy.golang.org
github.com/airbytehq/airbyte v1.2.0
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files t...
507 versions - Latest release: over 1 year ago - 19,852 stars on GitHub
Top 1.5% on proxy.golang.org
github.com/Jeffail/benthos v1.20.4 💰
Fancy stream processing made operationally mundane
295 versions - Latest release: almost 7 years ago - 8 dependent packages - 6 dependent repositories - 5,851 stars on GitHub
github.com/piyushsingariya/syndicate/drivers/google-sheets v0.0.0-20230604051918-3ad8c64cefbc
Shift is a high performance better alternative to Airbyte, Singer, Meltano
1 version - Latest release: almost 3 years ago - 5 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/ericmjl/pyjanitor v0.32.20
Clean APIs for data cleaning. Python implementation of R package Janitor
77 versions - Latest release: 16 days ago - 1,285 stars on GitHub
Top 9.6% on proxy.golang.org
github.com/jitsucom/bulker/connectors/firebase v0.0.0-20240119080811-2fb2a0d61cc2
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
2 versions - Latest release: about 2 years ago - 198 stars on GitHub
Top 9.7% on proxy.golang.org
github.com/piyushsingariya/kaku/drivers/hubspot v0.0.0-20230630130252-054496f39abb
Shift is a high performance better alternative to Airbyte, Singer, Meltano
1 version - Latest release: over 2 years ago - 7 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/benthosdev/benthos/v3 v3.65.0 💰
Fancy stream processing made operationally mundane
99 versions - Latest release: almost 4 years ago - 8,138 stars on GitHub
Top 7.8% on proxy.golang.org
github.com/apache/incubator-devlake/backend
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragment...
Latest release: about 1 month ago - 2,843 stars on GitHub
Top 4.7% on proxy.golang.org
github.com/mlrun/mlrun v1.7.0
MLRun is an open source MLOps platform for quickly building and managing continuous ML applicatio...
617 versions - Latest release: over 1 year ago - 1,600 stars on GitHub
github.com/jitsucom/bulker/kafkabase v0.0.0-20240119080811-2fb2a0d61cc2
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
11 versions - Latest release: about 2 years ago - 198 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/eventual-inc/daft v0.7.3
Distributed DataFrame for Python designed for the cloud, powered by Rust
156 versions - Latest release: 21 days ago - 1,810 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/yobulkdev/yobulkdev v0.1.1
🔥 🔥 🔥 Open Source & AI driven Data Onboarding Platform: Free flatfile.com alternative
2 versions - Latest release: almost 3 years ago - 852 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/kevin-hanselman/dud v0.4.5
A lightweight CLI tool for versioning data alongside source code and building data pipelines.
10 versions - Latest release: over 1 year ago - 212 stars on GitHub
Top 5.7% on proxy.golang.org
github.com/manishjalui11/csv-diff v1.0.0
A fast and flexible command-line tool to compare CSV files using composite keys and generate cl...
1 version - Latest release: 10 months ago - 1 star on GitHub
Top 5.5% on proxy.golang.org
github.com/dataplane-app/dataplane/app v0.7.5
Dataplane is an Airflow inspired unified data platform with additional data mesh and RPA capabili...
32 versions - Latest release: over 2 years ago - 230 stars on GitHub
Top 4.1% on proxy.golang.org
github.com/prefecthq/prefect v2.14.6+incompatible
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
1 version - Latest release: over 2 years ago - 20,952 stars on GitHub
Top 5.7% on proxy.golang.org
github.com/nessi-dev/nessi v1.0.0
A Python-based data processing and analysis tool built with PySpark and Delta Lake
2 versions - Latest release: 10 months ago - 0 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/Avaiga/taipy v1.0.0
Turns Data and AI algorithms into production-ready web applications in no time.
1 version - Latest release: almost 4 years ago - 18,807 stars on GitHub
Top 5.8% on proxy.golang.org
github.com/snowflakedb/snowpark-python v1.44.0
Snowflake Snowpark Python API
66 versions - Latest release: 3 months ago - 317 stars on GitHub
Top 6.5% on proxy.golang.org
github.com/superstreamlabs/memphis v1.4.4
Memphis.dev is a highly scalable and effortless data streaming platform
28 versions - Latest release: almost 2 years ago - 3,398 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/redpanda-data/connect/v4 v4.81.0 💰
Fancy stream processing made operationally mundane
179 versions - Latest release: 16 days ago - 8,474 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/redpanda-data/connect v1.20.4 💰
Fancy stream processing made operationally mundane
295 versions - Latest release: almost 7 years ago - 8,474 stars on GitHub
Top 6.6% on proxy.golang.org
github.com/redpanda-data/connect/v3 v3.65.0 💰
Fancy stream processing made operationally mundane
99 versions - Latest release: almost 4 years ago - 8,474 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/redpanda-data/connect/public/bundle/enterprise/v4 v4.78.1 💰
Fancy stream processing made operationally mundane
16 versions - Latest release: about 2 months ago - 8,474 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/redpanda-data/connect/public/bundle/free/v4 v4.78.1 💰
Package free imports all free, open source plugin implementations that ship with Redpanda Connect...
16 versions - Latest release: about 2 months ago - 8,474 stars on GitHub
Top 9.6% on proxy.golang.org
github.com/bacalhau-project/amplify v1.0.1
Bacalhau Amplify: automatic enrichment, enhancement, and explanation of your data
45 versions - Latest release: almost 3 years ago - 12 stars on GitHub
Top 9.3% on proxy.golang.org
github.com/jitsucom/bulker/sync-controller
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
Latest release: 4 months ago - 198 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/phidatahq/phidata v2.7.10+incompatible
Build AI Assistants with function calling and connect LLMs to external tools.
315 versions - Latest release: about 1 year ago - 2,881 stars on GitHub
Top 9.6% on proxy.golang.org
github.com/rdagumampan/yuniql v1.3.15
Free and open source schema versioning and database migration made natively with .NET/6. NEW THIS...
14 versions - Latest release: almost 4 years ago - 423 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/treeverse/lakeFS v1.77.1
lakeFS - Data version control for your data lake | Git for data
260 versions - Latest release: 16 days ago - 4,999 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/treeverse/lakefS v1.77.1
lakeFS - Data version control for your data lake | Git for data
259 versions - Latest release: 16 days ago - 4,999 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/treeverse/lakeFs v1.77.1
lakeFS - Data version control for your data lake | Git for data
259 versions - Latest release: 16 days ago - 4,999 stars on GitHub
Top 5.7% on proxy.golang.org
github.com/treeverse/lakefs/modules/license/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 30 days ago - 4,999 stars on GitHub
Top 5.8% on proxy.golang.org
github.com/treeverse/lakefs/modules/authentication/factory v0.0.0-20260218151720-8685b97bbd1e
lakeFS - Data version control for your data lake | Git for data
10 versions - Latest release: 16 days ago - 4,999 stars on GitHub
Top 6.2% on proxy.golang.org
github.com/treeverse/lakefs/modules/cache
lakeFS - Data version control for your data lake | Git for data
Latest release: about 2 months ago - 4,999 stars on GitHub
Top 5.4% on proxy.golang.org
github.com/treeverse/lakefs/modules/api/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: about 2 months ago - 4,999 stars on GitHub
Top 3.4% on proxy.golang.org
github.com/treeverse/lakefs/modules/gateway/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 4 months ago - 4,999 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/treeverse/lakefs/modules/config/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 4 months ago - 4,999 stars on GitHub
Top 5.8% on proxy.golang.org
github.com/treeverse/lakefs/modules/auth/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 24 days ago - 4,999 stars on GitHub
Top 6.1% on proxy.golang.org
github.com/treeverse/lakefs/webui
lakeFS - Data version control for your data lake | Git for data
Latest release: 4 months ago - 4,999 stars on GitHub
Top 5.3% on proxy.golang.org
github.com/treeverse/lakefs/modules/catalog/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 3 months ago - 4,999 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/treeverse/lakefs/modules/block/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 24 days ago - 4,999 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/GoogleCloudPlatform/public-datasets-pipelines v5.2.0+incompatible
Cloud-native, data onboarding architecture for Google Cloud Datasets
32 versions - Latest release: over 3 years ago - 165 stars on GitHub
Top 3.6% on proxy.golang.org
github.com/bacalhau-project/bacalhau v1.8.0
Community-driven, simple, yet powerful framework for fast, cost-effective distributed Compute ove...
258 versions - Latest release: 9 months ago - 4 dependent packages - 1 dependent repository - 836 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/pureml-inc/pureml v0.3.6
Developer platform for production ML.
11 versions - Latest release: almost 3 years ago - 107 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/PureML-Inc/PureML v0.2.3
Developer platform for production ML.
7 versions - Latest release: about 3 years ago - 107 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/avaiga/taipy v1.0.0
Turns Data and AI algorithms into production-ready web applications in no time.
1 version - Latest release: almost 4 years ago - 18,608 stars on GitHub
Top 6.1% on proxy.golang.org
github.com/realdatadriven/etlx v1.4.17
This project is an ETL / ELT Framework powered by DuckDB, designed to seamlessly integrate and pr...
123 versions - Latest release: about 1 month ago - 20 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/meltano/meltano v4.1.2+incompatible
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-...
303 versions - Latest release: about 1 month ago - 2,232 stars on GitHub
Top 5.9% on proxy.golang.org
github.com/Multiwoven/multiwoven v0.101.0
🔥🔥🔥 Open source Reverse ETL - alternative to Hightouch and Census.
133 versions - Latest release: 22 days ago - 1,622 stars on GitHub
Top 8.1% on proxy.golang.org
github.com/jitsucom/bulker/bulkerapp v0.0.0-20231205094423-1fb51a06bb73
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
6 versions - Latest release: about 2 years ago - 198 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/gojek/feast/sdk/go v0.9.4
Feature Store for Machine Learning
18 versions - Latest release: almost 4 years ago - 4,083 stars on GitHub