An open API service providing package, version and dependency metadata of many open source software ecosystems and registries.

proxy.golang.org "data-engineering" keyword

View the packages on the proxy.golang.org package registry that are tagged with the "data-engineering" keyword.

Top 9.3% on proxy.golang.org
github.com/jitsucom/bulker/ingress-manager
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
Latest release: 3 months ago - 198 stars on GitHub
Top 9.7% on proxy.golang.org
github.com/jitsucom/bulker/admin v0.0.0-20240119080811-2fb2a0d61cc2
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
2 versions - Latest release: about 2 years ago - 198 stars on GitHub
Top 0.5% on proxy.golang.org
github.com/argoproj/argo-workflows/v3 v3.7.9
Workflow Engine for Kubernetes
167 versions - Latest release: 14 days ago - 95 dependent packages - 102 dependent repositories - 16,215 stars on GitHub
Top 5.3% on proxy.golang.org
github.com/DAGWorks-Inc/hamilton v1.89.0-incubating-RC1
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting da...
2 versions - Latest release: 5 months ago - 2,269 stars on GitHub
Top 9.3% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/destination/snowflake
The open source ELT framework powered by Apache Arrow
Latest release: about 7 hours ago - 6,245 stars on GitHub
Top 4.6% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/source/cloudflare v0.1.7
The open source ELT framework powered by Apache Arrow
6 versions - Latest release: over 3 years ago - 6,245 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/Eventual-Inc/Daft v0.7.2
Distributed DataFrame for Python designed for the cloud, powered by Rust
155 versions - Latest release: 27 days ago - 1,810 stars on GitHub
Top 5.5% on proxy.golang.org
github.com/datazip-inc/olake-ui v0.3.0
Frontend & BFF (Backend for frontend) for Olake. This includes the UI code and backend code for s...
23 versions - Latest release: about 14 hours ago - 24 stars on GitHub
Top 5.3% on proxy.golang.org
github.com/dagworks-inc/hamilton v1.89.0-incubating-RC1
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting da...
2 versions - Latest release: 5 months ago - 2,269 stars on GitHub
Top 3.0% on proxy.golang.org
github.com/treeverse/lakefs v1.76.0
lakeFS - Data version control for your data lake | Git for data
259 versions - Latest release: 22 days ago - 1 dependent package - 1 dependent repository - 4,999 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/evidence-dev/evidence v2.2.1+incompatible
Business intelligence as code: build fast, interactive data visualizations in SQL and markdown
2 versions - Latest release: almost 3 years ago - 5,420 stars on GitHub
Top 8.6% on proxy.golang.org
github.com/saucam/airflow-runner v0.1.0
1 version - Latest release: about 4 years ago - 2 stars on GitHub
Top 1.4% on proxy.golang.org
github.com/feast-dev/feast/sdk/go v0.9.4
The Open Source Feature Store for AI/ML
18 versions - Latest release: over 3 years ago - 6 dependent packages - 17 dependent repositories - 6,405 stars on GitHub
Top 6.1% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/transformer/test v1.1.29
The open source ELT framework powered by Apache Arrow
31 versions - Latest release: 28 days ago - 6,220 stars on GitHub
Top 4.6% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/source/test v1.1.4
The open source ELT framework powered by Apache Arrow
6 versions - Latest release: over 3 years ago - 6,220 stars on GitHub
Top 3.6% on proxy.golang.org
github.com/apache/airflow v1.8.2
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
1 version - Latest release: over 8 years ago - 1 dependent repository - 42,884 stars on GitHub
Top 3.0% on proxy.golang.org
github.com/argoproj/argo-workflows v2.5.2+incompatible
Workflow Engine for Kubernetes
125 versions - Latest release: almost 6 years ago - 3 dependent repositories - 16,138 stars on GitHub
Top 6.4% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/transformer/basic v1.4.0
The open source ELT framework powered by Apache Arrow
6 versions - Latest release: about 1 year ago - 6,239 stars on GitHub
Top 1.4% on proxy.golang.org
github.com/benthosdev/benthos/v4 v4.80.1 💰
Fancy stream processing made operationally mundane
178 versions - Latest release: 6 days ago - 36 dependent packages - 9 dependent repositories - 5,853 stars on GitHub
Top 1.3% on proxy.golang.org
github.com/Jeffail/benthos/v3 v3.65.0 💰
Fancy stream processing made operationally mundane
99 versions - Latest release: almost 4 years ago - 13 dependent packages - 9 dependent repositories - 5,851 stars on GitHub
Top 5.8% on proxy.golang.org
github.com/treeverse/lakefs/modules/auth/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: about 24 hours ago - 4,999 stars on GitHub
Top 4.6% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/destination/s3 v0.0.0-20240124172713-41d4fe688626
The open source ELT framework powered by Apache Arrow
1,248 versions - Latest release: about 2 years ago - 6,220 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/googlecloudplatform/public-datasets-pipelines v5.2.0+incompatible
Cloud-native, data onboarding architecture for Google Cloud Datasets
32 versions - Latest release: over 3 years ago - 165 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/treeverse/lakefs/modules/block/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 1 day ago - 4,999 stars on GitHub
Top 5.7% on proxy.golang.org
github.com/treeverse/lakefs/modules/license/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 7 days ago - 4,999 stars on GitHub
Top 4.9% on proxy.golang.org
github.com/dataform-co/dataform v0.1.0
Dataform is a framework for managing SQL based data operations in BigQuery
71 versions - Latest release: almost 7 years ago - 932 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/ConduitIO/conduit v0.14.0
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
796 versions - Latest release: 8 months ago - 561 stars on GitHub
Top 3.9% on proxy.golang.org
github.com/conduitio/conduit v0.14.0
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
995 versions - Latest release: 8 months ago - 1 dependent package - 1 dependent repository - 561 stars on GitHub
Top 5.9% on proxy.golang.org
github.com/Conduitio/conduit v0.14.0
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
285 versions - Latest release: 8 months ago - 561 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/benthosdev/benthos-captain v0.1.0
A Kubernetes Operator to orchestrate Benthos pipelines
2 versions - Latest release: about 2 years ago - 44 stars on GitHub
Top 4.6% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/source/gandi v0.0.0-20230701042655-edb7ed838968
The open source ELT framework powered by Apache Arrow
1,283 versions - Latest release: over 2 years ago - 6,245 stars on GitHub
Top 6.5% on proxy.golang.org
github.com/raptor-ml/raptor v0.3.3
Transform your pythonic research to an artifact that engineers can deploy easily.
9 versions - Latest release: almost 2 years ago - 1 dependent repository - 64 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/mirpo/chopdoc v0.0.12
A tool to split documents into chunks for RAG and LLM applications
10 versions - Latest release: 3 months ago - 2 stars on GitHub
Top 6.6% on proxy.golang.org
github.com/benmizrahi/duckspark
duckspark - A DuckDB based distributed data processing engine
Latest release: 11 days ago - 0 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/treeverse/lakeFs v1.76.0
lakeFS - Data version control for your data lake | Git for data
257 versions - Latest release: 22 days ago - 4,999 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/apache/incubator-airflow v1.8.2
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
1 version - Latest release: over 8 years ago - 33,577 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/PuremlHQ/PureML/packages/purebackend v0.0.0-20230228183052-559f631db95b
Developer platform for production ML.
1 version - Latest release: almost 3 years ago - 107 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/apache/airflow/go-sdk v1.0.0-beta1
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
1 version - Latest release: 4 months ago - 42,884 stars on GitHub
github.com/jitsucom/bulker/bulkerlib v0.0.0-20240119080811-2fb2a0d61cc2
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
19 versions - Latest release: about 2 years ago - 198 stars on GitHub
Top 2.0% on proxy.golang.org
github.com/rudderlabs/rudder-server v1.62.1
Privacy and Security focused Segment-alternative, in Golang and React
401 versions - Latest release: 4 months ago - 7 dependent packages - 1 dependent repository - 4,310 stars on GitHub
Top 6.6% on proxy.golang.org
github.com/gear5sh/gear5/drivers/google-sheets
high performance better alternative to Airbyte, Singer, Meltano
Latest release: 13 days ago - 16 stars on GitHub
github.com/n0rdy/pippin v0.0.4
Go library for creating and managing data pipelines
2 versions - Latest release: about 2 years ago - 0 stars on GitHub
Top 9.6% on proxy.golang.org
github.com/dagster-io/dagster v0.2.8
An orchestration platform for the development, production, and observation of data assets.
7 versions - Latest release: over 7 years ago - 14,215 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/redpanda-data/connect/v4 v4.78.0 💰
Fancy stream processing made operationally mundane
173 versions - Latest release: 26 days ago - 8,474 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/redpanda-data/connect/public/bundle/free/v4 v4.78.1 💰
Package free imports all free, open source plugin implementations that ship with Redpanda Connect...
16 versions - Latest release: 23 days ago - 8,474 stars on GitHub
Top 6.6% on proxy.golang.org
github.com/redpanda-data/connect/v3 v3.65.0 💰
Fancy stream processing made operationally mundane
99 versions - Latest release: almost 4 years ago - 8,474 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/redpanda-data/connect/public/bundle/enterprise/v4 v4.78.1 💰
Fancy stream processing made operationally mundane
16 versions - Latest release: 23 days ago - 8,474 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/redpanda-data/connect v1.20.4 💰
Fancy stream processing made operationally mundane
295 versions - Latest release: almost 7 years ago - 8,474 stars on GitHub
Top 3.6% on proxy.golang.org
github.com/apache/superset v2021.41.0+incompatible
Apache Superset is a Data Visualization and Data Exploration Platform
118 versions - Latest release: over 4 years ago - 1 dependent repository - 68,353 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/PureML-Inc/PureML/packages/purebackend v0.0.0-20230218144505-250e8001829b
Track, version, compare and review your data and models.
1 version - Latest release: almost 3 years ago - 106 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/setl-framework/setl v0.4.0
A simple Spark-powered ETL framework that just works 🍺
1 version - Latest release: about 6 years ago - 182 stars on GitHub
Top 6.3% on proxy.golang.org
github.com/ab180/lrmr v0.5.3
Less-Resilient MapReduce framework for Go
139 versions - Latest release: over 2 years ago - 32 stars on GitHub
Top 6.8% on proxy.golang.org
github.com/benthosdev/connect/v4 v4.78.0 💰
Fancy stream processing made operationally mundane
173 versions - Latest release: 26 days ago - 7,744 stars on GitHub
Top 9.5% on proxy.golang.org
github.com/jitsucom/bulker/ingest
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
Latest release: 18 days ago - 198 stars on GitHub
Top 5.4% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/source/vercel v0.0.0-20230706100108-ff1469b82f6a
The open source ELT framework powered by Apache Arrow
1 version - Latest release: over 2 years ago - 6,245 stars on GitHub
Top 9.4% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/source/notion
The open source ELT framework powered by Apache Arrow
Latest release: 18 days ago - 6,245 stars on GitHub
Top 3.3% on proxy.golang.org
github.com/filecoin-project/bacalhau v1.8.0
Compute over Data framework for public, transparent, and optionally verifiable computation
255 versions - Latest release: 8 months ago - 3 dependent packages - 3 dependent repositories - 237 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/airbnb/airflow v1.8.2
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
1 version - Latest release: over 8 years ago - 33,542 stars on GitHub
Top 9.7% on proxy.golang.org
github.com/bacalhau-project/bacalhau/apps/job-info-consumer/consumer v0.0.0-20240125091916-849dab55b6fd
Community-driven, simple, yet powerful framework for fast, cost-effective distributed Compute ove...
14 versions - Latest release: about 2 years ago - 836 stars on GitHub
Top 6.1% on proxy.golang.org
github.com/hupe1980/dagster-pipes-go v0.0.10
Package dagsterpipes provides tools and utilities for communication and interaction within Dagste...
10 versions - Latest release: about 1 year ago - 0 stars on GitHub
Top 4.1% on proxy.golang.org
github.com/prefecthq/prefect v2.14.6+incompatible
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
1 version - Latest release: about 2 years ago - 20,952 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/feast-dev/feast v0.59.0
The Open Source Feature Store for AI/ML
155 versions - Latest release: 26 days ago - 6,405 stars on GitHub
Top 7.7% on proxy.golang.org
github.com/sudohainguyen/self
Latest release: 21 days ago - 0 stars on GitHub
Top 7.5% on proxy.golang.org
github.com/raptor-ml/raptor/api/proto/gen/go v0.0.0-20230629150952-f7dc91871913
Transform your pythonic research to an artifact that engineers can deploy easily.
1 version - Latest release: over 2 years ago - 1 dependent package - 135 stars on GitHub
Top 9.1% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/destination/motherduck
The open source ELT framework powered by Apache Arrow
Latest release: 22 days ago - 6,245 stars on GitHub
Top 6.1% on proxy.golang.org
github.com/realtimedatalake/rtdl v0.2.0
rtdl makes it easy to build and maintain a real-time data lake
6 versions - Latest release: over 3 years ago - 41 stars on GitHub
Top 4.2% on proxy.golang.org
github.com/apache/incubator-devlake v1.0.2
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragment...
293 versions - Latest release: 8 months ago - 1 dependent package - 2,837 stars on GitHub
Top 7.8% on proxy.golang.org
github.com/apache/incubator-devlake/backend
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragment...
Latest release: 22 days ago - 2,843 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/great-expectations/great_expectations v0.7.10
Always know what to expect from your data.
26 versions - Latest release: over 6 years ago - 10,884 stars on GitHub
Top 5.4% on proxy.golang.org
github.com/treeverse/lakefs/modules/api/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 23 days ago - 4,999 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/argoproj/argo-workflows/v2 v2.12.13
Workflow Engine for Kubernetes
117 versions - Latest release: over 4 years ago - 16,110 stars on GitHub
Top 9.1% on proxy.golang.org
github.com/piyushsingariya/shift v0.0.0-20231015093324-0ecaae21e861
Shift is a high performance better alternative to Airbyte, Singer, Meltano
1 version - Latest release: over 2 years ago - 1 dependent repository - 5 stars on GitHub
Top 5.8% on proxy.golang.org
github.com/treeverse/lakefs/modules/authentication/factory
lakeFS - Data version control for your data lake | Git for data
Latest release: 24 days ago - 4,999 stars on GitHub
Top 9.6% on proxy.golang.org
github.com/bacalhau-project/amplify v1.0.1
Bacalhau Amplify: automatic enrichment, enhancement, and explanation of your data
45 versions - Latest release: almost 3 years ago - 12 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/SETL-Framework/setl v0.4.0
A simple Spark-powered ETL framework that just works 🍺
1 version - Latest release: about 6 years ago - 182 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/gojek/feast/sdk/go v0.9.4
Feature Store for Machine Learning
18 versions - Latest release: over 3 years ago - 4,083 stars on GitHub
github.com/jitsucom/bulker/eventslog v0.0.0-20240119080811-2fb2a0d61cc2
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
18 versions - Latest release: about 2 years ago - 198 stars on GitHub
Top 6.5% on proxy.golang.org
github.com/superstreamlabs/memphis v1.4.4
Memphis.dev is a highly scalable and effortless data streaming platform
28 versions - Latest release: over 1 year ago - 3,398 stars on GitHub
Top 7.7% on proxy.golang.org
github.com/samjbobb/mammoth v1.0.8
Synchronize Postgres to Snowflake in near-real time with one simple application
9 versions - Latest release: about 2 years ago - 10 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/pureml-inc/pureml v0.3.6
Developer platform for production ML.
11 versions - Latest release: almost 3 years ago - 107 stars on GitHub
Top 6.0% on proxy.golang.org
github.com/PureML-Inc/PureML v0.2.3
Developer platform for production ML.
7 versions - Latest release: almost 3 years ago - 107 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/puremlhq/pureml v0.3.6
Developer platform for production ML.
11 versions - Latest release: almost 3 years ago - 107 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/PuremlHQ/PureML v0.3.6
Developer platform for production ML.
11 versions - Latest release: almost 3 years ago - 107 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/quixio/quix-streams v3.23.1+incompatible
Python Streaming DataFrames for Kafka
61 versions - Latest release: 5 months ago - 1,474 stars on GitHub
Top 5.6% on proxy.golang.org
github.com/mlcraft-io/mlcraft v0.1.55 💰
Synmetrix – production-ready open source semantic layer on Cube
56 versions - Latest release: about 1 year ago - 549 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/ericmjl/pyjanitor v0.32.8
Clean APIs for data cleaning. Python implementation of R package Janitor
69 versions - Latest release: about 1 month ago - 1,285 stars on GitHub
Top 6.2% on proxy.golang.org
github.com/feast-dev/feast/infra/feast-operator
The Open Source Feature Store for AI/ML
Latest release: about 1 month ago - 6,405 stars on GitHub
Top 9.6% on proxy.golang.org
github.com/datafold/data-diff v0.11.1
Compare tables within or across databases
59 versions - Latest release: almost 2 years ago - 2,989 stars on GitHub
Top 6.6% on proxy.golang.org
github.com/benthosdev/connect v1.20.4 💰
Fancy stream processing made operationally mundane
295 versions - Latest release: almost 7 years ago - 7,961 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/dreamdata-io/analytics-go v3.1.0+incompatible
Segment analytics client for Go
8 versions - Latest release: over 6 years ago - 0 stars on GitHub
Top 5.7% on proxy.golang.org
github.com/kevin-hanselman/duc v0.4.5
A lightweight CLI tool for versioning data alongside source code and building data pipelines.
10 versions - Latest release: over 1 year ago - 206 stars on GitHub
Top 9.4% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/destination/azblob
The open source ELT framework powered by Apache Arrow
Latest release: about 1 month ago - 6,220 stars on GitHub
Top 5.7% on proxy.golang.org
github.com/quiltdata/quilt v3.2.0+incompatible
Quilt is a data mesh for connecting people with actionable data
6 versions - Latest release: over 5 years ago - 1,349 stars on GitHub
Top 6.5% on proxy.golang.org
github.com/beneath-hq/beneath v1.0.0-rc.5
Beneath is a serverless real-time data platform ⚡️
3 versions - Latest release: almost 5 years ago - 1 dependent repository - 83 stars on GitHub
Top 8.2% on proxy.golang.org
github.com/natun-ai/natun v0.3.3
Transform your pythonic research to an artifact that engineers can deploy easily.
3 versions - Latest release: almost 2 years ago - 64 stars on GitHub
Top 6.7% on proxy.golang.org
github.com/yobulkdev/yobulkdev v0.1.1
🔥 🔥 🔥 Open Source & AI driven Data Onboarding Platform: Free flatfile.com alternative
2 versions - Latest release: almost 3 years ago - 852 stars on GitHub
github.com/djonatans/cloud-data-sync v0.3.0
A Go package for synchronizing data between different cloud storage providers. Supports Google Cl...
2 versions - Latest release: 10 months ago - 0 stars on GitHub
Top 6.1% on proxy.golang.org
github.com/jitsucom/bulker/config-keeper
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake,...
Latest release: about 1 month ago - 198 stars on GitHub
Top 6.2% on proxy.golang.org
github.com/cloudquery/cloudquery/v6 v6.33.1
CloudQuery uses a monorepo approach with a separate Go module per component. Visit our GitHub rep...
77 versions - Latest release: about 1 month ago - 6,220 stars on GitHub
Top 6.2% on proxy.golang.org
github.com/cloudquery/cloudquery/plugins/destination/gcs/v5 v5.4.35
The open source ELT framework powered by Apache Arrow
33 versions - Latest release: about 1 month ago - 6,245 stars on GitHub