gobblin

A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.

License

Apache License 2.0

Creator

apache

Related apps

incubator-pagespeed-mod

Apache module for rewriting web pages to reduce latency and bandwidth.

C++ · 697 · apache-2.0

2 years ago

pagespeed

shardingsphere

Distributed SQL transaction & query engine for data sharding, scaling, encryptio

Java · 19846 · apache-2.0

last month

bigdata, database, database-cluster

cassandra

Apache Cassandra®

Java · 8772 · apache-2.0

last month

cassandra, database, java

couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed

Erlang · 6205 · apache-2.0

last month

big-data, cloud, content

datasketches-java

A software library of stochastic streaming algorithms, a.k.a. sketches.

Java · 894 · apache-2.0

8 days ago

datasketches

fury

A blazingly fast multi-language serialization framework powered by JIT and zero-

Java · 3068 · apache-2.0

15 days ago

compression, cpp, cross-language

hudi

Upserts, Deletes And Incremental Processing on Big Data.

Java · 5419 · apache-2.0

19 hours ago

apacheflink, apachehudi, apachespark

incubator-baremaps

Create custom vector tiles from OpenStreetMap and other data sources with Postgi

Java · 498 · apache-2.0

3 months ago

java, mapbox, openstreetmap

incubator-hugegraph

A graph database that supports more than 100+ billion data, high performance and

Java · 2632 · apache-2.0

last month

big-data, database, graph

incubator-stormcrawler

A scalable, mature and versatile web crawler based on Apache Storm

Java · 889 · apache-2.0

2 days ago

apache-storm, crawler, distributed

mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, M

C++ · 20778 · apache-2.0

last year

mxnet

pekko

Build highly concurrent, distributed, and resilient message-driven applications

Scala · 1216 · apache-2.0

2 days ago

actor-model, cloud-native, concurrency

sedona

A cluster computing framework for processing large-scale geospatial data

Java · 1955 · apache-2.0

5 days ago

cluster-computing, geospatial, java

skywalking-rover

Monitor and profiler powered by eBPF to monitor network traffic, and diagnose CP

Go · 190 · apache-2.0

3 months ago

apm, ebpf, network

systemds

An open source ML system for the end-to-end data science lifecycle

Java · 1034 · apache-2.0

4 days ago

dml, java, python

trafficserver

Apache Traffic Server™ is a fast, scalable and extensible HTTP/1.1 and HTTP/2 co

C++ · 1783 · apache-2.0

3 months ago

apache, cache, cdn

zeppelin

Web-based notebook that enables data-driven, interactive data analytics and coll

Java · 6337 · apache-2.0

4 months ago

big-data, database, flink

apex-core

Mirror of Apache Apex core

Java · 350 · apache-2.0

3 years ago

apex, big-data, java

cassandra-gocql-driver

GoCQL Driver for Apache Cassandra®

Go · 2573 · apache-2.0

last month

cassandra, client, database

cassandra-java-driver

Java Driver for Apache Cassandra®

Java · 1352 · apache-2.0

2 months ago