spark2014

SPARK 2014 is the new version of SPARK, a software development technology specif

Ada 248 gpl-3.0

24 days ago

spark-connect-go

Apache Spark Connect Client for Golang

Go 162 apache-2.0

14 days ago

awesome-spark

A curated list of awesome Apache Spark packages and resources.

Shell 1722 cc0-1.0

29 days ago

apache-spark awesome pyspark

spark-gotchas

Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks

360 other

7 years ago

apache-spark book guide

spark-avro

Avro Data Source for Apache Spark

Scala 539 apache-2.0

6 years ago

spark-corenlp

Stanford CoreNLP wrapper for Apache Spark

Scala 422 gpl-3.0

6 years ago

spark-csv

CSV Data Source for Apache Spark 1.x

Scala 1053 apache-2.0

6 years ago

spark-sklearn

(Deprecated) Scikit-learn integration package for Apache Spark

Python 1079 apache-2.0

5 years ago

apache-spark grid-search machine-learning

spark-xml

XML data source for Spark SQL and DataFrames

Scala 506 apache-2.0

3 months ago

spark-cassandra-connector

DataStax Connector for Apache Spark to Apache Cassandra

Scala 1943 apache-2.0

3 months ago

cassandra scala spark

spark-cassandra-stress

A tool for testing the DataStax Spark Connector against Apache Cassandra or DSE

Scala 25 apache-2.0

2 years ago

spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers

C# 2025 mit

4 months ago

analytics apache-spark azure

spark

Firely and Incendi's open source FHIR server

C# 254 bsd-3-clause

4 months ago

c-sharp docker dstu2

spark-testing-base

Base classes to use when writing tests with Spark

Scala 1523 apache-2.0

19 days ago

spark

▁▂▃▅▂▇ in your shell.

Shell 6008 mit

3 years ago

shc

The Apache Spark - Apache HBase Connector is a library to support Spark accessin

Scala 552 apache-2.0

4 years ago

dbscan-on-spark

An implementation of DBSCAN running on top of Apache Spark

Scala 183 apache-2.0

7 years ago

spark-nlp

State of the Art Natural Language Processing

Scala 3772 apache-2.0

4 months ago

albert bert entity-extraction

spark.fish

▁▂▄▆▇█▇▆▄▂▁

Shell 346 mit

4 years ago

fish fish-plugin spark

jpmml-evaluator-spark

PMML evaluator library for the Apache Spark cluster computing system (http://spa

Java 94 agpl-3.0

3 years ago

EMR_Spark_Automation

A repository for deploying an AWS EMR cluster and submitting Spark jobs on it. Bo

Python 8 apache-2.0

7 years ago

kotlin-spark-api

This project gives Kotlin bindings and several extensions for Apache Spark. We

Kotlin 461 apache-2.0

5 months ago

bigdata kotlin nullability

TD-Spark

Java 0 mit

3 years ago

spark-connect-csharp

Apache Spark Connect Client for C#

C# 1 apache-2.0

7 months ago

apache connect dotnet

mongo-spark

The MongoDB Spark Connector

Java 712 apache-2.0

3 months ago

connector mongo-spark mongodb

spark-daria

Essential Spark extensions and helper methods ✨😲

Scala 754 mit

28 days ago

dataframe spark

spark-fast-tests

Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and

Scala 436 mit

13 days ago

spark testing-framework

spark-daria

Essential Spark extensions and helper methods ✨😲

Scala 746 mit

3 years ago

dataframe spark

spark-fast-tests

Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and

Scala 421 mit

7 months ago

spark testing-framework

neo4j-spark-connector

Neo4j Connector for Apache Spark, which provides bi-directional read/write acces

Scala 312 apache-2.0

2 months ago

bolt cypher hacktoberfest

neo4j-spark-connector

Neo4j Connector for Apache Spark, which provides bi-directional read/write acces

Scala 313 apache-2.0

8 days ago

bolt cypher hacktoberfest

ruby-spark

Ruby wrapper for Apache Spark

Ruby 227 mit

7 years ago

distributed rdd ruby

spark-orientdb

Apache Spark datasource for OrientDB

Scala 19 other

3 years ago

ob-ada-spark

Emacs Lisp 8 gpl-3.0

2 years ago

docker-spark

Shell 765 apache-2.0

4 years ago

spark-connect-rs

Apache Spark Connect Client for Rust

Rust 90 apache-2.0

20 days ago

grpc-client spark spark-connect

spark-notebook

Interactive and Reactive Data Science using Scala and Spark.

JavaScript 3150 apache-2.0

2 years ago

apache-spark data-science notebook

spark-timeseries

A library for time series analysis on Apache Spark

Scala 1191 apache-2.0

4 years ago

deep-spark

Connecting Apache Spark with different data stores [DEPRECATED]

Java 197 apache-2.0

8 years ago

Spark-MongoDB

Spark library for easy MongoDB access

Scala 307 apache-2.0

8 years ago

frege-spark

Try Apache Spark with Frege

Frege 5

9 years ago

cl-spark

(spark '(1 1 2 3 5 8)) => "▁▁▂▃▅▇"

Common Lisp 95 mit

9 years ago

spark-by-example

SPARK by Example is an adaptation of ACSL by Example for SPARK 2014, a programmi

Ada 152

2 years ago

ada formal-methods formal-specification

spark-riak-connector

The official Riak Spark Connector for Apache Spark with Riak TS and Riak KV

Scala 60 apache-2.0

8 years ago

joblib-spark

Joblib Apache Spark Backend

Python 242 apache-2.0

3 months ago

gatling-sql

Gatling Extension for JDBC or Spark Thrift Server stress tests

Scala 6 apache-2.0

4 years ago

gatling jdbc stress-testing

sparklyr

R interface for Apache Spark

R 956 apache-2.0

24 days ago

apache-spark distributed dplyr

sample-SparkJobserverCassandra

Simple sample job illustrating the use of Spark Jobserver to execute Apache Spar

Scala 2 apache-2.0

9 years ago

netapp-public

delight

A Spark UI and Spark History Server alternative with CPU and Memory metrics! Del

Scala 342 other

6 months ago

apache-spark cpu dashboard

sparks

A typeface for creating sparklines in text without code.

CSS 2084 ofl-1.1

last year

sparkly

Generate sparklines ▁▂▃▅▂▇

JavaScript 424 mit

3 years ago

sparkly-cli

Generate sparklines ▁▂▃▅▂▇

JavaScript 141 mit

3 years ago

sparkly

Helpers & syntactic sugar for PySpark.

Python 60 apache-2.0

last year

pyspark python spark

toolchain

GitHub action to set up an Ada/SPARK dev environment

JavaScript 20 mit

3 years ago

ada_language_server

Server implementing the Microsoft Language Protocol for Ada and SPARK

Ada 235 gpl-3.0

25 days ago

gnatstudio

GNAT Studio is a powerful and lightweight IDE for Ada and SPARK.

Ada 409

24 days ago

twut

An open-source toolkit for analyzing line-oriented JSON Twitter archives with Ap

Scala 9 apache-2.0

2 years ago

apache-spark spark spark-packages

gneiss

Framework for platform-independent SPARK components

Ada 22 agpl-3.0

4 years ago

ada component-based embedded

jwx

JSON/JWK/JWS/JWT/Base64 library in SPARK

Ada 17 agpl-3.0

4 years ago

ada base64 jose

tensorframes

[DEPRECATED] Tensorflow wrapper for DataFrames on Apache Spark

Scala 749 apache-2.0

4 months ago

OpenDL

The Deep Learning training framework on Spark

Java 220 other

9 years ago

magellan

Geo Spatial Data Analytics on Spark

Scala 533 apache-2.0

3 years ago

big-data geojson geometric-algorithms

ArchiveSpark

An Apache Spark framework for easy data processing, extraction as well as deriva

Scala 145 mit

2 months ago

archivespark internet-archive spark

sparkplug

Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌

Scala 28 apache-2.0

5 years ago

datapipeline spark spark-sql

sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Python 1329 other

8 days ago

cluster jupyter jupyter-notebook

photon-ml

A scalable machine learning library on Apache Spark

Terra 792 other

3 years ago

importCSVSparkCassandra

How to import a file from S3 into cassandra using spark

Scala 0

8 years ago

Mobius

C# and F# language binding and extensions to Apache Spark

C# 941 mit

10 months ago

apache-spark bigdata csharp

flintrock

A command-line tool for launching Apache Spark clusters.

Python 638 apache-2.0

5 months ago

apache-spark apache-spark-cluster ec2

first-edition

The book's repo

Scala 273

7 years ago

aas

Code to accompany Advanced Analytics with Spark from O'Reilly Media

Scala 1520 other

2 months ago

sparta

Real Time Analytics and Data Pipelines based on Spark Streaming

Scala 524 apache-2.0

5 years ago

analytics hdfs kafka

benchm-ml

A minimal benchmark for scalability, speed and accuracy of commonly used open so

R 1869 mit

2 years ago

data-science deep-learning gradient-boosting-machine

sparkle

Haskell on Apache Spark.

Haskell 447 bsd-3-clause

2 years ago

analytics apache-spark haskell

flint

A Time Series Library for Apache Spark

Scala 999 apache-2.0

4 years ago

spark timeseries

silex

something to help you spark

Scala 19 apache-2.0

7 years ago

pyspark-stubs

Apache (Py)Spark type annotations (stub files).

Python 115 apache-2.0

2 years ago

apache-spark mypy pep484

geni

A Clojure dataframe library that runs on Spark

Clojure 286 apache-2.0

12 months ago

big-data clojure clojure-library

Clustering4Ever

C4E, a JVM friendly library written in Scala for both local and distributed (Sp

Scala 130 apache-2.0

4 years ago

ai artificial-intelligence big-data

Vegas

The missing MatPlotLib for Scala + Spark

Scala 730 mit

3 years ago

datascience plotting scala

spindle

Next-generation web analytics processing with Scala, Spark, and Parquet.

JavaScript 331 apache-2.0

10 years ago

SparkR-pkg

R frontend for Spark

R 641 apache-2.0

8 years ago

datacompy

Pandas, Polars, and Spark DataFrame comparison for humans and more!

Python 473 apache-2.0

2 months ago

compare dask data

wildfire

🔥From a little spark may burst a flame.

CSS 178 gpl-3.0

6 years ago

comment-plugin comments firebase

streamDM

Stream Data Mining Library for Spark Streaming

Scala 492 apache-2.0

2 years ago

mist

Serverless proxy for Spark cluster

Scala 326 apache-2.0

4 years ago

apache-spark api big-data

osm4scala

Scala and Spark library focused on reading OpenStreetMap Pbf files.

Scala 79 mit

last year

gis openstreetmap openstreetmap-pbf-files

crossdata

DISCONTINUED - Easy access to big things. Library for Apache Spark extending and

Scala 169 apache-2.0

5 years ago

pysparkling

A pure Python implementation of Apache Spark's RDD and DStream interfaces.

Python 260 other

2 years ago

apache-spark data-processing data-science

strong-together

A starter project to build single page Vue.js apps as stand-alone or for Laravel

CSS 89 mit

7 years ago

itachi

A library that brings useful functions from various modern database management s

Scala 56 apache-2.0

last year

hive postgres presto

charmander

Charmander Scheduler Lab - Mesos, Docker, InfluxDB, Spark

Shell 66 mit

8 years ago

deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for dat

Scala 3310 apache-2.0

last month

dataquality scala spark

glow

Glow is an easy-to-use distributed computation system written in Go, similar to

Go 3193

6 years ago

koalas

Koalas: pandas API on Apache Spark

Python 3339 apache-2.0

8 months ago

big-data data-science dataframe

delta

An open-source storage framework that enables building a Lakehouse architecture

Scala 7607 apache-2.0

3 days ago

acid analytics big-data

sample-KafkaSparkCassandra

Introductory sample scala app using Apache Spark Streaming to accept data from K

Scala 23

6 years ago

netapp-public

continuous-verification

SPARK formal verification automated with Travis CI

Ada 9 gpl-3.0

5 years ago

LiFT

The LinkedIn Fairness Toolkit (LiFT) is a Scala/Spark library that enables the m

Scala 168 bsd-2-clause

2 years ago

fairness fairness-ai fairness-ml

SynapseML

Simple and Distributed Machine Learning

Scala 5069 mit

3 days ago

ai apache-spark azure

awesome-ada

A curated list of awesome resources related to the Ada and SPARK programming lan

615 cc0-1.0

4 months ago

ada ada-binding ada-framework

connector-ciscospark

📡 A Cisco Spark connector for opsdroid

Python 1 apache-2.0

5 years ago

RoaringBitmap

A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache

Java 3548 apache-2.0

14 days ago

bitset druid java

streaming-benchmarks

Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache

Jupyter Notebook 633 apache-2.0

11 months ago

benchmarks low-latency streaming

TensorFlowOnSpark

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.

Python 3873 apache-2.0

last year

cluster featured machine-learning

kafka-sparkstreaming-cassandra

Docker container for Kafka - Spark Streaming - Cassandra

Jupyter Notebook 98

5 years ago

neo4j-mazerunner

Mazerunner extends a Neo4j graph database to run scheduled big data graph comput

Java 381 apache-2.0

2 years ago

incubator-livy

Apache Livy is an open source REST interface for interacting with Apache Spark f

Scala 889 apache-2.0

9 days ago

apachelivy bigdata livy

dist-keras

Distributed Deep Learning, with a focus on distributed training, using Keras and

Python 623 gpl-3.0

6 years ago

apache-spark data-parallelism data-science

livy

Livy is an open source REST interface for interacting with Apache Spark from any

Scala 1010

2 years ago

vue-bar

Simple, elegant spark bars for Vue.js

JavaScript 440 mit

3 years ago

bar chart css

sparkling-water

Sparkling Water provides H2O functionality inside Spark cluster

Scala 968 apache-2.0

2 days ago

big-data h2o integration

vue-info-card

Simple and beautiful card component with an elegant spark line, for VueJS.

JavaScript 192 mit

2 years ago

card card-component component

oryx

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large sc

Java 1787 apache-2.0

3 years ago

apache-kafka apache-spark cloudera

vue-trend

🌈 Simple, elegant spark lines for Vue.js

JavaScript 1191 mit

2 years ago

trend trending vue

ydata-profiling

1 Line of code data quality profiling & exploratory data analysis for Pandas and

Python 12270 mit

4 months ago

big-data-analytics data-analysis data-exploration

adam

ADAM is a genomics analysis platform with specialized file formats built using A

Scala 1003 apache-2.0

30 days ago

avro big-data bioinformatics

data-science-ipython-notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras),

Python 27493 other

8 months ago

aws big-data caffe

dev-setup

macOS development environment setup: Easy-to-understand instructions with autom

Python 6130 other

2 years ago

android-development aws bash

scylla-migrator

Migrate data extract using Spark to Scylla, normally from Cassandra/parquet file

Scala 58 apache-2.0

2 months ago

alternator dynamodb migration

clj-kondo

Static analyzer and linter for Clojure code that sparks joy

Clojure 1681 epl-1.0

4 months ago

clojure clojurescript graalvm

pyspark-cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

451 mit

2 years ago

cheat cheatsheet cheatsheets

Kimera-Semantics

Real-Time 3D Semantic Reconstruction from 2D data

C++ 635 bsd-2-clause

12 months ago

3d-reconstruction cpu depth-image

lang

List of 126 languages for Laravel Framework, Laravel Jetstream, Laravel Fortify,

PHP 7499 mit

5 days ago

i18n language laravel

Kimera

Index repo for Kimera code

1840 bsd-2-clause

4 years ago

3d-reconstruction computer-vision robotics

awesome

Description PBS KIDS Games makes learning fun & safe with 250+ educational ga

532 cc0-1.0

2 years ago

awesome awesome-list craft

spark-jobserver

REST job server for Apache Spark

Scala 2839 other

5 months ago

rest-api scala spark
