elasticsearch-hadoop
:elephant: Elasticsearch real-time search and analytics natively integrated wit
Java · 10 · apache-2.0
yesterday
spatial-framework-for-hadoop
The Spatial Framework for Hadoop allows developers and data scientists to use th
Java · 363 · apache-2.0
9 months ago
data-management, spatial-analysis
hiho
Hadoop Data Integration with various databases, ftp servers, salesforce. Increme
Java · 91 · apache-2.0
12 years ago
awesome-hadoop
A curated list of amazingly awesome Hadoop and Hadoop ecosystem resources
1078
7 months ago
gis-tools-for-hadoop
The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of b
515 · apache-2.0
3 years ago
hadoop, spatial-analysis
HadoopConcatGz
A Splittable Hadoop InputFormat for Concatenated GZIP Files and *.(w)arc.gz
Java · 9
7 years ago
hadoop, spark, warc
bigfatlm
Hadoop MapReduce training of modified Kneser-Ney smoothed language models
Java · 30 · lgpl-3.0
6 years ago
crunch
A fast to develop, fast to run, Go based toolkit for ETL and feature extraction
Go · 214
10 years ago
elephantdb
Distributed database specialized in exporting key/value data from Hadoop
Java · 558 · bsd-3-clause
10 years ago
DuctileDB
Ductile DB is a graph database based on Hadoop/HBase which provides a vast set o
Java · 13 · apache-2.0
7 years ago
daudit
🌲 Configuration flaws detector for Hadoop, MongoDB, MySQL, and more!
Python · 105 · mit
4 years ago
auditing, bigdata, hadoop-spark
elephant-bird
Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and H
Java · 1138 · apache-2.0
2 years ago
glow
Glow is an easy-to-use distributed computation system written in Go, similar to
Go · 3193
6 years ago
webarchive-indexing
Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.
Python · 41 · mit
7 years ago
netlytics
NetLytics is a Hadoop-powered framework for performing advanced analytics on var
Python · 9 · gpl-3.0
7 years ago
schedoscope
Schedoscope is a scheduling framework for painfree agile development, testing, (
Scala · 96 · apache-2.0
5 years ago
dev-setup
macOS development environment setup: Easy-to-understand instructions with autom
Python · 6130 · other
2 years ago
android-development, aws, bash
seaweedfs
SeaweedFS is a fast distributed storage system for blobs, objects, files, and da
Go · 22976 · apache-2.0
3 days ago
blob-storage, cloud-drive, distributed-file-system
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras),
Python · 27493 · other
8 months ago
aws, big-data, caffe