hadoop-pcap

Hadoop library to read packet capture (PCAP) files

Java · 203 · lgpl-3.0

11 months ago

hadoop

Apache Hadoop

Java · 14151 · apache-2.0

2 months ago

hadoop

elasticsearch-hadoop

:elephant: Elasticsearch real-time search and analytics natively integrated wit

Java · 1920 · apache-2.0

2 months ago

xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library

C++ · 25350 · apache-2.0

2 months ago

distributed-systems, gbdt, gbm

luigi

Luigi is a Python module that helps you build complex pipelines of batch jobs. I

Python · 17120 · apache-2.0

2 months ago

hadoop, luigi, orchestration-framework

seaweedfs

SeaweedFS is a fast distributed storage system for blobs, objects, files, and da

Go · 21088 · apache-2.0

yesterday

blob-storage, cloud-drive, distributed-file-system

data-science-ipython-notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras),

Python · 25938 · other

7 months ago

aws, big-data, caffe

zsh-hadoop-plugin

Useful aliases for Hadoop

Shell · 0

7 years ago

mongo-hadoop

MongoDB Connector for Hadoop

Java · 1522

2 years ago

spatial-framework-for-hadoop

The Spatial Framework for Hadoop allows developers and data scientists to use th

Java · 359 · apache-2.0

last year

data-management, spatial-analysis

awesome-hadoop

A curated list of amazingly awesome Hadoop and Hadoop ecosystem resources

1071

2 years ago

gis-tools-for-hadoop

The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of b

509 · apache-2.0

2 years ago

hadoop, spatial-analysis

pangool

Tuple MapReduce for Hadoop: Hadoop API made easy

Java · 56 · apache-2.0

2 years ago

parkour

Hadoop MapReduce in idiomatic Clojure.

Clojure · 257 · apache-2.0

8 years ago

binarypig

Scalable Binary Data Extraction in Hadoop

JavaScript · 142 · apache-2.0

10 years ago

HadoopConcatGz

A Splittable Hadoop InputFormat for Concatenated GZIP Files and *.(w)arc.gz

Java · 9

6 years ago

hadoop, spark, warc

bigfatlm

Hadoop MapReduce training of modified Kneser-Ney smoothed language models

Java · 30 · lgpl-3.0

6 years ago

crunch

A fast to develop, fast to run, Go based toolkit for ETL and feature extraction

Go · 213

9 years ago

akela

A bunch of utility classes for Java, Hadoop, HBase, Pig, etc.

Java · 76 · apache-2.0

10 years ago

elephantdb

Distributed database specialized in exporting key/value data from Hadoop

Java · 555 · bsd-3-clause

10 years ago

DuctileDB

Ductile DB is a graph database based on Hadoop/HBase which provides a vast set o

Java · 13 · apache-2.0

6 years ago

daudit

🌲 Configuration flaws detector for Hadoop, MongoDB, MySQL, and more!

Python · 104 · mit

4 years ago

auditing, bigdata, hadoop-spark

elephant-bird

Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and H

Java · 1137 · apache-2.0

last year

vessel

Elixir MapReduce interfaces with Hadoop Streaming integration

Elixir · 23 · mit

3 years ago

mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services

Python · 2609 · other

last year

glow

Glow is an easy-to-use distributed computation system written in Go, similar to

Go · 3179

5 years ago

webarchive-indexing

Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.

Python · 40 · mit

6 years ago

netlytics

NetLytics is a Hadoop-powered framework for performing advanced analytics on var

Python · 10 · gpl-3.0

6 years ago

opensoc

OpenSOC Apache Hadoop Code

573 · apache-2.0

4 years ago

Impala

Real-time Query for Hadoop; mirror of Apache Impala

C++ · 30 · apache-2.0

last year

schedoscope

Schedoscope is a scheduling framework for painfree agile development, testing, (

Scala · 95 · apache-2.0

4 years ago

dev-setup

macOS development environment setup: Easy-to-understand instructions with autom

Python · 6014 · other

last year

android-development, aws, bash

hindex

Secondary Index for HBase

Java · 589 · apache-2.0

7 years ago