An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)
License
MIT License
Related apps
ArchiveSpark
An Apache Spark framework for easy data processing, extraction as well as deriva
Scala · 145 · MIT
2 months ago
archivespark, internet-archive, spark
HadoopConcatGz
A Splittable Hadoop InputFormat for Concatenated GZIP Files and *.(w)arc.gz
Java · 9
7 years ago
hadoop, spark, warc
WarcPartitioner
Partition (W)ARC Files by MIME Type and Year
Java · 1 · MIT
8 years ago
hadoop, warc, web-archiving