Apache Spark, Flink batch, and modern dataflow architectures
Understanding the MapReduce programming model for big data