Timilearning

MIT 6.824: Lecture 1 - MapReduce

Background

I started a study group with some of my friends where we'll be going through this course. Over the next couple of weeks, I intend to upload my notes from studying each week's material.


MapReduce

This week's material focused on the MapReduce paradigm for data processing. The material included the seminal MapReduce paper by Jeff Dean and Sanjay Ghemawat, and an accompanying video lecture. Below are my notes from the materials and group discussion.


MapReduce is a system for parallelizing computation over a large volume of data across multiple machines in a cluster. It achieves this by exposing a simple API for expressing these computations using two operations: Map and Reduce.

The Map task takes an input file and outputs a set of intermediate (key, value) pairs. The intermediate values with the same key are then grouped together and processed in the Reduce task for each distinct key.
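To make the grouping step concrete, here is a minimal single-process sketch of what happens between Map and Reduce. The function name `group_by_key` and the sample pairs are my own illustration and not something from the paper; a real system does this grouping across machines.

```python
from collections import defaultdict

def group_by_key(pairs):
    """Collect the values of all intermediate pairs that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)

# Hypothetical intermediate output from some Map tasks.
intermediate = [("a", 1), ("b", 1), ("a", 1)]
print(group_by_key(intermediate))  # {'a': [1, 1], 'b': [1]}
```

The Reduce function is then called once per key with that key's list of values.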

Some examples of programs that can be expressed as MapReduce computations are:

  1. Word Count in Documents: Here, the map function can emit a (word, 1) pair for each occurrence of a word. The reduce function can then add all the counts for the same word and emit a (word, total count) pair.
  2. Distributed Grep: Grep is a regular expression search for a given pattern in a text document. To search across a large volume of documents, we could define a map function which emits a (pattern, line) pair for each line that matches the supplied pattern. The reduce function then outputs all the lines from the intermediate values for the given key unchanged.
  3. Distributed Sort: We can have a map function which extracts the key from each record and emits a (key, record) pair. Depending on the partitioning and ordering scheme, we can then have a reduce function that emits all the pairs unchanged. We'll go into more detail on the ordering scheme later on.
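The three examples above can be sketched in a few lines of Python. This is a toy single-process version under my own assumptions: `run_mapreduce`, the function names, and the sample inputs are all hypothetical, and the sort example assumes records of the form "key,payload".

```python
import re
from collections import defaultdict

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Toy driver: map every input, group by key, reduce keys in sorted order."""
    groups = defaultdict(list)
    for name, contents in inputs.items():
        for key, value in map_fn(name, contents):
            groups[key].append(value)
    return [(key, reduce_fn(key, groups[key])) for key in sorted(groups)]

# 1. Word count: emit (word, 1), then sum the counts per word.
def wc_map(name, contents):
    for word in contents.split():
        yield word, 1

def wc_reduce(word, counts):
    return sum(counts)

# 2. Grep: emit (pattern, line) for matching lines; reduce passes them through.
def make_grep_map(pattern):
    regex = re.compile(pattern)
    def grep_map(name, contents):
        for line in contents.splitlines():
            if regex.search(line):
                yield pattern, line
    return grep_map

def identity_reduce(key, values):
    return values

# 3. Sort: emit (key, record); the driver's key ordering does the sorting.
def sort_map(name, contents):
    for record in contents.splitlines():
        yield record.split(",", 1)[0], record

text = {"a.txt": "the cat sat\nthe dog ran", "b.txt": "a cat ran"}
records = {"part-0": "fox,red\nant,black", "part-1": "cat,grey"}

print(run_mapreduce(text, wc_map, wc_reduce))
# [('a', 1), ('cat', 2), ('dog', 1), ('ran', 2), ('sat', 1), ('the', 2)]
print(run_mapreduce(text, make_grep_map("cat"), identity_reduce))
# [('cat', ['the cat sat', 'a cat ran'])]
print(run_mapreduce(records, sort_map, identity_reduce))
# [('ant', ['ant,black']), ('cat', ['cat,grey']), ('fox', ['fox,red'])]
```

Note that the sort example only works here because the toy driver reduces keys in sorted order; in a real cluster the result also depends on how keys are partitioned across reduce tasks.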

Implementation Details

The MapReduce interface can be implemented in many ways, so this section only covers the implementation used at Google at the time the paper was written.

The Map function invocations are distributed across multiple machines by automatically partitioning the input data into a set of M splits. The Reduce invocations are split into R pieces based on a partitioning function defined on the intermediate key.
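As a rough sketch of these two kinds of partitioning, the snippet below splits a list of records into M pieces and assigns intermediate keys to one of R reduce tasks using a hash of the key modulo R, which is the default the paper describes. The helper names are hypothetical, splitting by record count is my simplification (the real system splits files by size), and I use `zlib.crc32` only because Python's built-in `hash()` for strings is not stable across processes.

```python
import zlib

def split_input(records, m):
    """Divide records into at most m contiguous splits of near-equal size."""
    if not records:
        return []
    size = -(-len(records) // m)  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]

def partition(key, r):
    """Pick the reduce task for a key: hash(key) mod R."""
    return zlib.crc32(key.encode("utf-8")) % r

def bucket_pairs(pairs, r):
    """Place each intermediate (key, value) pair in its reduce task's bucket."""
    buckets = [[] for _ in range(r)]
    for key, value in pairs:
        buckets[partition(key, r)].append((key, value))
    return buckets

splits = split_input(list(range(10)), 3)
print(splits)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

pairs = [("cat", 1), ("dog", 1), ("cat", 1), ("ant", 1)]
buckets = bucket_pairs(pairs, 2)
# Every pair with the same key lands in the same bucket.
```

Because the partition depends only on the key, all values for a given key end up at the same reduce task no matter which map task produced them.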

A sample MapReduce Job

The flow of execution when the MapReduce function is called by a user is as follows:

The output of a MapReduce job is a set of R files (one per reduce task).

Dealing with Faults

Dealing with Network Resource Scarcity

Their implementation at the time of writing the paper used locality as a means of conserving network bandwidth. This means that the input files were kept close to where they would be processed, to avoid transferring these large files over the network. The Master's scheduling algorithm took the location of each input file into account when deciding which worker should process which input split.

Note that the lecture video for the week explained that as Google's networking infrastructure was expanded and upgraded in later years, they relied less on this locality optimization.
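Here is a small sketch of what a locality-aware choice could look like. It follows the order of preference described in the paper (a machine holding a replica of the input, then a machine near one, then any idle machine), but the function `pick_worker`, the `rack_of` mapping, and the worker names are all my own hypothetical stand-ins, not the actual scheduler.

```python
def pick_worker(replica_hosts, idle_workers, rack_of):
    """Choose an idle worker for a map task, preferring data locality.

    replica_hosts: set of machines storing a copy of the input split.
    idle_workers:  list of machines currently free to run a task.
    rack_of:       dict mapping machine name -> rack (network switch) name.
    """
    # First choice: a worker that already stores the input split.
    for worker in idle_workers:
        if worker in replica_hosts:
            return worker
    # Second choice: a worker on the same rack as some replica.
    replica_racks = {rack_of[h] for h in replica_hosts if h in rack_of}
    for worker in idle_workers:
        if rack_of.get(worker) in replica_racks:
            return worker
    # Otherwise any idle worker; the split must cross the network.
    return idle_workers[0] if idle_workers else None

rack_of = {"w1": "r1", "w2": "r1", "w3": "r2", "w4": "r3"}
print(pick_worker({"w2", "w3"}, ["w1", "w2"], rack_of))  # w2 (holds a replica)
print(pick_worker({"w2"}, ["w4", "w1"], rack_of))        # w1 (same rack as w2)
print(pick_worker({"w3"}, ["w4", "w1"], rack_of))        # w4 (no locality available)
```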

Dealing with Stragglers

Stragglers are machines that take an unusually long time to complete one of the last few map or reduce tasks. They addressed this by having the master schedule backup executions of the remaining in-progress tasks when the computation is close to completion. A task is then marked as completed when either the primary or the backup execution completes.
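The bookkeeping for backup tasks can be sketched as follows. This is only my own simplified model: the task dictionary layout, the function names, and the 0.8 completion threshold are assumptions for illustration (the paper only says backups are scheduled when the operation is close to completion).

```python
def schedule_backups(tasks, idle_workers, threshold=0.8):
    """Assign idle workers as backups for in-progress tasks near the end.

    tasks: dict of task_id -> {"done": bool, "workers": [worker names]}.
    Returns the list of (task_id, worker) backup assignments made.
    """
    done = sum(1 for t in tasks.values() if t["done"])
    if not tasks or done / len(tasks) < threshold:
        return []  # too early to spend resources on duplicates
    idle = list(idle_workers)
    assignments = []
    for task_id, task in tasks.items():
        if not idle:
            break
        # Back up each unfinished task at most once.
        if not task["done"] and len(task["workers"]) < 2:
            worker = idle.pop(0)
            task["workers"].append(worker)
            assignments.append((task_id, worker))
    return assignments

def report_completion(tasks, task_id, worker):
    """Mark a task done; the first report wins and later ones are ignored."""
    task = tasks[task_id]
    if task["done"]:
        return False
    task["done"] = True
    task["completed_by"] = worker
    return True

# Four of five tasks are done; task 4 is stuck on a slow machine "w4".
tasks = {i: {"done": i < 4, "workers": [f"w{i}"]} for i in range(5)}
print(schedule_backups(tasks, ["w0", "w1"]))  # [(4, 'w0')]
print(report_completion(tasks, 4, "w0"))      # True: the backup finished first
print(report_completion(tasks, 4, "w4"))      # False: the straggler's result is ignored
```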

Some Other Interesting Features

Conclusion

Though no longer in use at Google for a number of reasons, MapReduce fundamentally changed the way large-scale data processing architectures are built. It abstracted away the complexity of dealing with parallelism, fault tolerance and load balancing by exposing a simple API that allowed programmers with no experience of these systems to distribute the processing of large datasets across a cluster of computers.


