WebMapReduce in Education

WebMapReduce is geared toward use in the classroom. Free teaching materials for introductory computer science courses are available for several programming languages through the CS in Parallel web site at its Map-reduce Computing page.

What is Map-Reduce?

Map-reduce is a model for writing programs that can easily be made to process data in parallel. It is usually paired with a framework that manages the details of parallelism, such as distributing work and synchronizing shared resources. Because the framework handles these details, map-reduce programs are also relatively easy to write.

In the map-reduce model, work is divided into two phases: a map phase and a reduce phase. The map phase takes a piece of input and performs some operation on it (e.g., extracting a field), and the reduce phase aggregates similar pieces of information that are produced by the map phase (e.g., averaging fields with the same name). These pieces of information are represented by key-value pairs whose content is determined by the program.
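As a rough illustration of the two phases, the following Python sketch averages fields that share a name, mirroring the example above. It is not the WebMapReduce API: the names `mapper`, `reducer`, and `run_map_reduce`, the comma-separated input format, and the sample records are all assumptions made for this sketch, and the small driver runs sequentially in place of a real framework.

```python
from collections import defaultdict


def mapper(record):
    """Map phase: extract a (key, value) pair from one input record.

    Assumes each record looks like "name, number" (hypothetical format).
    """
    name, value = record.split(",")
    yield name.strip(), float(value)


def reducer(key, values):
    """Reduce phase: average all values that share the same key."""
    return key, sum(values) / len(values)


def run_map_reduce(records, mapper, reducer):
    """Sequential stand-in for a framework: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # A real framework would distribute the groups across many reducers.
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


records = ["temp, 20", "humidity, 40", "temp, 30", "humidity, 60"]
result = run_map_reduce(records, mapper, reducer)
print(result)
```

Because the mapper looks at one record at a time and the reducer looks at one key at a time, a framework is free to run many copies of each on different parts of the data.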

Diagram of Map-Reduce

A visualization of the map-reduce process. Many mappers and reducers can work on different parts of the input in parallel, while the map-reduce framework takes care of distributing work.

For more information and examples, see:

Why Map-Reduce?

Map-reduce can serve as an ideal introduction to parallelism for a number of reasons. It avoids issues that are particularly difficult in parallel programming, like deadlock and race conditions. At the same time, it demonstrates many important concepts, such as:

Map-reduce is also used in real-world applications by companies like Google, Yahoo!, Amazon, and Facebook. This, along with its ability to tackle large problems on powerful machines, gives it a certain appeal for students.