The document introduces MapReduce, a framework motivated by the need to simplify large-scale data processing across distributed systems. It outlines the programming model and its main features, including automatic parallelization, fault tolerance, and data locality. It also walks through a detailed example, counting letter frequencies in a large file, to illustrate how MapReduce works.
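
The letter-frequency example can be sketched in a few lines of Python. This is a minimal single-process illustration of the map, shuffle, and reduce phases, not the document's own code and not a distributed implementation. The function names (`map_letters`, `shuffle`, `reduce_counts`, `letter_frequencies`) and the choice to lowercase letters and skip non-alphabetic characters are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_letters(chunk):
    """Map phase: emit a (letter, 1) pair for every alphabetic character.

    Assumption: letters are case-folded and everything else is ignored.
    """
    return [(ch.lower(), 1) for ch in chunk if ch.isalpha()]


def shuffle(pairs):
    """Shuffle phase: group all values by key.

    A real framework does this across machines; here it is an in-memory dict.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: sum the partial counts collected for one letter."""
    return key, sum(values)


def letter_frequencies(chunks):
    """Run map over each chunk, shuffle the pairs, then reduce each group."""
    mapped = chain.from_iterable(map_letters(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Each string stands in for one split of the large input file.
counts = letter_frequencies(["Hello", "World"])
```

Because each call to `map_letters` depends only on its own chunk, the map calls could run on different machines in any order, which is what makes automatic parallelization possible. Likewise, a failed map task can simply be rerun on its chunk without affecting the others.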