MAP REDUCE SLIDESHARE

PRESENTED BY
DHARANI.S
16CSEA62

What is Map Reduce?
Map Reduce is a massively parallel technique for processing data. Maps are
the individual tasks that transform input records into intermediate records.
A Map Reduce program executes in three stages:
1. Map stage
2. Shuffle stage
3. Reduce stage

What is Map Reduce? (cont.)
• Map: a user-defined function that takes a series of key-value pairs and
processes each one of them to generate zero or more key-value pairs.
• Shuffle and Sort: the process of transferring the intermediate outputs from the
map tasks to the reducers that need them is known as shuffling.
• Reduce: reduces a set of intermediate values that share a key to a smaller set
of values. All of the values with the same key are presented to a single reducer
together.
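
The three stages can be illustrated with a minimal single-process sketch of a word count. This is not Hadoop or Google code; the names map_fn, shuffle, reduce_fn and run_job are hypothetical, and a real framework would run the map and reduce tasks on many machines.

```python
from collections import defaultdict


def map_fn(_key, line):
    """Map: emit zero or more (word, 1) pairs for one input record."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle and sort: group intermediate values by key, in key order."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_fn(key, values):
    """Reduce: collapse all values that share a key into a single value."""
    return key, sum(values)


def run_job(records):
    # Map every (key, value) input record to intermediate pairs.
    intermediate = [pair for key, line in records for pair in map_fn(key, line)]
    # Each key reaches reduce_fn exactly once, with all of its values.
    return [reduce_fn(key, values) for key, values in shuffle(intermediate)]


# Input records are (line number, line text) pairs; the empty line emits nothing.
records = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "")]
word_counts = run_job(records)
```

Here the word "the" appears in two input records, so both of its intermediate values arrive at the same reduce call and are summed there.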

Why Map Reduce?
• Large-scale data processing was difficult.
• Managing hundreds of thousands of processes
• Managing parallelization and distribution
• Reliable execution with easy data access
Map Reduce provides all of these.

Why Map Reduce?
• Traditional enterprise systems normally have a centralized server to store and
process data. The following illustration depicts a schematic view of a traditional
enterprise system. The traditional model is not suitable for processing huge,
growing volumes of data, which standard database servers cannot accommodate.
Moreover, the centralized system becomes a bottleneck when processing multiple
files simultaneously.
• Google solved this bottleneck issue using a programming model called Map Reduce.
Map Reduce divides a task into small parts and assigns them to many computers.
Later, the results are collected in one place and integrated to form the result
dataset.
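
The divide, assign and collect pattern described above can be sketched on a single machine. This is only an illustration under stated assumptions: worker threads stand in for the "many computers", and split, process_chunk and run are hypothetical names, not part of any Map Reduce framework.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide the task into roughly equal chunks."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_chunk(chunk):
    """Work done independently by one worker: a partial sum of squares."""
    return sum(x * x for x in chunk)


def run(data, workers=4):
    chunks = split(data, workers)
    # Assign each chunk to a worker; threads stand in for separate machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    # Collect the partial results in one place and integrate them.
    return sum(partials)


total = run(list(range(1, 101)))
```

Because each chunk is processed without reference to any other, the same structure carries over to workers on different machines; only the final integration step needs all partial results.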

ADVANTAGES
• Scalability
• Cost-effective solution
• Flexibility
• Fast
• Security and Authentication
• Parallel processing

DISADVANTAGES
• It is not always easy to express every computation as an MR program.
• It performs poorly when processing requires a lot of data to be shuffled
over the network.
• It is a poor fit for streaming data. MR is best suited to batch processing
of huge amounts of data that you already have.

CONCLUSION
• Map Reduce provides a simple way to scale your application.
• Applications can scale from a single machine to thousands of machines.
• The Map Reduce programming model has been used successfully
at Google for many different purposes.

