PRESENTED BY
DHARANI.S
16CSEA62
What is Map Reduce?
Map Reduce is a massively parallel technique for processing data. Maps are
the individual tasks that transform input records into intermediate records.
A MapReduce program executes in three stages:
1. Map stage
2. Shuffle stage
3. Reduce stage
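The three stages can be sketched in a few lines of plain Python. This is a minimal single-process illustration, not how a cluster framework implements them; the word-count task and the function names `map_stage`, `shuffle_stage` and `reduce_stage` are assumptions chosen for the example.

```python
from collections import defaultdict


def map_stage(records):
    """Map: turn each input line into intermediate (word, 1) pairs."""
    for line in records:
        for word in line.lower().split():
            yield word, 1


def shuffle_stage(pairs):
    """Shuffle: group the intermediate values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_stage(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical input records, one line of text each.
lines = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_stage(shuffle_stage(map_stage(lines)))
print(counts)
```

Each stage only consumes the output of the previous one, which is what lets a real framework run many map and reduce tasks on different machines.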
MAP REDUCE OVERVIEW
• Map: a user-defined function that takes a series of key-value pairs and
processes each one to generate zero or more key-value pairs.
• Shuffle and Sort: shuffling is the process of transferring the intermediate
outputs of the map tasks to the reducers that need them. The intermediate
pairs are also sorted by key before they are reduced.
• Reduce: reduces a set of intermediate values that share a key to a smaller set
of values. All of the values with the same key are presented together to a
single reducer.
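One common way to guarantee that all values for a key reach the same reducer is to partition by a hash of the key. The sketch below assumes that approach; `NUM_REDUCERS`, `partition`, `shuffle`, `reduce_max` and the city temperature data are hypothetical names and values used only for illustration.

```python
import zlib
from collections import defaultdict

NUM_REDUCERS = 3  # hypothetical number of reduce tasks


def partition(key, num_reducers=NUM_REDUCERS):
    """Pick the reducer that owns a key (CRC32 is stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers=NUM_REDUCERS):
    """Route every (key, value) pair to the reducer that owns its key."""
    reducers = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in map_outputs:
        for key, value in pairs:
            reducers[partition(key, num_reducers)][key].append(value)
    # Sort step: each reducer sees its keys in sorted order.
    return [dict(sorted(r.items())) for r in reducers]


def reduce_max(key, values):
    """Reduce: shrink the values for one key to a single value."""
    return key, max(values)


# Output of two hypothetical map tasks: (city, temperature) pairs.
map_outputs = [
    [("chennai", 31), ("delhi", 38)],
    [("delhi", 41), ("mumbai", 33), ("chennai", 35)],
]
shuffled = shuffle(map_outputs)
results = dict(reduce_max(k, v) for r in shuffled for k, v in r.items())
print(results)
```

Because the partition depends only on the key, the two "delhi" pairs produced by different map tasks end up in the same reducer's input.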
Why Map Reduce?
• Large-scale data processing was difficult!
• Managing hundreds of thousands of processes
• Managing parallelization and distribution
• Reliable execution with easy data access
Map Reduce provides all of these easily.
Why Map Reduce?
• Traditional enterprise systems normally have a centralized server to store and
process data. The following illustration depicts a schematic view of a traditional
enterprise system. The traditional model is not suitable for processing huge,
growing volumes of data, which standard database servers cannot accommodate.
Moreover, the centralized system becomes a bottleneck when processing
multiple files simultaneously.
• Google solved this bottleneck using a programming model called Map Reduce. Map
Reduce divides a task into small parts and assigns them to many computers. Later,
the results are collected in one place and integrated to form the result dataset.
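The divide, distribute and collect pattern described above can be imitated on one machine. In this sketch a thread pool stands in for the "many computers", which is an assumption made so the example stays self-contained; `split`, `count_words` and `run` are hypothetical helper names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, parts):
    """Divide the task: cut the input into roughly equal chunks."""
    size = max(1, -(-len(records) // parts))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_words(chunk):
    """Work done by one worker on its own chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def run(records, workers=4):
    """Assign chunks to workers, then merge the partial results."""
    chunks = split(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # collect and integrate in one place
    return total


records = ["a b a", "b c", "a", "c c c", "b"]
total = run(records, workers=3)
print(total)
```

No worker needs to see another worker's chunk, so adding workers does not change the program, only how the input is split.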
ADVANTAGES
• Scalability
• Cost-effective solution
• Flexibility
• Fast
• Security and Authentication
• Parallel processing
DISADVANTAGES
• It is not always easy to express every computation as an MR program.
• It performs poorly when processing requires a lot of data to be shuffled
over the network.
• It is a poor fit for streaming data. MR is best suited to batch processing
of huge amounts of data that you already have.
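The shuffle cost can be made concrete by counting how many intermediate pairs would have to cross the network. The sketch below compares raw map output with map output that is pre-aggregated on each map task. That pre-aggregation step (often called a combiner) is not covered in these slides and is introduced here only as an illustration; `map_words`, `combine` and the input splits are hypothetical.

```python
from collections import Counter


def map_words(line):
    """Map: emit one (word, 1) pair per word."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Pre-aggregate pairs locally, before anything is shuffled."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


# Two hypothetical map tasks, each with its own input split.
splits = [["to be or not to be", "to be"], ["be to be"]]

raw_pairs = []       # what would be shuffled without local aggregation
combined_pairs = []  # what would be shuffled with local aggregation
for lines in splits:
    task_output = [pair for line in lines for pair in map_words(line)]
    raw_pairs.extend(task_output)
    combined_pairs.extend(combine(task_output))


def totals(pairs):
    """Final reduce: sum the values for each key."""
    result = Counter()
    for key, value in pairs:
        result[key] += value
    return dict(result)


print(len(raw_pairs), len(combined_pairs))
```

Both variants produce the same final counts, but the second sends fewer pairs through the shuffle. Jobs whose map output cannot be shrunk this way pay the full network cost.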
CONCLUSION
• Map Reduce provides a simple way to scale your application.
• It scales from a single machine to thousands of machines.
• The Map Reduce programming model has been used successfully
at Google for several completely different purposes.

MAP REDUCE SLIDESHARE
