© 2013 IBM Corporation
BigData processing in the cloud – Guest Lecture -
University of Applied Sciences Rapperswil - 29.4.14
Romeo Kienzler
IBM Innovation Center
Source: http://res.sys-con.com/story/oct12/2398990/Cloud_BigData_468.jpg
What is BIG data?
Big Data
Hadoop
Business Intelligence
Data Warehouse
Map-Reduce → Hadoop → BigInsights
BigData UseCases
● Google Index
  ● 40 X 10^9 = 40,000,000,000 => 40 billion pages indexed
  ● Will break 100 PB barrier soon
  ● Derived from MapReduce
  ● now “Caffeine” based on “Percolator”
  ● Incremental vs. batch
  ● In-Memory vs. disk
BigData UseCases
● CERN LHC
  ● 25 petabytes per year
● Facebook
  ● Hive Data Warehouse
  ● 300 PB, growing 600 TB / d
  ● > 100 k servers
● Genomics
● Enterprises
  ● Data center analytics (Logfiles, OS/NW monitors, ...)
  ● Predictive Maintenance, Cybersecurity
  ● Social Media Analytics
  ● DWH offload
  ● Call Detail Record (CDR) data preservation
    http://www.balthasar-glaettli.ch/vorratsdaten/
BigData Analytics
BigData Analytics – Predictive Analytics
"sometimes it's not who has the best algorithm that wins; it's who has the most data."
(C) Google Inc.
The Unreasonable Effectiveness of Data¹
¹http://www.csee.wvu.edu/~gidoretto/courses/2011-fall-cp/reading/TheUnreasonable%20EffectivenessofData_IEEE_IS2009.pdf
No Sampling => Work with full dataset => No p-Value/z-Scores anymore
Data Parallelism
Aggregated Bandwidth between CPU, Main Memory and Hard Drive
Scanning 1 TB (at 10 GByte/s per node)
- 1 Node - 100 sec
- 10 Nodes - 10 sec
- 100 Nodes - 1 sec
- 1000 Nodes - 100 msec
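The scan times above follow from dividing the data volume by the aggregated bandwidth. A minimal Python sketch, assuming each node reads its share at 10 GByte/s, the data is spread evenly, and coordination overhead is ignored (the function name `scan_time_seconds` is illustrative and not from the lecture):

```python
def scan_time_seconds(data_gb: float, nodes: int, per_node_gb_per_s: float = 10.0) -> float:
    """Time to scan data_gb when it is split evenly across nodes.

    Assumes perfect data parallelism: aggregated bandwidth grows
    linearly with the node count and there is no coordination overhead.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    aggregated_bandwidth = nodes * per_node_gb_per_s  # GByte/s
    return data_gb / aggregated_bandwidth


if __name__ == "__main__":
    one_tb_in_gb = 1000.0  # decimal units, as on the slide
    for nodes in (1, 10, 100, 1000):
        print(f"{nodes:>4} nodes: {scan_time_seconds(one_tb_in_gb, nodes):g} s")
```

Real clusters fall short of this ideal because of skewed data placement, network transfers and stragglers, so the figures are best read as a lower bound.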
Fault Tolerance / Commodity Hardware
AMD Turion II Neo N40L (2x 1.5 GHz / 2 MB / 15 W), 8 GB RAM,
3 TB SEAGATE Barracuda 7200.14
< CHF 500
→ 100 K => 200 X (2, 4, 3) => 400 Cores, 1.6 TB RAM, 200 TB HD
→ MTBF ~ 365 d > 1.5 d
Source: http://www.cloudcomputingpatterns.org/Watchdog
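Why a cluster of commodity machines has to tolerate faults can be estimated with a back-of-envelope calculation. The sketch below reads "100 K" as a budget of CHF 100,000 and assumes independent node failures at a constant rate, so the expected time between failures anywhere in the cluster is the per-node MTBF divided by the node count. Both the failure model and the function name are assumptions, not taken from the slide. For 200 nodes with an MTBF of about 365 days each, this model gives roughly 1.8 days, the same order of magnitude as the figure quoted above.

```python
def cluster_mtbf_days(node_mtbf_days: float, nodes: int) -> float:
    """Expected time between failures somewhere in the cluster.

    Assumes independent failures with a constant rate per node, so
    failure rates add up: cluster_rate = nodes / node_mtbf_days.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return node_mtbf_days / nodes


if __name__ == "__main__":
    budget_chf = 100_000        # assumed reading of "100 K"
    price_per_node_chf = 500    # upper bound quoted on the slide
    nodes = budget_chf // price_per_node_chf
    # 2 cores and 8 GB RAM per node, as in the hardware line above
    print(f"{nodes} nodes, {nodes * 2} cores, {nodes * 8 / 1000} TB RAM")
    print(f"one failure about every {cluster_mtbf_days(365, nodes):.1f} days")
```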
HDFS – Hadoop Distributed File System
Map-Reduce
Source: http://www.cloudcomputingpatterns.org/Map_Reduce
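As a minimal single-process sketch of the Map-Reduce pattern named above, the following Python code counts words in three phases: map emits (word, 1) pairs, shuffle groups the pairs by key, and reduce sums each group. It illustrates the programming model only. It is not Hadoop's API, and all function names are illustrative assumptions.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence over all documents."""
    mapped = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data in the cloud", "big data on Hadoop"]))
```

On a real cluster the map and reduce calls run in parallel on the nodes that hold the data, and the shuffle step moves intermediate pairs across the network.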
What role is the cloud playing here?
“Elastic” Scale-Out
Source: http://www.cloudcomputingpatterns.org/Continuously_Changing_Workload
“Elastic” Scale-Out of CPU Cores, Storage, Memory
“Elastic” Scale-Out: linear
Source: http://www.cloudcomputingpatterns.org/Elastic_Platform
BigData Scale-Out
How do Databases Scale-Out?
Shared Disk Architectures
Shared Nothing Architectures
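In a shared-nothing architecture each node owns a disjoint partition of the data, and a key is routed to exactly one node. A minimal sketch of hash partitioning in Python, with hypothetical node names and a simple modulo scheme chosen for illustration (real systems differ in how they place and rebalance partitions):

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Route a key to the single node that owns it.

    Uses a stable hash (not Python's randomized built-in hash()) so
    every client computes the same owner for the same key.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    nodes = ["node-0", "node-1", "node-2"]  # hypothetical node names
    partitions: dict[str, list[str]] = {name: [] for name in nodes}
    for customer in ("alice", "bob", "carol", "dave", "erin"):
        partitions[node_for_key(customer, nodes)].append(customer)
    for name, keys in partitions.items():
        print(name, keys)
```

With plain modulo placement, adding or removing a node remaps most keys. Schemes such as consistent hashing reduce that movement, which matters when the cluster is scaled elastically.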
Born on the cloud Databases
Source: http://www.constructioncloudcomputing.com/wp-content/uploads/2010/10/dreamstime_7360880-480x300.jpg
Source: http://www.cloudcomputingpatterns.org/Execution_Environment
Google AppEngine
Google App Engine is a Platform as a Service (PaaS) offering that lets
you build and run applications on Google’s infrastructure. App Engine
applications are easy to build, easy to maintain, and easy to scale as
your traffic and data storage needs change. With App Engine, there are
no servers for you to maintain. You simply upload your application and
it’s ready to go.
Source: http://www.cloudcomputingpatterns.org/Platform_as_a_Service_%28PaaS%29
Google AppEngine Database Services
IBM BlueMix
BlueMix is a Platform as a Service (PaaS) cloud
based on Cloud Foundry, offering enterprise-grade
services enriched with IBM software and
hosted at SoftLayer.
IBM BlueMix, a Cloud Foundry runtime
[Diagram labels: Code + Runtime + Framework = Droplet; Droplets, Containers, Linux VMs; Services: SQL, Push, SSO, ...]
Summary
● BigData is born on the cloud
● Cloud facilitates resource provisioning, configuration and deployment
● Highly innovative area
  ● Technology
  ● UseCases
● Links
  ● http://en.wikipedia.org/wiki/MapReduce
  ● http://www.se-radio.net/2013/12/episode-199-michael-stonebraker/
● Sign up for the free BlueMix beta
  ● http://bluemix.net
● Come to the BlueMix Days
  ● http://bit.ly/1lsIY8J
● Use our software
  ● BigInsights: http://www.ibm.com/software/data/infosphere/biginsights/quick-start/
