www.iactglobal.in

What is Module 1 about?
After completing this module, you should be able to:
Understand what Big Data is and what its characteristics are
Understand in detail the need for a Big Data solution
Understand where Big Data is appropriate
List the IBM products that make up IBM’s Big Data strategy
Describe the type of data appropriate for:
- InfoSphere BigInsights
- InfoSphere Streams
List the open source programs that are part of InfoSphere BigInsights

System of Units / Binary System of Units
Unit       Symbol  International System of Units (SI)  Binary usage (deprecated)
kilobyte   KB      10^3                                2^10
megabyte   MB      10^6                                2^20
gigabyte   GB      10^9                                2^30
terabyte   TB      10^12                               2^40
petabyte   PB      10^15                               2^50
exabyte    EB      10^18                               2^60
zettabyte  ZB      10^21                               2^70
yottabyte  YB      10^24                               2^80
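
The gap between the decimal (SI) and binary readings of each prefix grows as the units get larger. A minimal Python sketch, using only the exponents listed in the table, makes this concrete. The names `PREFIXES` and `binary_vs_decimal_gap` are illustrative and are not part of the course material.

```python
# Exponents copied from the units table: (symbol, SI power of 10, binary power of 2).
PREFIXES = [
    ("KB", 3, 10),
    ("MB", 6, 20),
    ("GB", 9, 30),
    ("TB", 12, 40),
    ("PB", 15, 50),
    ("EB", 18, 60),
    ("ZB", 21, 70),
    ("YB", 24, 80),
]


def binary_vs_decimal_gap(si_exp, bin_exp):
    """Return how much larger the binary value is than the SI value, in percent."""
    decimal_bytes = 10 ** si_exp
    binary_bytes = 2 ** bin_exp
    return (binary_bytes - decimal_bytes) / decimal_bytes * 100


if __name__ == "__main__":
    for symbol, si_exp, bin_exp in PREFIXES:
        gap = binary_vs_decimal_gap(si_exp, bin_exp)
        print(f"{symbol}: 10^{si_exp} vs 2^{bin_exp}, binary is {gap:.1f}% larger")
```

The difference is about 2.4% at the kilobyte level and grows to roughly 21% at the yottabyte level, which is why it matters which definition a storage figure uses.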

Big Data @ Scale
2.5 petabytes: memory capacity of the human brain
13 petabytes: amount that could be downloaded from the internet in two minutes if every American (300M) got on a computer at the same time
4.75 exabytes: total genome sequences of all people on Earth
422 exabytes: total digital data created in 2008
1 zettabyte: world’s current digital storage capacity
1.8 zettabytes: total digital data expected to be created in 2011
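
As a rough sanity check on the 13-petabyte figure, the sketch below divides that volume across 300 million people over two minutes. It assumes decimal (SI) petabytes and an even split across users, neither of which the slide states. The function name `per_person_download` is illustrative.

```python
PETABYTE = 10 ** 15  # SI definition; an assumption, the slide does not name a unit system
MEGABYTE = 10 ** 6


def per_person_download(total_bytes, people, seconds):
    """Return (megabytes per person, megabits per second per person)."""
    bytes_each = total_bytes / people
    megabytes_each = bytes_each / MEGABYTE
    megabits_per_second = bytes_each * 8 / seconds / 10 ** 6
    return megabytes_each, megabits_per_second


if __name__ == "__main__":
    mb, mbps = per_person_download(13 * PETABYTE, 300_000_000, 120)
    print(f"{mb:.1f} MB per person, about {mbps:.2f} Mbit/s each")
```

Under these assumptions each person would download roughly 43 MB, which works out to a little under 3 Mbit/s sustained for the two minutes.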

Explosion in data and real-world events
Source: IBM internal: http://www.slideshare.net/jowen_evansdata/keynote-randy-newell-of-ibm

Examples of Big Data
Commercial
• Web Events / Database Logs
• Sensor Networks
• RFID
• Internet Text and Documents
• Internet Search Indexing
• CDR (Call Detail Records)
• Medical Records, etc.
Government
• Regular Government Business & Commerce Needs
• Military & Homeland Security Surveillance
Science
• Astronomy
• Atmosphere
• Biological
• Genomics
Social
• Social Networks
• Social Data

Big Data @ Organizations
Source: http://www.slideshare.net/albertspijkers/2011-07-27baoclientpresentation

Perception gap surrounding social media
Source: IBM internal

Big Data Characteristics
Source: http://www.linguamatics.com/blog/big-data-real-world-data-where-does-text-analytics-fit

Challenge @ Big Data to find new insights:
Source: IBM Internal: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2

Is there really a need for Big Data?
Source: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2

Case Study and Implementation @ Vestas
Vestas Wind Systems has 43,000 wind turbines in 65 countries across 5 continents.
Customer Pain Points:
• Finding the optimal place to install a wind turbine
• Must consider a large number of location-dependent factors, such as temperature, precipitation, wind velocity, and humidity
• The existing legacy process cannot analyze all of the available data
• Data analysis must be completed within hours
Solution Required:
• Leverage all available data, drastically reduce modeling time, and support future expansion of modeling techniques
• Improve the accuracy of wind turbine placement decisions
Implementation using InfoSphere BigInsights:
• Vestas has created a “wind and site competence center”
• Engineers will model data and forecast optimal turbine locations
• Initially, publicly available weather data from national weather data services will be used, along with Vestas’s own recorded weather data
• Data sources considered: global deforestation metrics, satellite images, historical metrics, geospatial data
• InfoSphere BigInsights will be used as the core infrastructure to hold the generated weather data

Big Data presents big opportunities?
Source: IBM Internal: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2

Traditional vs. Big Data approaches:
Source: http://image.slidesharecdn.com/1524howibmsbigdatasolutioncanhelpyougaininsightintoyourdatacenterv2-130306205122-php

Merging the Traditional and Big Data Approaches
Source: IBM Internal: http://www.rosebt.com/uploads/8/1/8/1/8181762/3861342_orig.jpg?1

Enterprise information architecture:
Big Data will be a permanent part of your information architecture.
It cannot be a silo. It must be fully integrated in order to leverage its value.
It must be easy to deploy and integrate.
Source: IBM Internal: http://www.slideshare.net/albertspijkers/2011-07-27baoclientpresentation

IBM Big Data platform strategy:
• Integrate and manage the full variety, velocity, and volume of Big Data
• Apply advanced analytics to information in its native form
• Visualize all available data for ad hoc analysis
• Provide a development environment for building new analytic applications
• Support workload optimization and scheduling
• Provide for security and governance
• Integrate with enterprise software
Source: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2

Enterprise-class Big Data Product @ IBM:
Failure Tolerance:
• High-availability architecture that tolerates hardware or application failure
Scale Economically:
• Runs on scalable hardware, with the ability to add nodes dynamically
Security & Privacy:
• Security protection for granular data access control
Source: IBM internal

Different BigInsights editions for varying needs
Source: IBM Internal: http://www.bloter.net/wp-content/uploads/2013/04/ibm_biginsights_2_1.jpg
Characteristics that distinguish BigInsights include its built-in support for analytics, its integration with other enterprise software, and its production readiness.
InfoSphere BigInsights comes in two editions:
Basic Edition
Enterprise Edition

InfoSphere Streams:
Source: IBM Internal: https://bruceweed.wordpress.com/tag/ibm-infosphere-streams/

To Summarize
• An enterprise-ready Big Data platform
• Innovative, customer-tested products: InfoSphere BigInsights and InfoSphere Streams
• Platform and products enabled for integration with the overall enterprise infrastructure
• Although BigInsights contains open source code, it is licensed like other IBM software offerings

To Summarize
Having completed this module, you should be able to:
Understand the need for a Big Data solution
List the IBM products that make up IBM’s Big Data strategy
Describe the type of data appropriate for:
- InfoSphere BigInsights
- InfoSphere Streams
List the open source programs that are part of InfoSphere BigInsights

Introduction to Big Data & Hadoop
