BIG Data
Desai Karan A
https://in.linkedin.com/in/karan28
SYNOPSIS:
1. Handy Hands-on
2. Introduction to big data
3. Big Data Niceties
4. Specifics of Big Data
5. Big Data Management Tools
6. Practical use-cases
7. Conclusions
8. References
1 Handy Hands-On
Introduction to Big Data
2. Introduction to big data
-2.1 What is big data?
-2.2 Etymology.
-2.3 Hype and Facts.
2.1 What is big data?
• “Big data” refers to datasets whose size is
beyond the ability of typical database software
tools to capture, store, manage, and analyze.
• Big data means extremely large data sets that
may be analyzed computationally to reveal
patterns, trends, and associations, especially
relating to human behavior and interactions.
• There is no fixed size threshold; the term usually
covers data from the terabyte range (1 TB = 1000
gigabytes) upward, into petabytes and beyond.
2.2 Etymology: Word Origination
Big data is the simplest,
shortest phrase to convey that
the boundaries of computing
keep advancing, growing,
diversifying and intensifying
rapidly.
John R. Mashey, chief
scientist at Silicon Graphics, is widely
credited with popularizing the term “Big Data”.
2.3 Hype and Facts
GLOBALLY, EVERY 60 SECONDS…
• 204 Million emails are
sent.
• 300k logins to .
• 1.3 Million views on
YouTube.
• 2 Million Google searches.
• 100k tweets.
• 62,000 hours of Music
Downloads
2.3 Hype and Facts (contd.)
• We generate 2.5 quintillion bytes of data every day.
• In 2012, the world’s information crossed 2 zettabytes
= 2 trillion gigabytes!
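The unit conversions behind these figures can be checked with a few lines of Python. This is only a sketch, assuming decimal (SI) units, where 1 gigabyte = 10^9 bytes and 1 zettabyte = 10^21 bytes; the constant names are illustrative and not from the slides.

```python
# Assumption: decimal (SI) units, not binary (GiB/ZiB) units.
GIGABYTE = 10**9
ZETTABYTE = 10**21
QUINTILLION = 10**18

# 2 zettabytes expressed in gigabytes.
two_zb_in_gb = 2 * ZETTABYTE // GIGABYTE

# 2.5 quintillion bytes per day, kept as an exact integer.
bytes_per_day = 25 * QUINTILLION // 10

# Days needed to accumulate 2 ZB at that constant daily rate.
days_to_two_zb = 2 * ZETTABYTE // bytes_per_day

print(two_zb_in_gb)    # 2 followed by 12 zeros, i.e. 2 trillion
print(days_to_two_zb)
```

At a constant 2.5 quintillion bytes per day, 2 zettabytes would accumulate in 800 days, which gives a feel for how the two numbers on the slide relate.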
3. Big Data Niceties.
-3.1 Evolution of Big Data
-3.2 Why traditional tools fail?
-3.3 Utilities of Big Data
3.1 Evolution Story:
3.2 Why traditional tools fail?
• An e-tsunami and heavy rains of data…
• Today’s data is too big for traditional
data management tools.
-They can work only with small samples of
the data.
-That is the same as looking through a keyhole
and estimating the size of the room…
• High turnaround time for meaningful
results
– That means deciding to cross the road based on
a picture taken 5 minutes earlier!
3.3 Big data utilities:
• Dealing with real time data.
• A new level of insight and
opportunity.
• More effective, fact-based
decision making.
• A new source of business
value.
• A competitive advantage.
4. Specifics of Big Data
-4.1 Characteristics
-4.2 Life cycle
4.1 Characteristics
• The four V’s of big data: Volume, Variety, Velocity, Veracity
4.2 Big Data Life Cycle
Stages: Manage, Enrich, Insight
• Manage and secure data of any size.
• Enrich by connecting the world’s data.
• Insights on any data, irrespective of
location.
5. Big Data Management tools.
-5.1 Cow story
-5.2 Introduction to Hadoop
-5.3 Basic Working of Hadoop.
5.1 Cow story
• Case 1 (storage device holds MB/GB of data): “It is easy for me
to handle my resources (data).”
• Case 2 (storage device holds TB of data): “I am strong… I
can handle my resources.”
• Case 3 (storage device holds PB of data): “Oof… there are so
many resources! I am not strong!”
• Case 4: “I call my friends for help.”
(Big data management tools)
5.2 Introduction to Hadoop
Apache Hadoop is an open-source software
framework for storage and large-scale
processing of data-sets on clusters of
commodity hardware.
• Doug Cutting created Apache Hadoop.
• The Hadoop logo is a small yellow elephant.
5.3 Basic working of Hadoop
Read 1 TB of data
• 1 machine: 4 I/O channels, each channel 100 MB/s, ~45 minutes
• 10 machines: 4 I/O channels per machine, each channel 100 MB/s, ~4.5 minutes
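The read times above follow from dividing the data size by the combined channel throughput. The sketch below reproduces the arithmetic under stated assumptions: decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), perfectly parallel reads and no coordination overhead. The function name `read_time_minutes` is hypothetical.

```python
def read_time_minutes(data_bytes, machines, channels_per_machine=4,
                      channel_mb_per_s=100):
    """Idealized time to read data_bytes when every channel runs in parallel."""
    # Aggregate throughput in bytes per second across all machines.
    throughput = machines * channels_per_machine * channel_mb_per_s * 10**6
    return data_bytes / throughput / 60

ONE_TB = 10**12  # assumption: decimal terabyte

single = read_time_minutes(ONE_TB, machines=1)
cluster = read_time_minutes(ONE_TB, machines=10)

print(f"1 machine:   {single:.1f} min")
print(f"10 machines: {cluster:.1f} min")
```

Under these assumptions the idealized figures come out a little below the rounded "~45" and "~4.5" minutes on the slide. The point that survives is the ratio: ten machines read the same data in one tenth of the time.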
Present basic architecture of Hadoop.
Schematic working.
• Application written in Java for big data processing
• Uses the “MapReduce” processing paradigm
• Optimized for distributed storage and computing
of data
• Open source
• Very low cost for acquisition and storage
Hadoop Data Analytics
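To make the MapReduce paradigm concrete, here is a minimal single-process sketch of a word count in plain Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only illustrate the three conceptual steps that a real cluster would distribute across machines.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    # On a cluster, each line (or block of lines) would be mapped on a
    # different node; here everything runs sequentially in one process.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

Because each map call looks at only one line and each reduce call at only one key, both steps can run independently on many machines, which is what lets the work scale out.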
Other big data management
tools: Overview…
6. Practical Use-Cases
-6.1 Big apps of Big Data tools
-6.2 How big data affects small business
-6.3 Relevance of big data in market
6.1 Big apps of big data tools.
Who is using big data?
6.2 How does big data affect small businesses?
• Every organization has a tipping point, and
most organizations – regardless of size –
will eventually reach a point where the
volume, variety and velocity of their data
will be something that they have to
address.
• This new big data world is not only about
running problems faster, but about solving
problems that were not solvable before.
6.3 Relevance of big data in the market
7. Conclusions
Conclusions: Through pics..
8. References:
• www.microsoft.com
• http://en.wikipedia.org/wiki/Hadoop
• http://en.wikipedia.org/wiki/Big_data
• www.google.com
• www.slideshare.net
• Pdf: McKinsey Global Institute
• Pdf: 101 Big data by Pradeep Vardan
• Workshop in college by ‘Ecsttasys’ on big
data


Editor's Notes

  • #2: ©Karan Desai (follow me on Twitter: @karlmit, or at https://in.linkedin.com/in/karan28). DISCLAIMER: The images, diagrams and content in this presentation are meant for educational purposes only. The author does not guarantee the originality of any media in the presentation. The author has only combined and summarized details on the topic from varied sources. The author does not intend any copyright violation.
  • #18: SSAS: SQL Server Analysis Services (SSAS) is an online analytical processing (OLAP), data mining and reporting tool in Microsoft SQL Server.
    Essbase is a multidimensional database management system (MDBMS) that provides a multidimensional database platform upon which to build analytic applications.
    IBM Cognos TM1 (formerly Applix TM1) is enterprise planning software used to implement collaborative planning, budgeting and forecasting solutions, as well as analytical and reporting applications.
    Power Pivot is a free add-in to the 2010 version of the spreadsheet application Microsoft Excel. Power Pivot workbooks are self-contained web applications, merely requiring a 'Save as' to make them accessible in the browser as interactive solutions.
    K is a proprietary array processing language developed by Arthur Whitney and commercialized by Kx Systems. Since then, an open-source implementation known as Kona has also been developed. ... kdb is both a database (kdb) and a vector language (q). It is used by almost every major financial institution.
    Vertica Systems is an analytic database management software company.
    QlikView is a business intelligence platform for turning data into knowledge.
    TIBCO Spotfire® designs, develops and distributes in-memory analytics software for next-generation business intelligence.
    Tableau Software is an American computer software company headquartered in Seattle, Washington. It produces a family of interactive data visualization products focused on business intelligence.
    Omniscope is a single, in-memory, file-based application that enables agile, 'best practice' data sharing solutions.
    An in-memory database (IMDB; also main memory database system or MMDB or memory resident database) is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism.
Traditional relational databases are row-oriented: the data in each row of a table is stored together. In a columnar, or column-oriented, database, the data in each column is stored together instead.
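The difference between the two layouts can be sketched in Python with in-memory structures. This is an illustration only, not how any particular database stores data on disk; the table contents and variable names are made up for the example.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Pune", "sales": 120},
    {"id": 2, "city": "Surat", "sales": 80},
    {"id": 3, "city": "Pune", "sales": 200},
]

# Column-oriented layout: each column keeps all of its values together.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# An aggregate over one column touches only that column's values here,
# whereas the row layout has to walk through every full record.
total_from_columns = sum(columns["sales"])
total_from_rows = sum(row["sales"] for row in rows)

print(columns["sales"])
print(total_from_columns, total_from_rows)
```

Both layouts hold the same information. The columnar form tends to suit analytic queries that scan a few columns across many rows, while the row form suits reading or writing whole records.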