Guide to Big Data Analytics
BY GAHYA PANDIAN
Big Data Analytics
What is Big Data
Data analysis is nothing new. Even before computers were used, information gained in the course
of business or other activities was reviewed with the aim of making those processes more efficient
and more profitable. These were, of course, comparatively small-scale undertakings given the
limitations posed by resources and manpower; analysis had to be manual and was slow by
modern standards, but it was still worthwhile. Opinion polling, for example, has been carried out
since early in the 19th century, almost 200 years ago. The first national survey took place in 1916
and involved the publication Literary Digest sending out millions of postcards and counting the
returns. On that basis, the magazine correctly predicted Woodrow Wilson’s election as president.
Since then, volumes of data have grown exponentially. The advent of the internet and faster
computing has meant that huge quantities of information can now be harvested and used to
optimise business processes. The problem is that conventional methods were simply not suited to
crunching through all the numbers and making sense of them. The amount of information is
phenomenal, and within that information lie insights that can be extremely beneficial. Once
patterns are identified, they can be used to adjust business practices, create targeted campaigns
and discard ones that are not effective. However, as well as large amounts of storage, it takes
specialised software to be able to make sense of all this data in a useful way.
Big Data
‘Big Data’ is the emerging discipline of capturing, storing, processing,
analysing and visualising these huge quantities of information. The data sets
may start at a few terabytes and run to many petabytes – far more than
traditional data analysis packages can handle. In 2012, Gartner defined it as
‘high volume, high velocity, and/or high variety information assets that
require new forms of processing to enable enhanced decision making, insight
discovery and process optimization.’ This ‘3V’ classification has been built on
since (particularly with the addition of veracity), such that Big Data is often
described in terms of the following characteristics:
• Volume. Terabytes or petabytes of data are analysed. An estimated 2.5 quintillion bytes
of data (2.5 billion gigabytes) are created every day, an amount which will only rise in
the future. However, the size of the dataset is not the only variable that characterises Big
Data.
• Variety. The dataset may contain many different forms of data – not simply a large
amount of the same type. The profusion of different kinds of mobile device and the
variety of content consumed on them on a wide range of platforms, for example, means
that companies can harvest data from an enormous array of sources, each telling them a
different part of the same picture.
• Velocity. Data may change on a constant basis. For example, modern cars may have 100
or so different sensors that continually monitor different aspects of performance.
Markets change on a moment-to-moment scale. Data is highly fluid, and snapshots are
not always enough.
• Veracity. The data acquired may not all be accurate, or much of it may be uncertain or
provisional in nature. Data quality can be unreliable, especially when there is so much of it.
Any system of analysis must take this into account.
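The daily volume estimate is easier to judge once it is converted into familiar units. The short Python sketch below does that conversion, assuming decimal (SI) units in which one gigabyte is 10^9 bytes. The function and variable names are illustrative only.

```python
# Unit sizes in bytes, assuming decimal (SI) prefixes rather than binary ones.
UNITS = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "EB": 10**18}


def convert(num_bytes: int, unit: str) -> float:
    """Express a byte count in the given unit."""
    return num_bytes / UNITS[unit]


# 2.5 quintillion bytes, written as an exact integer (2.5 x 10^18).
daily_bytes = 25 * 10**17

for unit in UNITS:
    print(f"{convert(daily_bytes, unit):,.1f} {unit}")
```

Under that assumption, 2.5 quintillion bytes comes to 2.5 billion gigabytes, or 2.5 exabytes, per day.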
In addition to the 4V characteristics, there are also two others to deal with:
• Variability. Data capture and volume may be inconsistent, not just
inaccurate, so varying quantities and qualities of data will be acquired at
different times.
• Complexity. Together, these factors mean that managing the data can be an extremely
complex process, since there are many data sources with differing types
and formats of data, but these need to be correlated and made sense of if
they are to be useful.
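As a rough illustration of what correlating sources with differing types and formats can involve, the Python sketch below maps two feeds onto one common record layout and sets aside readings that are missing. Everything in it is an assumption made for the example: the feed contents, field names, units and helper functions are hypothetical, not taken from any particular platform.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical feeds: one CSV with ISO timestamps and Celsius values,
# one JSON with epoch timestamps and Fahrenheit values.
CSV_FEED = (
    "sensor_id,timestamp,temp_c\n"
    "A1,2024-01-01T00:00:00+00:00,21.5\n"
    "A2,2024-01-01T00:00:00+00:00,\n"  # missing reading (a veracity problem)
)
JSON_FEED = '[{"id": "B7", "ts": 1704067200, "temperature": {"value": 70.7, "unit": "F"}}]'


def from_csv(text):
    """Map CSV rows onto the common layout: sensor, time, temp_c."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        raw = row["temp_c"].strip()
        records.append({
            "sensor": row["sensor_id"],
            "time": datetime.fromisoformat(row["timestamp"]),
            "temp_c": float(raw) if raw else None,
        })
    return records


def from_json(text):
    """Map JSON items onto the same layout, converting Fahrenheit to Celsius."""
    records = []
    for item in json.loads(text):
        value = item["temperature"]["value"]
        if item["temperature"]["unit"] == "F":
            value = (value - 32) * 5 / 9
        records.append({
            "sensor": item["id"],
            "time": datetime.fromtimestamp(item["ts"], tz=timezone.utc),
            "temp_c": round(value, 1),
        })
    return records


combined = from_csv(CSV_FEED) + from_json(JSON_FEED)
usable = [r for r in combined if r["temp_c"] is not None]
print(f"{len(usable)} of {len(combined)} records usable")
```

Once every record shares the same fields, time zone and unit, readings from different sources can be compared directly, and incomplete records can be excluded or flagged rather than silently skewing the analysis.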
Conclusion
• Big data isn’t just an emerging phenomenon. It’s already here and being used by
major companies to drive their business forwards. Traditional analytics packages
simply aren’t capable of dealing with the quantity, variety and changeability of data
that can now be harvested from diverse sources – machine sensors, text documents,
structured and unstructured data, social media and more. When these are combined
and analysed as a whole, new patterns emerge. The right big data package will allow
enterprises to track these trends in real time, spotting them as they occur and
enabling businesses to leverage the insights provided.
• However, not all big data platforms and software are alike. As ever, the one you choose
will depend on a number of factors. These include not just the nature of the data
you are working with, but organisational budgets, infrastructure and the skillset of your
team, amongst other things. Some solutions are designed to be used off-the-peg,
providing powerful visualisations and connecting easily to your data stores. Others are
intended to be more flexible but should only be used by those with coding expertise.
You should also think about the future and the long-term implications of being tied to
your platform of choice – particularly in terms of open-source vs proprietary software.
Other Guides:
• Public Cloud: Top Rated Public Cloud Computing Providers, Services,
Security & Technologies
• Cloud Backup: Guide to Cloud Backup Services, Companies, Software and
Solutions
• Hybrid Cloud: Guide to Hybrid Cloud Storage and Computing Companies
• Virtual Data Room: Virtual Data Room Providers, Services, Reviews and
Comparisons
