This is an important file for our course Advanced Information Systems (MIS 206), conducted by Prof. Rakibul Hoque. A very good course. Highly recommended. Best faculty.
3. Big Data
The Data Size Is Getting Bigger and Bigger
Huge volumes of data, measured in:
-> Terabytes (1024 Gigabytes)
-> Petabytes (1024 Terabytes)
-> Exabytes (1024 Petabytes)
-> Zettabytes (1024 Exabytes)
-> Yottabytes (1024 Zettabytes)
-> Brontobytes (1024 Yottabytes)
-> Geopbytes (1024 Brontobytes)
Name         Symbol   Value
Kilobyte     kB       10^3
Megabyte     MB       10^6
Gigabyte     GB       10^9
Terabyte     TB       10^12
Petabyte     PB       10^15
Exabyte      EB       10^18
Zettabyte    ZB       10^21
Yottabyte    YB       10^24
Brontobyte*  BB       10^27
Geopbyte*    GeB      10^30
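The unit ladder can be sketched in code. This is a minimal sketch, assuming decimal (power-of-ten) units as in the Name/Symbol/Value table and stopping at the yottabyte; the function name `human_readable` is hypothetical and used only for illustration.

```python
# Decimal (power-of-ten) units, from bytes up to yottabytes.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def human_readable(num_bytes: float) -> str:
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop when the value fits this unit, or when no larger unit is left.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1000

# 2.5 quintillion bytes is the daily figure quoted in these slides.
print(human_readable(2.5e18))
```

Swapping 1000 for 1024 would give the binary interpretation used in the arrow list instead.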
5. Big Data
About 2.5 quintillion bytes of data are generated
every day and almost 90% of the global existing
data has been created during the past two years.
Data is growing faster than ever before and
about 1.7 megabytes of new information will be
created every second for every human being on the
planet.
463 exabytes of data will be generated each day
by 2025.
95 million photos and videos are shared every day
on Instagram.
6. Big Data
The average Facebook user generates 90 pieces of content
(notes, photos, links, stories, posts), while 600 million
active users of the social platform spend over 9.3 billion
hours a month on the site.
Every day, 306.4 billion emails are sent, and 5 million
Tweets are made.
Every minute, 24 hours of video are uploaded to
YouTube.
We perform 40,000 search queries every second
(on Google alone), which makes it 3.5 billion searches per
day and 1.2 trillion searches per year.
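The per-day and per-year figures follow from the per-second rate. A quick check, assuming a constant rate and a 365-day year (both simplifying assumptions):

```python
SEARCHES_PER_SECOND = 40_000        # figure quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60      # 86,400 seconds

searches_per_day = SEARCHES_PER_SECOND * SECONDS_PER_DAY
searches_per_year = searches_per_day * 365

print(f"{searches_per_day:,} searches per day")    # roughly 3.5 billion
print(f"{searches_per_year:,} searches per year")  # roughly 1.2 to 1.3 trillion
```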
7. Big Data
A flight generates 240 terabytes of new data every
year. Black Box captures voices of the flight crew,
recordings of microphones and earphones, and
the performance information of the aircraft.
Walmart handles 1 million customer
transactions per hour.
This volume of data equates to 2-hour-long HD
movies, which one person would need 47 million
years to watch in their entirety.
8. Big Data
Big Data is primarily
measured by the volume
of the data.
Big Data also includes
data that arrives fast
and in a huge variety of forms.
9. Big Data
Big Data is a collection of large and complex
data sets which are difficult to process using
common database management tools or
traditional data processing applications.
The US Congress defines big data as “a term
that describes large volumes of high velocity,
complex, and variable data that require
advanced techniques and technologies to
enable the capture, storage, distribution,
management, and analysis of the
11. Big Data
Big Data requires a scalable architecture for
efficient storage, manipulation, and analysis.
■ Volume refers to the vast amount of data generated
every second.
■ Velocity refers to the speed at which new data is
generated and the speed at which data moves around.
■ Variety refers to the different types of data we can now
use.
■ Veracity refers to the uncertainty or trustworthiness of the
data.
■ Value refers to our ability to turn our data into value.
23. Mobile Device
5.22 billion people in the world have a mobile
device. This means that 66.83% of the
world's population has a mobile device.
The number of smartphone users in the world is
3.5 billion, which means 44.81% of the world's
population owns a smartphone.
There are more than 3.5 billion mobile internet
subscribers.
There are 1 million apps available, which have been
downloaded more than 100 billion times.
192 countries have active 3G mobile networks.
28. Internet of Things
The Internet of Things (IoT) is a scenario in which
objects, animals or people are provided with unique
identifiers and the ability to transfer data over a
network without requiring human-to-human or
human-to-computer interaction.
The Internet of things is the network of physical
devices, vehicles, home appliances and other items
embedded with electronics, software, sensors,
actuators, and network connectivity which enables
these objects to connect and exchange
29. Internet of Things
The Internet of Things (IoT), also sometimes
referred to as the Internet of Everything (IoE),
consists of all the web-enabled devices that collect,
send and act on data they acquire from their
surrounding environments using embedded
sensors, processors and communication hardware.
According to Gartner Inc. (a technology research
and advisory corporation), there are nearly 26
billion devices on the Internet of Things.
38. Lots of Data, Little Insights
“Fewer than 10% of companies have
a 360-degree view of their
customers, and only about 5% are
able to use this view to systematically
grow their businesses.” - Gartner
“Businesses spend 80% of their
time preparing and managing their
data rather than using it.” - Forbes
39. Data Analytics
Data is the new gold, but by itself it is completely
useless.
Data analytics: systematic analysis of raw data
(statistics) to draw conclusions about that information.
Big Data analysis blends traditional statistical data
analysis approaches with new generation
computational algorithms.
The overall goal of big data analysis is to support better
decision-making.
Carrying out big data analysis helps establish patterns and
relationships among the data being analyzed.
40. Data Analytics
“The critical job in the next 20
years will be the analytic
scientist … the individual with
the ability to understand a
problem domain, to understand
and know what data to collect
about it, to identify analytics to
process that data/information, to
discover its meaning, and to
extract knowledge from it—that’s
going to be a very critical skill.”
41. Data Analytics
Data analytics is the science of drawing insights from
raw information sources.
It is the process of analyzing raw data to find trends
and answer questions
Data analytics (DA) is the process of examining
data sets in order to draw conclusions about the
information they contain, increasingly with the aid of
specialized systems and software.
Data analytics refers to the set of quantitative and
qualitative approaches used to derive valuable
insights from data.
43. Data Analytics: All the buzz - where to start?
AI/Data Science
Data Lake
Data Warehouse/Data Mart
Data Security/Privacy
Meta Data
Master Data/Reference Data
BI/Dashboard
Data Governance
Data Source
Extract, Transform, Load
44. Types of Data Analytics
Descriptive analytics: the use of data to understand past and
current business performance and make informed decisions
Diagnostic analytics: a set of techniques for determining what
has happened and why.
Predictive analytics: predict the future by examining historical
data, detecting patterns or relationships in these data, and then
extrapolating these relationships forward in time.
Prescriptive analytics: identify the best alternatives to
minimize or maximize some objective.
46. Types of Data Analytics
Descriptive analytics: The interpretation of historical
data to identify trends and patterns.
Diagnostic analytics: The interpretation of historical
data to determine why something has happened.
Predictive analytics: The use of statistics to forecast
future outcomes.
Prescriptive analytics: The application of testing
and other techniques to determine which outcome
will yield the best result in a given scenario.
48. Descriptive Analytics
• Descriptive analytics is a preliminary stage of data
processing that creates a summary of historical data to
yield useful information and possibly prepare the data for
further analysis.
• Descriptive analytics, such as reporting, dashboards, and
data visualization, have been widely used for some time.
• They are the core of traditional BI.
49. Descriptive Analytics
Process:
Identify the attributes, then assess/evaluate the attributes
Estimate the magnitude to correlate the relative contribution of each attribute
to the final solution
Accumulate more instances of data from the data sources
If possible, perform the steps of evaluation, classification and categorization
quickly
At some threshold, crossover into diagnostic and predictive analytics
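The summarizing step of descriptive analytics can be sketched as follows. This is a minimal sketch using only the Python standard library; the `monthly_sales` figures are hypothetical illustration data, not taken from the slides.

```python
from statistics import mean, median, stdev

# Hypothetical monthly sales figures (illustrative data only).
monthly_sales = [120, 135, 128, 150, 162, 158]

# Summarize the historical data with a few standard descriptive statistics.
summary = {
    "count": len(monthly_sales),
    "mean": mean(monthly_sales),
    "median": median(monthly_sales),
    "stdev": stdev(monthly_sales),
    "min": min(monthly_sales),
    "max": max(monthly_sales),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A dashboard or report is essentially this kind of summary, refreshed as more instances of data accumulate.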
50. Diagnostic Analytics
• Diagnostic analytics takes a deeper look at data to attempt to
understand the causes of events and behaviors.
• Process:
– Begin with descriptive analytics
– Extract patterns from large data quantities via data mining
– Correlate data types for explanation of near-term behavior – past
and present
– Estimate linear/non-linear behavior not easily identifiable through
other approaches.
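The "correlate data types" step can be illustrated with a Pearson correlation. This is a minimal sketch, not the slides' own method; the `ad_spend` and `sales` series are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: weekly ad spend vs. weekly sales (illustrative only).
ad_spend = [10, 12, 15, 18, 20]
sales = [100, 108, 123, 138, 147]

r = pearson(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.3f}")
```

A value near +1 or -1 suggests a strong linear relationship worth investigating; on its own it does not establish cause.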
51. Predictive Analytics
Predictive analytics is the branch of advanced
analytics used to make predictions about
unknown future events. It relies on software and/or
hardware solutions that allow firms to discover,
evaluate, optimize, and deploy predictive models
by analyzing big data sources to improve business
performance or mitigate risk. We use machine
learning, artificial intelligence, data mining, and
statistical modeling for prediction. Tools include SAP Predictive
Analytics, SAS Predictive Analytics, SPSS,
Python, R, and Microsoft Azure.
52. Predictive Analytics
Process:
Begin with descriptive AND diagnostic analytics
Choose the right data based on domain knowledge
and relationships among variables
Choose the right techniques to yield insight into
possible outcomes
Determine the likelihood of possible outcomes given
initial boundary conditions
Remember! Data driven analytics is non-linear; do
NOT treat like an engineering project
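One simple way to "extrapolate relationships forward in time" is a least-squares trend line. This is a minimal sketch of that one technique, assuming a linear trend; the `months` and `sales` history is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: sales in months 1 to 5 (illustrative only).
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept  # extrapolate one month ahead
print(f"forecast for month 6: {forecast_month_6:.1f}")
```

Real predictive models usually add more variables and report uncertainty around the forecast rather than a single number.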
53. Prescriptive Analytics
• Prescriptive analytics allows users to “prescribe” a number
of different possible actions and guides them towards a
solution.
• Process:
– Begin with predictive analytics
– Determine what should occur and how to make it so
– Determine the mitigating factors that lead to desirable/undesirable
outcomes
– “What-if” analysis; local or global optimization
– Ex: Find the best set of prices and advertising frequency to maximize
revenue
“Make it so”
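The price and advertising example can be sketched as a small "what-if" grid search. This is a minimal sketch under stated assumptions: the `revenue` function is a hypothetical model invented for illustration (demand falls with price, rises with ads, and each ad has a fixed cost), and the candidate ranges are arbitrary.

```python
from itertools import product

def revenue(price, ads_per_week):
    """Hypothetical net revenue model (illustrative assumption only)."""
    demand = max(0, 500 - 20 * price + 15 * ads_per_week)
    ad_cost = 200 * ads_per_week
    return price * demand - ad_cost

prices = range(5, 26)          # candidate prices to try
ad_frequencies = range(0, 11)  # candidate ads per week to try

# Evaluate every combination and keep the one with the highest revenue.
best_price, best_ads = max(product(prices, ad_frequencies),
                           key=lambda option: revenue(*option))

print(best_price, best_ads, revenue(best_price, best_ads))
```

Exhaustive search is only practical for small option sets; larger problems typically call for a dedicated optimization method.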
54. Titanic Survival Rates by Gender
Inference
Among the
passengers on the Titanic,
65% were male and
35% were female.
However, the analysis
indicates that females
(74%) had a better
survival rate than males
(18%).
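The four percentages on this slide also imply an overall survival rate. A short worked calculation, using only the figures quoted on the slide:

```python
# Passenger shares and survival rates quoted on the slide.
share = {"male": 0.65, "female": 0.35}
survival_rate = {"male": 0.18, "female": 0.74}

# Overall survival rate is the share-weighted average of the group rates.
overall = sum(share[g] * survival_rate[g] for g in share)

# Fraction of all survivors who were female.
female_share_of_survivors = share["female"] * survival_rate["female"] / overall

print(f"overall survival rate: {overall:.1%}")
print(f"female share of survivors: {female_share_of_survivors:.1%}")
```

Although females were about a third of passengers, the calculation shows they made up the majority of survivors.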
55. Titanic Survival Rates by Class
Inference
Taking a look at our bar chart for
the survival rate of children by
passenger class also illustrates a
similar survival story. 85.7% of
children in first class survived,
whereas 96.3% of children in
second class survived, and
38.7% of children in third class
survived. This was interesting
because this time, a higher
proportion of children in second
class survived in comparison to
first class.
56. CASE 1: Managing Pandemic
Data-driven technology used to manage the impact of
COVID-19 in Singapore
57. CASE 1: Managing Pandemic
The Government has progressively built up the
digital infrastructure and engineering capabilities as
the foundation of our Smart Nation.
These technology-led investments enable Singapore
to respond decisively and swiftly to the COVID-19
outbreak with a suite of digital tools to help
disseminate timely and accurate information to
Singaporeans, and to enable our fellow agencies to
better manage the crisis.
Source: www.tech.gov.sg
58. CASE 1: Managing Pandemic
Locating hot spots for effective
containment
The Singapore Government has
adopted an evidence-based approach to
manage the pandemic and restart social and
economic activities, all backed by data-driven
technology.
TraceTogether and SafeEntry, developed by
GovTech, helped the country adjust to
changes brought about in daily life by
COVID-19.
59. CASE 1: Managing Pandemic
Locating hot spots for effective
containment
Temperature screening stations,
promoted by the government,
automatically take body temperatures
to identify cases and avoid long
queues.
Real-time data is used for informed
decisions regarding tracing, resource
allocation and circuit breaker
relaxation.
60. CASE 1: Managing Pandemic
Sharing information: AskJamie
Chatbot
Developed by GovTech,
AskJamie is a virtual assistant
designed to answer queries within
specific domains on Government
agency websites.
Launched in 2014, AskJamie has
been implemented across 70
Government agency websites.
61. CASE 1: Managing Pandemic
Since 1 February 2020, the chatbot has been
enhanced to address queries related to COVID-19,
and uses machine learning to improve accuracy of
the replies, and data analytics to detect trending
topics.
Citizens are also able to access the chatbot via
Facebook Messenger and Telegram.
63. CASE 2: Addressing Economic
Inequalities
The Haas Institute for a Fair and Inclusive
Society at UC Berkeley
proposed six evidence-based
policy solutions
for reversing inequality,
closing economic
disparities among
subgroups and
enhancing economic
mobility.
64. CASE 2: Addressing Economic
Inequalities
■ The Fed's 2018 Survey of Consumer
Finances shows that the top 3 percent own
54.4 percent of America's wealth and the
bottom 90 percent own only 24.7 percent
of wealth.
■ The survey found significant disparities
by race, class, homeownership status and
education.
■ Income and wealth decreased for Black households,
lower-income households, renters and those
with less than a college education.
65. CASE 3: Education Data
Analytics
Georgia State University (GSU) mined student data to
identify courses in which students performed poorly
and created a supplemental instruction program to help
with those courses.
By using data to address retention and course
completion, GSU saw its graduation rate rise
from 32% to 54% between 2003 and 2014.
66. CASE 3: Education Data
Analytics
Using data dashboards, the University of Central Florida can use data points
collected in real time to help students who are displaying patterns that
show they are struggling academically, improve professors’ curricula and
more effectively collect money for new campus initiatives and scholarships.
At the University of Alabama, the use of predictive analytics found that
students who asked for copies of their transcripts might be at risk of
leaving the university.
Now, administrators can note when a student puts in such a request and
offer academic and campus resources to encourage those students to stay.
67. CASE 4: Health Data Analytics (Real-time Alerting)
Through the use of wearables to collect patients’ health data
continuously, and
Clinical Decision Support (a type of data analytics software
used in healthcare) for real-time analysis of clinical
data, health practitioners receive advice as they make
prescriptive decisions.
Health professionals are able to examine patients and modify
delivery strategies accordingly.
68. CASE 4: Health Data Analytics
(Real-time Alerting)
Examples:
■ Trigger warnings and reminders when a patient should get a new lab test,
or track prescriptions to see if a patient has been following doctors’ orders
■ If a patient’s blood pressure increases alarmingly, the system will send
an alert in real time to the doctor, who will then take action to reach the
patient and administer measures to lower the pressure
■ Asthmapolis is using inhalers with GPS-enabled trackers to identify
asthma trends both on an individual level and across larger populations
■ This data is being used in conjunction with data from the CDC in order to
develop better treatment plans for asthmatics
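The blood pressure example can be sketched as a simple threshold rule applied to a stream of readings. This is a minimal sketch only: the `Reading` type, the `alerts` function and the threshold values are hypothetical choices for illustration, not clinical guidance and not any vendor's actual logic.

```python
from dataclasses import dataclass

# Hypothetical alert thresholds in mmHg (illustrative, not clinical guidance).
SYSTOLIC_LIMIT = 180
DIASTOLIC_LIMIT = 120

@dataclass
class Reading:
    patient_id: str
    systolic: int
    diastolic: int

def alerts(readings):
    """Yield an alert message for every reading that crosses a threshold."""
    for r in readings:
        if r.systolic >= SYSTOLIC_LIMIT or r.diastolic >= DIASTOLIC_LIMIT:
            yield f"ALERT {r.patient_id}: blood pressure {r.systolic}/{r.diastolic}"

# Hypothetical readings arriving from wearables.
stream = [Reading("p1", 122, 80), Reading("p2", 185, 110), Reading("p1", 130, 85)]
messages = list(alerts(stream))
for message in messages:
    print(message)
```

In a real deployment the message would be routed to the responsible doctor rather than printed.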
70. Explosion of Big Data From
Mobile
Consumers increasingly use mobile devices to locate
and buy products.
47% of users would provide their location to receive
relevant offers and discounts (mBlox 2013).
“The mobile device has become the central control
system in consumers’ lives.”
Traditional BI focuses on "what happened". Data
science and big data analytics focus on "what will
happen".
72. Mobile Marketing Analytics
Do you know where you will be 285 days from now at
2 pm?
We (data scientists) do!
We are predictable in our movements.
Big Data can be used to predict with very high accuracy the
location of individuals even months into the
future.
Experiments are used to offer causal explanations of
human behavior and help enterprises with IT and
marketing strategies.
74. Location Analytics and
Geographic Information Systems
Location analytics
Ability to gain business insight from the location
(geographic) component of data
Mobile phones
Sensors, scanning devices
Map data
Geographic information systems (GIS)
Ties location-related data to maps
Example: For helping local governments calculate
response times to disasters
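The response-time example can be approximated from coordinates alone. This is a rough sketch, assuming straight-line (great-circle) travel at a constant speed; the coordinates, the 40 km/h speed and the function names are hypothetical, and a real GIS would use the road network instead.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def response_minutes(station, incident, speed_kmh=40.0):
    """Rough travel time, assuming straight-line travel at a constant speed."""
    return haversine_km(*station, *incident) / speed_kmh * 60

station = (23.78, 90.40)   # hypothetical station coordinates
incident = (23.81, 90.42)  # hypothetical incident coordinates
print(f"{response_minutes(station, incident):.1f} minutes")
```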
75. Social Media Data Analytics:
Social Listening
Big Data Analytics: Customer Insights via
Automated Text Mining
Product feature extraction from customer
comments
Sentiment analysis from customer comments
Linguistic style analysis of customer comments
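Sentiment analysis of customer comments can be illustrated with a word-count approach. This is a deliberately tiny sketch: the `POSITIVE` and `NEGATIVE` word lists and the sample comments are hypothetical, and production text mining relies on much larger lexicons or trained models.

```python
import re

# Tiny hypothetical sentiment lexicon (illustrative only).
POSITIVE = {"great", "love", "excellent", "fast", "good"}
NEGATIVE = {"bad", "slow", "broken", "poor", "hate"}

def sentiment_score(comment: str) -> int:
    """Positive-word count minus negative-word count for one comment."""
    words = re.findall(r"[a-z']+", comment.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Hypothetical customer comments.
comments = [
    "Love the camera, great battery life",
    "Delivery was slow and the screen arrived broken",
    "It is a phone",
]
scores = [sentiment_score(c) for c in comments]
print(scores)
```

A score above zero reads as positive, below zero as negative, and zero as neutral or unknown.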
77. Did you know?
Big Data Hits Real Life
http://www.nytimes.com/video/business/100000002206849/big-data-hits-real-life.html
78. Cooking Up an Analytic Meal:
Recipe for Maximum Analytic Value
Ingredients Instructions
30% Data: Gather, clean and connect disparate data.
Use the freshest data you can afford. Partner
with Finance or Operations to share work
burden and create early partnership. Outliers
can teach you much about data quality.
5% Stakeholdering: Collect key hypotheses from executives. This is
a great way to sift lumps out of your research
questions. Keep the conversations brief so they
don’t taint your ability to treat the data with an
open mind. Caution: Too much can spoil
productive collaboration.
79. Cooking Up an Analytic Meal:
Recipe for Maximum Analytic Value
Ingredients Instructions
15% Analysis: A few professionals need to get good at advanced
math. Or you can in-source resources from your
customer, marketing, or strategy teams. Go as deep
as you need to behind the scenes, but remember
the savory flavor of regression, t-tests and p-values
is an acquired taste for most.
20% Storytelling: Reduce the research data to one memorable slide.
Explain what the insights mean and how to take
them into action. Shake financial acumen liberally
into the story, as no proposal is worthy of a leader’s
time unless it expresses itself in financial outcomes.
80. Cooking Up an Analytic Meal:
Recipe for Maximum Analytic Value
Ingredients Instructions
20% Implementation: Taking insights into action separates consultants
from business partners. Here is where homemade
flavor really stands out! Resistance or obstacles
encountered likely point to shortcuts in analysis.
10% Embedment: Define accountabilities, embed purposeful
reporting, and transfer operational ownership.
Celebrate short-term wins. Set a specific date to
monitor outcomes. Remain flexible and modify the
change plan, if necessary.
#48: Descriptive analytics, such as reporting/OLAP, dashboards, and data visualization, have been widely used for some time. They are the core of traditional BI.
#74: This slide looks at various additional examples of BI applications. Note that BI is also used in the public sector for analyzing data and determining public policy, such as allocating school resources, an example discussed in a chapter case.