Datapreneurs & Building Data Centric Business
http://www.slideshare.net/ssood/datapreneurs
linkedin.com/in/sureshsood
@soody
Areas for Discussion
1.) Data Driving Trends – Big Data & Machine Engineering
2.) Building Data Centric Business
3.) Future of Professions
4.) Talent Scarcity
5.) Democratisation of Data Science
Datapreneurs
http://www.marketingdistillery.com/2014/11/29/is-data-science-a-buzzword-modern-data-scientist-defined/
2020 Global Data Forecast (Bytes)
2020 estimates suggest four times more digital data than all the grains of sand on Earth
Source: Pg. 4, Building a Digital Analytics Organization: Create Value by Integrating Analytical Processes,
Technology, and People into Business Operations by Judah Phillips, FT Press, 30 Jul 2013
Data Origination Trends
1) Mobile = multiple sensors comprising camera, microphone, GPS, accelerometer
2) Sensors everywhere including “eyes in the sky” via drones, satellites and roads
3) Online customer interactions generate IP addresses, time, geocode, page visits
4) Large scale data curation e.g. Airbnb, Google (Art, Street View, Ngram, GDELT**),
Guardian, Million Songs, OpenCalifornia, Openflights, OpenStreetMap,
Planethunters, Pandora, Shazam, Wikipedia
5) Data fusion e.g. LA Times Homicide Blog using coroner reports
6) Reviews e.g. Tripadvisor, Amazon, Yelp
7) Open Data e.g. data.gov and OData protocol (http://www.odata.org/)
**The GDELT Project pushes the boundaries of “big data,” weighing in at over a quarter-billion rows with
59 fields for each record, spanning the geography of the entire planet, and covering a time horizon of
more than 35 years. GDELT is the largest open-access database on human society in existence. Its
archives contain nearly 400M latitude/longitude geographic coordinates spanning over 12,900 days,
making it one of the largest open-access spatio-temporal datasets as well.
Internet of Things “trillion sensors”
Source: www.tsensorssummit.org
Black Box Insurance
• Big data transforms actuarial insurance from estimating premiums with probability methods into dynamic risk management, using real data to generate individually tailored premiums
• Estimate: for a 20 km work or home journey, a data point is acquired every minute and the journey captures 12 points per km. Assuming 1,000 km of driving per month, this generates 12,000 points per month, or 144,000 points per car per annum. Hence, 1,000 cars generate 144 million points per annum.
• A telematics (black box) monitor helps assess driving behavior and prices the policy on truly driver-centric premiums by capturing:
– Number of journeys
– Distances travelled
– Types of roads
– Speed
– Time of travel
– Acceleration and braking
– Any accidents
– Location ?
• Benefits low mileage, smooth and safe drivers
• Privacy vs. saving money on insurance (Canada; http://bit.ly/Black_box)
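The data-volume estimate in the Black Box Insurance bullets above can be checked with a few lines of Python. This is only a sketch of the slide's arithmetic: the function and parameter names are illustrative, and the inputs (12 points per km, 1,000 km per month) are the slide's own assumptions rather than measured values.

```python
def telematics_points_per_year(points_per_km=12, km_per_month=1_000, cars=1):
    """Estimate yearly telematics data points using the slide's assumptions."""
    points_per_month = points_per_km * km_per_month   # 12,000 per car per month
    points_per_car_year = points_per_month * 12       # 144,000 per car per year
    return points_per_car_year * cars


per_car = telematics_points_per_year()
fleet = telematics_points_per_year(cars=1_000)
print(f"{per_car:,} points per car per year")
print(f"{fleet:,} points per year for 1,000 cars")
```

Changing `points_per_km` or `km_per_month` shows how quickly the volume scales with sampling rate and fleet size.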
The ANZ Heavy Traffic Index comprises flows
of vehicles weighing more than 3.5 tonnes
(primarily trucks) on 11 selected roads
around NZ. It is contemporaneous with GDP
growth.
The ANZ Light Traffic Index is made up of
light or total traffic flows (primarily cars and
vans) on 10 selected roads around the
country. It gives a six month lead on GDP
growth in normal circumstances (but cannot
predict sudden adverse events such as the
Global Financial Crisis).
http://www.anz.co.nz/about-us/economic-markets-research/truckometer/
ANZ TRUCKOMETER
http://tacocopter.com/
New Sources of Information (Big data) : Social Media + Internet of Things
Innovations
Gridded Data Sources
Variety of Data Types & Big Data Challenge
1. Astronomical
2. Documents
3. Earthquake
4. Email
5. Environmental sensors
6. Fingerprints
7. Health (personal) Images
8. Graph data (social network)
9. Location
10. Marine
11. Particle accelerator
12. Satellite
13. Scanned survey data
14. Sound
15. Text
16. Transactions
17. Video
Big Data consists of extensive datasets, primarily in the characteristics of
volume, variety, velocity, and/or variability, that require a scalable
architecture for efficient storage, manipulation, and analysis.
Computational portability is the movement of the computation to the location of the data.
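As a rough illustration of computational portability, the sketch below sends a small function to each data partition and brings back only a compact summary, instead of copying the raw records to one place. Everything here is hypothetical: the node names, the event records and the helper functions are invented for the example, and the "nodes" are simulated in a single Python process rather than on a real cluster.

```python
from collections import Counter

# Hypothetical partitions: each "node" holds its own slice of the event data.
partitions = {
    "node-a": ["click", "view", "click"],
    "node-b": ["view", "view", "purchase"],
    "node-c": ["click", "purchase"],
}


def count_events(records):
    """The computation shipped to a node: summarise records where they live."""
    return Counter(records)


def run_on_nodes(func, data_by_node):
    """Apply func to every partition and collect only the small summaries."""
    return [func(records) for records in data_by_node.values()]


def merge(summaries):
    """Combine the per-node summaries into one overall result."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total


totals = merge(run_on_nodes(count_events, partitions))
print(dict(totals))
```

Only the per-node counts travel back to the caller, which is the point of moving the computation rather than the data.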
Hadoop Configurations (Single and Multi-Rack)
Adapted from: http://stackiq.com/
Cluster manager e.g. Apache Ambari, Apache Mesos, or Rocks
A configuration of 18 data nodes with 3 TB drives represents 648 TB of raw storage
With the standard HDFS replication factor of 3, this gives 216 TB of usable storage
Name/secondary/data nodes – 6 core 96 GB
Management node – 4 core 16 GB
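The storage figures above follow from simple arithmetic, sketched here in Python. Note one assumption: the slide does not state the number of drives per node, so the value of 12 is inferred from 648 TB / (18 nodes × 3 TB). The function name is illustrative, and the calculation ignores space reserved for the operating system or temporary files.

```python
def hdfs_capacity_tb(data_nodes, drives_per_node, drive_tb, replication=3):
    """Return (raw, usable) storage in TB for a simple HDFS cluster."""
    raw = data_nodes * drives_per_node * drive_tb
    usable = raw / replication  # every block is stored `replication` times
    return raw, usable


# drives_per_node=12 is inferred from the slide's totals, not stated on it.
raw, usable = hdfs_capacity_tb(data_nodes=18, drives_per_node=12, drive_tb=3)
print(f"raw: {raw} TB, usable: {usable:.0f} TB")
```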
Traditional Computing Paradigm & Machine Engineering
[Diagram: traditional computing: Data + Program → Computer → Output; machine learning: Data + Output → Computer → Algorithms]
Machine learning is a scientific discipline that deals with the construction
and study of algorithms that can learn from data. Such algorithms operate
by building a model based on inputs and using that to make predictions or
decisions, rather than following only explicitly programmed instructions.
http://en.wikipedia.org/wiki/Machine_learning
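A minimal sketch of the idea in the definition above: rather than hand-coding a rule, the program below builds a model (a straight line) from example inputs and outputs, then uses it to predict. Ordinary least squares is just one simple choice of learning algorithm, and the training data is made up for illustration.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from examples using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the learned model to make a prediction for a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data that happens to follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)
print(model, predict(model, 5.0))
```

Nothing in the code states the rule "double it and add one"; the relationship is recovered from the data alone.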
8 Steps Towards Building the Data Centric Business
1. Put digital service (Vargo & Lusch) at the centre of the business, blurring the distinction with
physical products via sensors and apps
2. Identify data and monetisation opportunities using the business model canvas
3. Select unique sources of data to help drive innovation
4. Use data to drive interactions and customer experiences
5. Understand the data lifecycle from creation to storage
6. Extract value from data (economic or social)
7. Review patterns of big data businesses
8. Get on top of big data technology trends and analytics software
Netflix – A Picture of A Data Driven Company
• ~75 million users
• 8.5 million events per second
• Zero loss?
• 550 billion events per day
• Hundreds of event types
• 1.3 PB/day
• 21 GB/sec (peak)
• 37% of peak US internet bandwidth
• Operates on Amazon Web Services
Source: http://techblog.netflix.com/2016/02/evolution-of-netflix-data-pipeline.html
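To put the Netflix figures above on a common scale, the sketch below converts the daily totals into average per-second rates and compares them with the quoted peaks. The derived numbers are not from the source; they are simple arithmetic on the slide's figures, assuming decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes).

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted on the slide (decimal units assumed).
events_per_day = 550e9
bytes_per_day = 1.3e15        # 1.3 PB/day
peak_events_per_sec = 8.5e6
peak_bytes_per_sec = 21e9     # 21 GB/sec

# Derived values (not stated in the source).
avg_events_per_sec = events_per_day / SECONDS_PER_DAY
avg_bytes_per_sec = bytes_per_day / SECONDS_PER_DAY
avg_event_bytes = bytes_per_day / events_per_day
peak_to_avg_events = peak_events_per_sec / avg_events_per_sec

print(f"average events/sec: {avg_events_per_sec / 1e6:.2f} million")
print(f"average throughput: {avg_bytes_per_sec / 1e9:.1f} GB/sec")
print(f"average event size: {avg_event_bytes / 1e3:.1f} KB")
print(f"peak-to-average event rate: {peak_to_avg_events:.2f}x")
```

On these figures the average event rate is below the quoted 8.5 million per second, which suggests that number is a peak-hour rate like the 21 GB/sec figure.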
• Next generation radio telescope
• 100x more sensitive & 1,000,000x faster
• 5 square km of dish over 3000 km
• Two sites: Western Australia & Karoo Desert RSA
• World's most ambitious IT Project
• First real exascale ready application
• Largest global big-data challenge
• SKA SDP exascale systems:
• 100,000 nodes
• 800 cabinets
• consume 20 MW
• Expected failure rates of 300 nodes per week
Square Kilometre Array
http://www.ska.gov.au/
The Future of the Professions
(Susskind & Susskind 2015)
– Tax and audit work replaced by computer assisted techniques
– Technology automating and innovating
– Accounting work reconfiguring
– New business models
– Move from bespoke to “off the peg”
– Mastery of data with new tools and techniques - Big Data
– Diversification
– Shift to proactivity from reactivity
– Professionals replaced by less expert people and high performing systems
– Post-professional society expertise available online
The Future of the Professions: How Technology Will Transform the Work of Human Experts, Richard Susskind and Daniel Susskind (2015)
Adapted from: The Future of the Professions: How Technology Will Transform the Work of Human Experts, Richard Susskind and Daniel Susskind (2015)
A HEALTH KNOWLEDGE COMMONS
New knowledge used differently: people managing their own health information, personalising their care and creating new kinds of health knowledge
https://www.nesta.org.uk/sites/default/files/the-nhs-in-2030.pdf
We’re sitting on a big data time bomb
Catastrophic loss of transparency. Few IT professionals have experience managing big data platforms at scale, a situation that has created a massive skills shortage in the industry. By 2018, U.S. companies will be short 1.5 million managers able to make data-based decisions. A recent McKinsey Quarterly report estimates that, in order to close this gap, companies would need to spend 50 percent of their data and analytics budget on training frontline managers; it also notes that few companies realize this need.
Source: CAMERON SIM, CREWSPARK, OCTOBER 24, 2015
http://venturebeat.com/2015/10/24/were-sitting-on-a-big-data-time-bomb/
Australia/NZ needs “30,000 data savvy managers by 2018”
• This statement derives from the McKinsey (2011) study: “a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States requires a talent pool of 140,000-190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”.
• Taking 2% of the US economy as a rule of thumb (2% of 1.5 million is 30,000), in 2018 Australia will require another 30,000 managers or analysts. However, the shortage commences well before 2018. These numbers do not account for training managers or analysts for overseas destinations.
• Another 2011 study, from EMC Corporation, interviewed nearly 500 data science and business intelligence professionals globally. Two-thirds of the informants believe demand will outpace supply and 30% from disciplines outside of computer science. Additionally, the study found the biggest obstacle to data science to be education and training.
• A 2012 study, “Data Equity: Unlocking the value of big data”, commissioned by SAS UK and conducted by the Centre for Economics and Business Research, an independent business research consultancy, found that unlocking big data would add another 58,000 jobs to the UK economy (2012-2017).
• Gartner (2012) estimates that by 2015, 4.4 million IT jobs will be created globally to support big data, including 1.9 million jobs in the United States alone.
• Closer to home, the 2013 Hudson study “Tackling the Big Data Challenge” found that 78% of the Australian research informants “believe organisations do not have the skills and competencies to successfully undertake a big data project”.
• Building on the McKinsey (2011), Gartner (2012) and Hudson (2013) estimates, Australia and the world require 3 distinct but related skill sets. Most specifically, demand is very strong for data savvy managers conversant with big data practice.
Australian Demand 2018:
- Deep skills 3,800
- Technology 8,000
- Data Savvy 30,000
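The 2% rule of thumb behind these Australian figures can be sketched as follows. The US inputs are the McKinsey (2011) numbers quoted earlier in the deck; the function name is illustrative. Under this rule the 3,800 deep-skills figure appears to correspond to the upper bound of the US range (190,000). The 8,000 technology figure has no US counterpart in the slides, so it is not reproduced here.

```python
RULE_OF_THUMB_PERCENT = 2  # Australia taken as roughly 2% of the US figure


def scale_to_australia(us_figure, percent=RULE_OF_THUMB_PERCENT):
    """Scale a US demand estimate using the slide's rule of thumb."""
    return us_figure * percent // 100  # integer arithmetic avoids float rounding


us_data_savvy = 1_500_000                       # managers and analysts
us_deep_low, us_deep_high = 140_000, 190_000    # deep analytical skills

print(scale_to_australia(us_data_savvy))        # data savvy managers
print(scale_to_australia(us_deep_low), scale_to_australia(us_deep_high))
```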
India’s high demand for big data workers contrasts
with scarcity of skilled talent
The talent deficit is on two fronts, said Velamakanni: data
scientists who can perform analytics, and analytics consultants
who can understand and use the data. The first, big data
engineers and scientists, are extremely scarce. "In the second
category, we need better quality, and India is going to be short
of a million data consultants soon," he said.
Source: India's high demand for big data workers contrasts with scarcity of skilled talent, Saritha Rai, June 2014, http://www.techrepublic.com/article/indias-high-demand-for-big-data-workers-contrasts-with-scarcity-of-skilled-talent/
Google Trends Worldwide, Australia and New Zealand - Accounting + Analytics, January 2004 - September 2015
[Chart panels: Worldwide, Australia, New Zealand]
‘The Predictive Accountant’ and Data Centric Practice
1. Data savvy
3. Focus shifts from being reactive to proactive and predictive
4. Leverages accounting data and predictive analytics software to find patterns and insights in data
5. Uses the tools and dashboards to predict client scenarios ahead of time: maximising opportunity, limiting risks and proactively advising.
6. Accountants benefit from analytics by adding value when connecting client challenges and opportunities to identified customer patterns. Sharing these insights delivers more value in accounting conversations and helps tackle the real business problems facing clients.
The Predictive Accountant Portal: Democratisation of Data Science
The Predictive Accountant Data Sources
– Predictive Analytics: Excel style dashboard
– Connected Practice: Digital Marketing / eNewsletters / Integrated business tools software
– Apps Marketplace: Accounting Analytic Apps
– Education: Analytic Training
“I had come to an entirely erroneous
conclusion which shows, my dear
Watson, how dangerous it is to
reason from insufficient data.”
The Adventure of the Speckled Band
