Big Data Analytics
David Strom
david@strom.com
Twitter: @dstrom
July 2015 Wash Univ.
Download this here:
http://slideshare.net/davidstrom
Three necessary skills
• Strategic data planning. Understand how data
is the new raw material for any modern
business.
• Analytical skills. What is the data trying to tell
you?
• Technology skills. Embrace the technology
and make it a key part of your skill set.
My background
Editorial management positions:
Some examples
• Tracking Twitter airline sentiment
• Using car-generated GPS data
• Analyzing maps
• What you can glean from your log files
• How P&G does it big-time
• Betting on Big Data with IBM
• The infamous Enron email data set
• Trends from AP’s news archive
Local Big Data Meetups
Thanks for your ideas!
• Copies of this presentation:
http://slideshare.net/davidstrom
• My blog: http://strominator.com
• Follow me on Twitter: @dstrom
• Old school: david@strom.com



Editor's Notes

  • #2: V4 J School additions Use older Stampede deck for URL sources
  • #3: http://www.readwriteweb.com/cloud/2012/02/strata-2012-3-essential-skills.php Diego Saenz of Data Driven CEO
  • #5: So let's talk this morning about how Big Data comes from all corners of the globe. While it may not be evil, there are some fascinating examples of how companies are using it today. I'll review some of these case studies, pulled from articles that my colleagues in the IT trade press and I have been writing over the past several months.
  • #6: As you know, the U.S. Department of Transportation collects monthly on-time statistics for each of the major airlines. But a better method comes from Jeffrey Breen of Cambridge Aviation Research. He put this together to show sentiment analysis using the immediacy and accessibility of Twitter. It provides a real-time glimpse into consumers' frustration, using this flowchart with R and various other data collection tools to score the tweets, summarize them for each airline, and compare the results with what the federal government provides.
  • #7: Your car has become a data hub, with USB ports, an SD card reader, Bluetooth connections to your phone and even a mobile Wi-Fi hotspot. This next picture is a shot of the latest MyFord Touch dashboard that can be found in many of Ford's cars. It provides all sorts of controls for the music you listen to and the climate inside your car, plus a connection to your phone so you can dial from your address book. Currently, Ford collects and aggregates data from the 4 million vehicles that use in-car sensing and remote app management software to create a virtuous cycle of information. The data allows Ford engineers to glean information on a range of issues, from how drivers are using their vehicles, to the driving environment, to electromagnetic forces affecting the vehicle, and feedback on other road conditions that could help them improve the quality, safety, fuel economy and emissions of the vehicle. Drivers willing to share how many miles they've traveled could get discounts of between 10 and 40 percent in exchange for providing State Farm with a more accurate picture of their vehicle-use habits, which it obtains by directly accessing the Sync telematics systems in the cars electronically.
  • #8: Using Tableau and OpenStreetMap data, you can spot trends in Austin’s teacher turnover. While it is a city-wide problem, it is particularly acute in the poorer areas of the east side.
  • #9: But Big Data can also be used in corporate situations that are fairly mundane. Here we are looking at a hospital autoclave, which is used for sterilizing instruments. This is just one type of industrial equipment that Axeda is working with other companies to rig with sensors and cellular connections. Each of these devices has an IP address and an Internet connection, so its use can be monitored remotely and its supply, maintenance and management can all be optimized without anyone having to go and look at the machine itself. "Typically engineers would find logs through customer tickets and it would take months to find trends based on call center traffic." You can collect data about uptime, need for repairs, machine run completion and detergent levels into a smartphone app that hospital employees can use.
  • #10: Big Data is also being used in some of the world's largest corporations. We are looking at Procter & Gamble’s Business Sphere big data situation room in its Cincinnati HQ. A big data analyst drives these large screens that display data visualizations on sales, market share, ad spending and the like, so everyone in the meeting is seeing the same information based on 4 billion daily transactions of P&G products. P&G isn’t after new data types; it still wants to share and analyze point-of-sale, inventory, ad spending, and shipment data. What’s new is the higher frequency and speed at which P&G gets that data, and the finer granularity. Even with all this gear, P&G has about two-thirds of the real-time data it needs.
  • #11: Let's move on to some of the Big Data rock stars that I have interviewed and really enjoy hearing from. Jeff Jonas is a data scientist who now works for IBM. One of his jobs was designing the casino security systems in Las Vegas, where he currently lives. He worked for the surveillance intelligence group of several casinos and automated various manual processes, adding facial recognition software that was key to slowing down the MIT card counting group. "We built [another] system to immediately identify risk in real time so they could get these people out of the casino quickly." This software is still offered by IBM as its InfoSphere Identity Insight event processing and identity tracking technology.
  • #12: Mason and others have mentioned the now iconic Enron email archive, which has since passed into the public domain. A number of big data researchers use it to test their email algorithms, and it is available from a number of online academic websites. It is a collection of actual emails that forms the basis of many anti-spam programs these days, which is ironic given that the emails have outlasted the company where everyone once worked.
  • #13: Here we are looking at a facsimile of an old newspaper – you remember newspapers, right? Ironically, it was called the New York Mirror. And while this and so many other newspapers have bitten the dust, one operation that is still in business is The Associated Press. If you are looking for large content repositories, you probably can't get much larger than the article archive of the Associated Press. They have launched a content analysis tool that is used to search the millions of articles in their archives to create custom archive products for their customers. The project makes use of a solution from MarkLogic, a major Big Data enabler that is used by many different kinds of publishers, such as LexisNexis, for this type of purpose. The AP didn't start out by using the MarkLogic solution, but tried to implement a more traditional relational database structure only to run into problems. Their archives are in XML, which made it difficult to design the right kind of data structures. Plus, they didn't have a consistent metadata collection across the archives. The MarkLogic implementation took 16 weeks from start to finish and was the first time that the AP had made use of MarkLogic's services. It enables them to run complex Boolean searches across millions of articles in their content archive and get back precise returns in seconds or minutes instead of days or weeks. This much quicker response time is already transforming their B2B product offerings and helps them to manage searching for unstructured content in near real-time. Users can query for particular keywords, and the AP can use the search query traffic to see trending topics and deliver article collections to particular B2B customers. For example, they could create references on a particular subject or moment in time.
  • #14: One of my favorite Big Data hotbeds is Kaggle. It routinely hosts various big data contests, and this one, which concluded last month, was a way for Facebook to evaluate prospective employees. More than 400 people submitted entries.
  • #15: Here are some of the local meetups if you want to learn more about Big Data.
  • #16: Thanks everyone for listening to me and good luck with your own Big Data explorations.
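
The airline sentiment example in note #6 describes scoring tweets and summarizing the scores for each airline. The notes say the original work used R; the Python below is only a minimal sketch of that general idea, not Breen's actual code. The word lists, the airline names, and the sample tweets are all made up for illustration, and a real analysis would use a much larger opinion lexicon and live Twitter data.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"great", "love", "thanks", "friendly", "smooth"}
NEGATIVE = {"delayed", "lost", "rude", "cancelled", "worst"}


def score_tweet(text):
    """Score = number of positive words minus number of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def summarize_by_airline(tweets):
    """Average the tweet scores for each airline.

    `tweets` is an iterable of (airline, text) pairs.
    """
    scores = defaultdict(list)
    for airline, text in tweets:
        scores[airline].append(score_tweet(text))
    return {airline: sum(s) / len(s) for airline, s in scores.items()}


# Made-up sample data; "AirA" and "AirB" are placeholder names.
sample = [
    ("AirA", "Love the friendly crew, thanks!"),
    ("AirA", "Flight delayed again"),
    ("AirB", "Worst airline, rude staff and lost my bag"),
]
summary = summarize_by_airline(sample)
print(summary)
```

The per-airline averages could then be set beside the on-time statistics that the notes mention to see whether the two tell the same story.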