Big Data Analytics
David Strom
david@strom.com
Twitter: @dstrom
July 2015 Wash Univ.
Download this here:
http://slideshare.net/davidstrom
Three necessary skills
• Strategic data planning. Understand how data
is the new raw material for any modern
business.
• Analytical skills. What is the data trying to tell
you?
• Technology skills. Embrace the technology
and make it a key part of your skill set.
My background
Editorial management positions:
Some examples
• Tracking Twitter airline sentiment
• Using car-generated GPS data
• Analyzing maps
• What you can glean from your log files
• How P&G does it big-time
• Betting on Big Data with IBM
• The infamous Enron email data set
• Trends from AP’s news archive
Local Big Data Meetups
Thanks for your ideas!
• Copies of this presentation:
http://slideshare.net/davidstrom
• My blog: http://strominator.com
• Follow me on Twitter: @dstrom
• Old school: david@strom.com


Editor's Notes

  • #2: V4 J School additions Use older Stampede deck for URL sources
  • #3: http://www.readwriteweb.com/cloud/2012/02/strata-2012-3-essential-skills.php Diego Saenz of Data Driven CEO
  • #5: This morning, let's talk about how Big Data comes from all corners of the globe. While it may not be evil, there are some fascinating examples of how companies are using it today. I'll review several case studies drawn from articles that my colleagues in the IT trade press and I have written over the past several months.
  • #6: As you know, the U.S. Department of Transportation collects monthly on-time statistics for each of the major airlines. A better method comes from Jeffrey Breen of Cambridge Aviation Research, who put this together to show sentiment analysis that draws on the immediacy and accessibility of Twitter. He provides a real-time glimpse into consumers’ frustration, using the workflow in this flowchart, R, and various other data collection tools to score the tweets, summarize the scores for each airline, and compare them with what the federal government provides.
  • #7: Your car has become a data hub, with USB ports, an SD card reader, Bluetooth connections to your phone, and even a mobile Wi-Fi hotspot. This next picture is a shot of the latest MyFord Touch dashboard, which can be found in many Ford cars. It provides all sorts of controls over what music you listen to and the climate inside your car, plus a connection to your phone so you can dial from your address book. Currently, Ford collects and aggregates data from the 4 million vehicles that use in-car sensing and remote app management software to create a virtuous cycle of information. The data allows Ford engineers to glean information on a range of issues, from how drivers are using their vehicles, to the driving environment, to electromagnetic forces affecting the vehicle, to feedback on other road conditions, all of which could help them improve the quality, safety, fuel economy, and emissions of the vehicle. Insurers can use this data too: drivers willing to share how many miles they’ve traveled could get discounts of between 10 and 40 percent from State Farm, which gets a more accurate picture of their vehicle-use habits by accessing the Sync telematics system in the car directly.
  • #8: Using Tableau and OpenStreetMap data, you can spot trends in Austin’s teacher turnover. While it is a city-wide problem, it is particularly acute in the poorer areas of the east side.
  • #9: Big Data can also be used in fairly mundane corporate situations. Here we are looking at a hospital autoclave, which is used for sterilizing instruments. It is just one type of industrial equipment among the products that Axeda is working with other companies to rig with sensors and cellular connections. Each of these devices has an IP address and an Internet connection, so its use can be monitored remotely and its supply, maintenance, and management can be optimized without anyone having to go and look at the machine itself. “Typically engineers would find logs through customer tickets and it would take months to find trends based on call center traffic.” Now you can collect data about uptime, need for repairs, machine run completion, and detergent levels into a smartphone app that hospital employees can use.
  • #10: Big Data is also being used in some of the world's largest corporations. We are looking at Procter & Gamble’s Business Sphere big data situation room in its Cincinnati headquarters. A big data analyst drives these large screens, which display data visualizations on sales, market share, ad spending, and the like, so everyone in the meeting sees the same information, based on 4 billion daily transactions of P&G products. P&G isn’t after new data types; it still wants to share and analyze point-of-sale, inventory, ad spending, and shipment data. What’s new is the higher frequency and speed at which P&G gets that data, and the finer granularity. Even with all this gear, P&G has only about two-thirds of the real-time data it needs.
  • #11: Let's move on to some of the Big Data rock stars whom I have interviewed and really enjoy hearing from. Jeff Jonas is a data scientist who now works for IBM. One of his jobs was designing casino security systems in Las Vegas, where he currently lives. He worked for the surveillance intelligence groups of several casinos and automated various manual processes, adding facial recognition software that was key to slowing down the MIT card-counting group. "We built [another] system to immediately identify risk in real time so they could get these people out of the casino quickly." This software is still offered by IBM as its InfoSphere Identity Insight event processing and identity tracking technology.
  • #12: Mason and others have mentioned the now-iconic Enron email archive, which has since passed into the public domain. A number of big data researchers use it to test their email algorithms, and it is available from a number of academic websites. It is a collection of actual emails that forms the basis of many anti-spam programs these days, which is ironic given that the emails have outlasted the company where everyone once worked.
  • #13: Here we are looking at a facsimile of an old newspaper. You remember newspapers, right? Ironically, it was called the New York Mirror. While this and so many other newspapers have bitten the dust, one operation that is still in business is The Associated Press. If you are looking for large content repositories, you probably can't get much larger than the AP's article archive. The AP has launched a content analysis tool that searches the millions of articles in its archives to create custom archive products for its customers. The project makes use of a solution from MarkLogic, a major Big Data enabler that many different kinds of publishers, such as LexisNexis, use for this purpose. The AP didn't start out with MarkLogic; it first tried to implement a more traditional relational database structure, only to run into problems. Its archives are in XML, which made it difficult to design the right kind of data structures, and it didn't have consistent metadata across the archives. The MarkLogic implementation took 16 weeks from start to finish and was the first time that the AP had used MarkLogic's services. It enables the AP to run complex Boolean searches across millions of articles in its content archive and get back precise results in seconds or minutes instead of days or weeks. This much quicker response time is already transforming its B2B product offerings and helps it search unstructured content in near real time. Users can query for particular keywords, and the AP can use the search query traffic to see trending topics and deliver article collections to particular B2B customers. For example, it could create references on a particular subject or moment in time.
  • #14: One of my favorite Big Data hotbeds is Kaggle. It routinely hosts various big data contests, and this one, which concluded last month, was a way for Facebook to evaluate prospective employees. More than 400 people submitted entries.
  • #15: Here are some of the local meetups if you want to learn more about Big Data.
  • #16: Thanks everyone for listening to me and good luck with your own Big Data explorations.
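
Note #6 describes scoring tweets and summarizing the scores for each airline. The sketch below shows one simple way that kind of scoring could work: count the positive words in a tweet, subtract the negative ones, and average the result per airline. It is an illustration only, written in Python rather than R, and it is not Breen's actual code. The word lists, the airline names, and the sample tweets are all made up for the example; a real analysis would presumably use a much larger opinion lexicon and live Twitter data.

```python
import re
from collections import defaultdict

# Tiny illustrative word lists (assumption: a real analysis would load a
# full opinion lexicon with thousands of entries).
POSITIVE = {"great", "good", "love", "thanks", "friendly", "smooth"}
NEGATIVE = {"delayed", "lost", "rude", "cancelled", "worst", "stuck"}


def score_tweet(text):
    """Return (number of positive words) - (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives


def summarize(tweets):
    """Average the tweet scores for each airline.

    `tweets` is an iterable of (airline, text) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # airline -> [score sum, tweet count]
    for airline, text in tweets:
        totals[airline][0] += score_tweet(text)
        totals[airline][1] += 1
    return {airline: total / count for airline, (total, count) in totals.items()}


# Hypothetical sample data, not real tweets or real airlines.
sample = [
    ("AirlineA", "Flight delayed again and the agent was rude"),
    ("AirlineA", "Thanks for the smooth flight"),
    ("AirlineB", "Love the friendly crew, great service"),
]

if __name__ == "__main__":
    for airline, average in sorted(summarize(sample).items()):
        print(f"{airline}: average sentiment {average:+.2f}")
```

Word counting like this misses sarcasm and negation ("not good"), so the per-airline averages are best read as a rough signal to set beside the official on-time statistics, not as a replacement for them.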