Speakers
Simon Elliston Ball – Solutions Architect, Hortonworks
Adam Morton – Enterprise Data Architect, Admiral Group plc
• Over 10 years' experience in Data Warehousing, Business Intelligence and
Analytics
• Working at Admiral for the past 2 years delivering a greenfield Enterprise Data
Warehouse as part of an overall Data Architecture modernisation programme
The Admiral Group
Admiral Group has grown from a small start-up to one of the largest car
insurance providers in the UK, with a presence in seven countries.
Our strategy is simple: To continue to progress in the UK Car Insurance market
whilst taking what we do well to new markets and products: keep doing what
we’re doing and do it better year after year.
Admiral – International Operations
Admiral employs more than 7,000 people at its offices in the UK, Spain, Italy, France,
USA, Canada and India.
"People who like what they do, do it better"
R&D at Admiral
• Strong history of using data to drive innovation which needs to be continued
• New function aimed at testing and learning through technology
• Time-boxed iterative efforts of no more than 4-6 weeks
• Fail-fast approach; success or failure can end the PoC early
• Understand ‘Big Data’ and trial Hadoop ecosystem projects
Why Telematics?
• Scalability – A product with large potential and potentially huge volumes
• Timeliness – Data and scoring were processed in batch; how quickly can this be done?
• Granularity – Suppliers provide aggregated data; could map matching be improved?
• Event Notification – Can we respond quickly to near-real-time (NRT) events in the data?
• Data Enrichment – Opportunity to uncover further insights by integrating with interesting data sources
Objectives of the Telematics PoC
• Scalability – Prove that data storage and high-performance analytics can be
accomplished cost-effectively on large data sets
• Timeliness – Reduce scoring time
• Data Enrichment
• NRT data processing – acting on events such as proximity to an airport
• Improve stability and flexibility
• Test the viability of a cloud solution
• Data Visualisation
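The NRT objective above mentions acting on events such as proximity to an airport. A minimal sketch of such a check follows, assuming a simple radius test around a reference point. The airport list, the approximate coordinates and the 2 km radius are illustrative assumptions, not details from the PoC.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical points of interest; coordinates are approximate.
AIRPORTS = {
    "CWL": (51.3967, -3.3433),  # Cardiff Airport
    "LHR": (51.4700, -0.4543),  # London Heathrow
}


def airports_within(lat, lon, radius_km=2.0):
    """Return the codes of airports whose reference point lies within radius_km."""
    return sorted(
        code
        for code, (alat, alon) in AIRPORTS.items()
        if haversine_km(lat, lon, alat, alon) <= radius_km
    )
```

In a streaming job, a function like `airports_within` would be applied to each incoming GPS reading, and a non-empty result would trigger the event.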
Technical Challenges – Networking and Security
• Privacy Sensitive
• Third Party Sources
• Real-time data
8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
There’s a VPN, it will be fine!
[Network diagram. Labels: Admiral vNET; Third Party vNET; Telematics Provider DC; External Users; Internal Users]
Kafka SSL
[Network diagram. Labels: Admiral vNET; Telematics Provider DC; External Users; Internal Users; K; SSL]
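The slide only names Kafka over SSL. As a rough sketch, a client outside the Admiral vNET might be configured along the following lines. The property names are those used by librdkafka-based clients such as confluent-kafka; the broker address and certificate paths are hypothetical, and the PoC's actual client library is not stated in the slides.

```python
def kafka_ssl_config(bootstrap_servers, ca_path, cert_path=None, key_path=None):
    """Build a client configuration dict that encrypts traffic to the brokers.

    If both cert_path and key_path are given, the client also presents its
    own certificate so the broker can authenticate it (mutual TLS).
    """
    config = {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SSL",
        "ssl.ca.location": ca_path,  # CA used to verify the broker certificate
    }
    if cert_path and key_path:
        config["ssl.certificate.location"] = cert_path
        config["ssl.key.location"] = key_path
    return config


# Hypothetical broker address and file locations, for illustration only.
config = kafka_ssl_config(
    "kafka.example.internal:9093",
    "/etc/kafka/ca.pem",
    cert_path="/etc/kafka/client.pem",
    key_path="/etc/kafka/client.key",
)
# With confluent-kafka installed, this dict could be passed to Producer(config).
```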
Ingest with NiFi
[Network diagram. Labels: Admiral vNET; Telematics Provider DC; External Users; Internal Users; K; HDF; Other Providers (shown twice)]
Real-time Scoring
• Clean-up done in NiFi
– Basic data correctness
– Format changes
• Fed to Kafka
• Spark Streaming
– Near-real-time requirement
– Mixing Scala RDD and DataFrame code
– Integrating with map matching library
• Output fed into Kafka
– Kafka to WebSockets bridge for real-time visualization
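The clean-up step above covers basic data correctness and format changes. In the PoC this was done in NiFi; the plain-Python sketch below only illustrates the kind of logic involved. The CSV layout, field names and validity ranges are assumptions made for the example.

```python
import json
from datetime import datetime, timezone

# Hypothetical input layout: one CSV line per GPS reading.
FIELDS = ("device_id", "timestamp", "lat", "lon", "speed_kmh")


def clean_record(csv_line):
    """Return a JSON string for a valid reading, or None if a basic check fails."""
    parts = [p.strip() for p in csv_line.split(",")]
    if len(parts) != len(FIELDS):
        return None
    raw = dict(zip(FIELDS, parts))
    try:
        # Format change: epoch seconds become an ISO 8601 UTC timestamp.
        ts = datetime.fromtimestamp(int(raw["timestamp"]), tz=timezone.utc)
        lat, lon = float(raw["lat"]), float(raw["lon"])
        speed = float(raw["speed_kmh"])
    except (ValueError, OverflowError, OSError):
        return None
    # Correctness checks: non-empty device, plausible coordinates and speed.
    if not raw["device_id"]:
        return None
    if not (-90 <= lat <= 90 and -180 <= lon <= 180) or not speed >= 0:
        return None
    return json.dumps(
        {
            "device_id": raw["device_id"],
            "timestamp": ts.isoformat(),
            "lat": lat,
            "lon": lon,
            "speed_kmh": speed,
        },
        sort_keys=True,
    )
```

Records that return `None` would be dropped or routed to an error queue before anything is published to Kafka.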
Batch Scoring
• More Spark!
• Zeppelin for ease of use, interaction
• Productionized as batch Spark jobs
SAS on Hive
• Spark as ETL engine
• Hive for large-scale processing
• SAS connector using Hive
• ORC as a file format
– Significantly smaller than JSON
– So much faster to process
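One reason a typed, columnar format tends to be smaller than JSON is that field names are not repeated per record and numbers are stored in binary. The toy comparison below uses only the Python standard library and is not ORC itself (real ORC adds column encodings, compression and indexes); the record layout is a made-up example.

```python
import json
import struct


def make_readings(n):
    """Generate deterministic, hypothetical telematics readings."""
    return [
        {
            "timestamp": 1456790400 + i,
            "lat": 51.48 + i * 1e-5,
            "lon": -3.18 + i * 1e-5,
            "speed_kmh": float(i % 120),
        }
        for i in range(n)
    ]


def as_json_lines(rows):
    """Row-oriented text: every record repeats every field name."""
    return "\n".join(json.dumps(r) for r in rows).encode("utf-8")


def as_columns(rows):
    """Column-oriented binary: one contiguous int64 or float64 array per field."""
    n = len(rows)
    out = struct.pack(f"<{n}q", *(r["timestamp"] for r in rows))
    for key in ("lat", "lon", "speed_kmh"):
        out += struct.pack(f"<{n}d", *(r[key] for r in rows))
    return out


rows = make_readings(10_000)
json_bytes = as_json_lines(rows)
col_bytes = as_columns(rows)
print(len(json_bytes), len(col_bytes))
```

The columnar layout here is a fixed 32 bytes per record, while each JSON line carries the four field names plus the decimal text of every value.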
Technical Challenges – Map Matching
• GPS data is messy
• Open Data sources based on roads
• Snapping to the nearest road is fast, but not very accurate
• Hidden Markov Models: know where you’re going, and where you’ve been.
• Open source to the rescue…
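The contrast between nearest-road snapping and a Hidden Markov Model can be shown with a toy example. The sketch below is not Barefoot's implementation: it assumes planar coordinates in metres, roads as single line segments, a Gaussian-style emission score on distance, and hand-picked transition weights (all illustrative assumptions). Viterbi decoding then picks the most plausible road sequence.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    t = 0.0
    if seg_len_sq > 0:
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_road(p, roads):
    """Baseline: snap each point independently to the closest road."""
    return min(roads, key=lambda r: point_segment_distance(p, *roads[r]))


def viterbi_match(points, roads, connected, sigma=10.0):
    """Most likely road sequence given noisy points and road connectivity."""
    if not points:
        return []
    names = list(roads)

    def emission(p, r):  # log-score: closer roads are more likely
        d = point_segment_distance(p, *roads[r])
        return -(d * d) / (2 * sigma * sigma)

    def transition(r1, r2):  # log of hand-picked, unnormalised weights
        if r1 == r2:
            return 0.0
        return math.log(0.5 if frozenset((r1, r2)) in connected else 0.05)

    score = {r: emission(points[0], r) for r in names}
    back = []
    for p in points[1:]:
        new_score, pointers = {}, {}
        for r in names:
            prev = max(names, key=lambda q: score[q] + transition(q, r))
            new_score[r] = score[prev] + transition(prev, r) + emission(p, r)
            pointers[r] = prev
        score = new_score
        back.append(pointers)
    path = [max(names, key=lambda r: score[r])]
    for pointers in reversed(back):  # follow back-pointers to the start
        path.append(pointers[path[-1]])
    return path[::-1]


# Two parallel, unconnected roads 20 m apart; the third GPS fix is noisy.
roads = {"A": ((0, 0), (100, 0)), "B": ((0, 20), (100, 20))}
points = [(10, 2), (30, 3), (50, 11), (70, 2), (90, 1)]
snapped = [nearest_road(p, roads) for p in points]
matched = viterbi_match(points, roads, connected=set())
```

Here `snapped` jumps to road B for the noisy third fix, while `matched` stays on road A because switching between unconnected roads is penalised.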
Barefoot – Map Matching
• https://github.com/bmwcarit/barefoot
• Docker based service
• PostGIS map server loaded from OSM data
• Serializable map, distributed in Spark
Next Steps
• Completing knowledge transfer workshops with Hortonworks
• How to move from a PoC to production-ready?
• Establishing an in-house R&D function
• Deciding on the tools and frameworks to use within a PoC
environment in the future
Admiral Group


Editor's Notes

  • #3: Launched in 1993, Admiral Group is an insurance company based in Cardiff in the UK. It has grown from a start-up to become a household name in UK car insurance. Historically the business model has been simple and straightforward: “keep doing what we’re doing and do it better year after year”. Admiral has a culture that encourages people to innovate and suggest new ways of working, whether through new products, processes or technology. All staff are shareholders and hold a small stake in the company.
  • #4: Admiral is also the youngest company in the FTSE 100, employing more than 7,000 staff worldwide. Our philosophy at Admiral is that people who like what they do, do it better, so we go out of our way to make coming to work here enjoyable. As a result, the Admiral Group is consistently voted among the top 5 best places to work in each location where it operates.