Consultancy – Pragmatic Analytics
Irish Centre for High End Computing
Dr. Eoin Brazil
www.ichec.ie/consultancy
Technology Transfer @ ICHEC
• Started just over eighteen months ago

• Core competencies include:
– Performance Optimization
– Data Mining/Analytics (e.g. Computational Finance)

• Consultancy
• Training (e.g. R - & TSA / & AC, CUDA, HPC, etc.)
SFI Enterprise Workshop - 25th July 2011

Visual Exploration

Example – Wine Vintage
• Hot, dry summers give higher prices in mature wines
• Château Pétrus 2000 ~$60,000 (liv-ex.com)
• Bordeaux Equation

• Wine quality = 12.145 + 0.00117 × Winter Rainfall + 0.0614 × Average Growing Season Temperature - 0.00386 × Harvest Rainfall
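
As a minimal sketch, the Bordeaux equation above can be evaluated directly. The coefficients come from the slide; the function name, the argument names, and the two sample vintages are hypothetical, and the slide does not state the units, so the inputs are only assumed to be on the scale the coefficients were fitted to.

```python
def bordeaux_quality(winter_rain, growing_temp, harvest_rain):
    """Evaluate the Bordeaux equation with the coefficients quoted on the slide.

    Units are not given on the slide, so the inputs are assumed to be on
    whatever scale the coefficients were fitted to.
    """
    return (12.145
            + 0.00117 * winter_rain
            + 0.0614 * growing_temp
            - 0.00386 * harvest_rain)


# Hypothetical vintages with made-up inputs, for illustration only.
hot_dry = bordeaux_quality(winter_rain=600, growing_temp=17.5, harvest_rain=50)
cool_wet = bordeaux_quality(winter_rain=600, growing_temp=15.5, harvest_rain=250)

print(f"hot, dry vintage:  {hot_dry:.4f}")
print(f"cool, wet vintage: {cool_wet:.4f}")
```

With identical winter rainfall, the warmer growing season and drier harvest score higher, which matches the "hot, dry summers" bullet.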

Financial services – Computational Finance

Real-World Constraints
• My application / workflow:
– Handle 2B+ transactions per day per site
– Less than 50 ms for end-to-end processing
– Need real-time detection of fraud
– Multiple coupled models in an ensemble
– Production platform is X
– Cannot misclassify a good client as a fraudster
– Data size is too large for my infrastructure
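
To put the first two constraints in perspective, here is a back-of-the-envelope sketch. It assumes transactions arrive evenly over the day, which real traffic will not do, so peak rates will be higher. The function names are hypothetical. The in-flight estimate uses Little's law (average number in the system = arrival rate × time in the system).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(transactions_per_day):
    """Average arrivals per second, assuming a perfectly even load."""
    return transactions_per_day / SECONDS_PER_DAY


def in_flight(rate_per_second, latency_seconds):
    """Little's law: average number of transactions being processed at once."""
    return rate_per_second * latency_seconds


rate = average_rate(2_000_000_000)   # 2B transactions per day per site
concurrent = in_flight(rate, 0.050)  # 50 ms end-to-end budget

print(f"average rate: {rate:,.0f} transactions/second")
print(f"in flight at 50 ms latency: {concurrent:,.0f} transactions")
```

Even under this optimistic even-load assumption, a site sees over 23,000 transactions per second with more than a thousand in flight at any moment.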

Are you ready for Big Data?
• Hadoop is x50+ slower on relational data, can be x1000+ slower on graph data
• Make sure you hone the tool first:
– MCMC x53 faster using Rcpp versus R
– Linear regression x8 using Eigen via R
– x15 BLAS/LAPACK with ICC flags and hardware in R
– Rmpi / multicore / MKL / pnmath / MR / gputools
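
The "hone the tool first" advice can be illustrated with a small timing harness. This is only a sketch of the idea in Python, not a reproduction of the R, Rcpp, or Eigen benchmarks quoted above. The function names and data size are hypothetical, and timings will vary by machine and interpreter version.

```python
import operator
import timeit


def sum_squares_loop(values):
    """Interpreted loop: every multiply and add goes through Python bytecode."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_squares_map(values):
    """Same result, with iteration and arithmetic pushed into built-ins."""
    return sum(map(operator.mul, values, values))


data = [float(i) for i in range(100_000)]

# Time both variants on the same input before considering a bigger platform.
t_loop = timeit.timeit(lambda: sum_squares_loop(data), number=5)
t_map = timeit.timeit(lambda: sum_squares_map(data), number=5)

print(f"loop: {t_loop:.4f} s, built-ins: {t_map:.4f} s, ratio: {t_loop / t_map:.2f}")
```

The point is the workflow rather than the specific ratio: measure the single-machine code path and tune it before deciding the problem needs a distributed framework.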

What are GPGPUs?
• Disruptive Innovation in Parallel Computing
– HPC from desktop to supercomputers (10-generation leap)

Typical Business Results
Domain | Result
Computational Finance | 1 or 8 cards (x121/x950) = do in 1 second what used to take 2/16 minutes, 10 generations of processor
Oil and Gas | Data processing = x2 – x6 (profiling at this stage), e.g. if a volume took 44 mins it could be done in 22 – 7 ½ mins
Life Sciences | Patient analytics, initial prototype for cardio-vascular disease detection (~72% accuracy), ongoing work
Telecomms | Fraud detection prototype for subscription fraud: detection (~99% accuracy), avoided predicting good clients as fraudsters*
Electronic Commerce | Demand forecasting & customer segmentation = using historic data to predict future demand (~90% accuracy) & identified valuable clients (~80% accuracy)
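
The runtime figures in the table follow from dividing a baseline by the speedup factor. The sketch below checks that arithmetic. The function names are hypothetical and the speedups are the ones quoted on the slide. Note that 44 / 6 is about 7.3 minutes, which the slide rounds to 7 ½.

```python
def runtime_after_speedup(baseline, speedup):
    """New runtime in the same units as the baseline."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline / speedup


def old_minutes_for_one_second(speedup):
    """Old runtime, in minutes, of a job that now takes one second."""
    return speedup * 1.0 / 60.0


# Oil and Gas: a 44-minute volume at x2 and x6.
oil_gas = [runtime_after_speedup(44, s) for s in (2, 6)]

# Computational Finance: x121 (1 card) and x950 (8 cards).
finance = [old_minutes_for_one_second(s) for s in (121, 950)]

print("oil and gas (mins):", [round(m, 2) for m in oil_gas])
print("finance, old runtime (mins):", [round(m, 2) for m in finance])
```

The finance figures round to the 2 and 16 minutes given on the slide.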

Acknowledgements
Supported by Science Foundation Ireland under grant 08/HEC/I1450 and by HEA’s PRTLI-C4.

Pragmatic Analytics - Case Studies of High Performance Computing for Better Business and Big Data
