Deep Learning for Fast Simulation
HNSciCloud M-PIL-3.2 meeting
June 2018
S. Vallecorsa, F. Carminati, G. Khattak
Our objective
• Activities are ongoing to speed up Monte Carlo techniques
• Not enough to cope with expected HL-LHC needs
• Current fast simulation solutions are detector dependent
• A general fast simulation tool based on Machine Learning/Deep Learning
• Optimizing training time becomes crucial
Improved, efficient and accurate fast simulation
Requirements
Precise simulation results
Detailed validation process
A fast inference step
Generic customizable tool
Easy-to-use and easily extensible framework
Large hyper-parameter scans and meta-optimisation:
Training time under control
Scalability
Possibility to work across platforms
Generative adversarial networks
Simultaneously train two networks that compete and cooperate with each other
Generator G generates data from random noise
Discriminator D learns how to distinguish real data from generated data
arXiv:1406.2661v1
The (blind) counterfeiter/detective case
Counterfeiter shows the Mona Lisa
Detective says it is fake and gives feedback
Counterfeiter makes a new Mona Lisa based on feedback
Iterate until the detective is fooled
Image source: https://arxiv.org/pdf/1701.00160v1.pdf
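The generator/discriminator game on this slide can be illustrated with a deliberately tiny sketch. It is not the 3DGAN architecture: it assumes a one-dimensional "real" distribution (a Gaussian centred at 4, chosen only for illustration), a linear generator G(z) = a*z + b, and a logistic discriminator D(x) = sigmoid(w*x + c), with gradients written by hand. All names (`train_toy_gan`, `sigmoid`) are hypothetical.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=500, lr=0.05, seed=0):
    """Train a 1D toy GAN with hand-written gradients (illustrative only)."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator G(z) = a*z + b
    w, c = 0.1, 0.0   # discriminator D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)   # "real" sample (assumed target distribution)
        z = rng.gauss(0.0, 1.0)        # random noise fed to the generator
        x_fake = a * z + b

        # Discriminator step: minimise -log D(x_real) - log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimise -log D(G(z)) (the non-saturating loss),
        # using the freshly updated discriminator as the "feedback".
        d_fake = sigmoid(w * x_fake + c)
        grad_x = -(1.0 - d_fake) * w
        a -= lr * grad_x * z
        b -= lr * grad_x
    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print(f"generator: a={a:.3f}, b={b:.3f}; discriminator: w={w:.3f}, c={c:.3f}")
```

The two alternating updates mirror the counterfeiter/detective loop: the discriminator is updated on one real and one generated sample, then the generator is nudged in the direction that raises the discriminator's score for its output.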
Generated images
Interpret detector output as a 3D image
A 3D convolutional GAN generates realistic detector output
Customized architecture (includes auxiliary regression tasks)
Agreement with standard Monte Carlo in terms of physics is remarkable!
[Figure panels: GAN-generated electron shower; Y moment (width); average shower section; energy fraction measured by the calorimeter]
(on the Caltech ibanks GPU cluster, thanks to Prof. M. Spiropulu)
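As a rough illustration of "detector output as a 3D image", the sketch below bins calorimeter cell hits into a dense 3D grid and computes an energy-weighted width along y. The grid shape, the `(ix, iy, iz, energy)` hit format, and the function names are assumptions made for this example; the slide does not define the "Y moment (width)" precisely, so the RMS spread used here is only one plausible reading.

```python
def deposits_to_image(deposits, shape):
    """Bin (ix, iy, iz, energy) cell hits into a dense 3D nested list."""
    nx, ny, nz = shape
    image = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for ix, iy, iz, energy in deposits:
        image[ix][iy][iz] += energy
    return image


def y_width(image):
    """Energy-weighted RMS spread of the shower along the y axis."""
    total = 0.0
    sum_y = 0.0
    sum_y2 = 0.0
    for plane in image:                      # loop over x
        for iy, row in enumerate(plane):     # loop over y; row holds the z cells
            e = sum(row)
            total += e
            sum_y += e * iy
            sum_y2 += e * iy * iy
    mean = sum_y / total
    # Clamp tiny negative values caused by floating-point rounding.
    return max(sum_y2 / total - mean * mean, 0.0) ** 0.5


if __name__ == "__main__":
    hits = [(1, 1, 0, 2.0), (1, 2, 0, 2.0)]  # made-up hits, not real data
    image = deposits_to_image(hits, shape=(3, 4, 2))
    print(y_width(image))
```

Quantities of this kind, computed identically on Monte Carlo and GAN images, are what a validation comparison would put side by side.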
Distributed training is needed
Inference:
Monte Carlo: 17 s/particle vs 3DGAN: 7 ms/particle
→ speedup factor > 2500 on CPU!!
Training:
45 min/epoch on an NVIDIA P100
Introduce data-parallel training using mpi-learn (Elastic Averaging Stochastic Gradient Descent)
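To make the Elastic Averaging SGD idea concrete, here is a minimal single-process sketch of the synchronous update rule: each worker takes a gradient step on its own parameter copy while being pulled towards a shared center variable, and the center moves towards the workers. This is not the mpi-learn API and involves no MPI; the workers are simulated in a loop, the objective is a toy quadratic, and the names and hyper-parameter values (`easgd_toy`, `lr`, `alpha`) are chosen purely for illustration.

```python
import random


def easgd_toy(workers=4, steps=300, lr=0.1, alpha=0.05, target=3.0, seed=0):
    """Synchronous EASGD on f(x) = 0.5 * (x - target)**2 with noisy gradients."""
    rng = random.Random(seed)
    local = [0.0] * workers   # one parameter copy per (simulated) worker
    center = 0.0              # shared center variable
    for _ in range(steps):
        # Elastic differences use the values from the start of the step.
        diffs = [x - center for x in local]
        for i in range(workers):
            # Noisy local gradient, standing in for a mini-batch gradient.
            grad = (local[i] - target) + rng.gauss(0.0, 0.1)
            # Gradient step plus elastic pull towards the center.
            local[i] -= lr * grad + alpha * diffs[i]
        # The center moves towards the workers by the summed elastic force.
        center += alpha * sum(diffs)
    return center, local


if __name__ == "__main__":
    center, local = easgd_toy()
    print(f"center={center:.3f}, workers={[round(x, 3) for x in local]}")
```

In a real data-parallel run each worker would hold a full copy of the network and see a different shard of the training data; the elastic term is what keeps the copies from drifting apart between exchanges.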
Computing performance
Calorimeter energy response: GAN prediction stays stable through 20 nodes!
Strong scaling measured at CSCS, the Swiss National Supercomputing Centre (J-R. Vlimant)

Time to create an electron shower

Method                      Machine                     Time/shower (ms)
Full simulation (Geant4)    Intel Xeon Platinum 8180    17000
3D GAN (batch size 128)     Intel Xeon Platinum 8180    7
3D GAN (batch size 128)     P100                        0.04
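The speedup factors implied by the timing table can be checked with a few lines of arithmetic. The sketch below uses only the three rounded times quoted in the table; the dictionary keys and the `speedup` helper are hypothetical names. With these rounded inputs the CPU ratio comes out at roughly 2400, the same order as the factor quoted on the slide, and the P100 ratio at roughly 4 x 10^5.

```python
# Time per shower in milliseconds, as quoted in the timing table.
TIME_MS = {
    "geant4_cpu": 17000.0,   # full simulation, Intel Xeon Platinum 8180
    "gan_cpu": 7.0,          # 3D GAN, same CPU, batch size 128
    "gan_gpu": 0.04,         # 3D GAN, P100, batch size 128
}


def speedup(baseline_ms, fast_ms):
    """Ratio of baseline time to accelerated time."""
    return baseline_ms / fast_ms


cpu_speedup = speedup(TIME_MS["geant4_cpu"], TIME_MS["gan_cpu"])
gpu_speedup = speedup(TIME_MS["geant4_cpu"], TIME_MS["gan_gpu"])

if __name__ == "__main__":
    print(f"GAN on CPU vs Geant4: {cpu_speedup:.0f}x")
    print(f"GAN on P100 vs Geant4: {gpu_speedup:.0f}x")
```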
DL with the HNSciCloud
First tests during prototype (2017)
• Single GPU training benchmark (RHEA, T-Systems, IBM)
• P100 (RHEA - Exoscale) vs K80 (IBM)
Current tests
• MPI-based distributed training (ssh/TCP)
• Local input storage
• Single GPU per node
• Comparison to HPC environment
• Trials with HTCondor on Exoscale cloud (5 VMs) (still under investigation)
[Plot labels: P100 T-Systems; (CSCS)]
Next steps
Continue with tests/optimisation:
• Schedulers (SLURM)
• Input storage options
• GPU/node configuration
• Possibility to combine GPUs from different resources
Additional GPUs are needed
First results are very promising
Thanks!
Questions?

