January 2019
GPU Performance Tests of the RiverFlow2D Model
RiverFlow2D GPU Tests
RiverFlow2D© model and documentation produced by Hydronia, LLC. Pembroke Pines, FL. USA.
Information in this document is subject to change without notice and does not represent a commitment on the part of Hydronia, LLC.
RiverFlow2D, RiverFlow2D Plus and RiverFlow2D GPU are copyrighted by Hydronia, LLC. 2011-2019.
All other products or service names mentioned herein are trademarks of their respective owners.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted
in any form or by any means electronic, mechanical, photocopying, recording or otherwise, without the prior written permission
of Hydronia, LLC.
Technical documentation prepared by Asier Lacasta and Reinaldo Garcia.
Last document modification date: January, 2019.
Technical Support: support@hydronia.com
Web site: www.hydronia.com
Contents
LIST OF FIGURES
LIST OF TABLES
1 INTRODUCTION
2 TEST CASES
  2.1 TEST 1
    Test 1 Results
  2.2 TEST 2
    Test 2 Results
  2.3 TEST 3
    Test 3 Results
  2.4 TEST 4
    Test 4 Results
3 COMMENTS
List of Figures
Figure 1 RiverFlow2D Plus triangular-cell mesh.
Figure 2 Main window at the end of the simulation for test 1 mesh 3.
Figure 3 Test 1: Speed up of the GPU solution compared against the non-parallelized CPU version.
Figure 4 Main window at the end of the simulation for test 2 mesh 3.
Figure 5 Test 2: Speed up of the GPU solution compared against the non-parallelized CPU version.
Figure 6 Main window at the end of the simulation for test 3 using Tesla K40 (top) and using Tesla K80 (bottom).
Figure 7 Test 3: Speed up of the GPU solution compared against the non-parallelized CPU version.
Figure 8 Main window at the end of the simulation for test 4 using Tesla K40 (top) and using Tesla K80 (bottom).
Figure 9 Test 4: Computational cost (in seconds) for the Tesla K80, Tesla P100, Tesla V100 and RTX 2080 Ti devices.
List of Tables
Table 1 Technical specification summary of NVIDIA GPU hardware.
Table 2 Test 1: Run times for RiverFlow2D in different GPU hardware. Intel CPU corresponds to the non-parallelized model.
Table 3 Test 2: Run times for RiverFlow2D in different GPU hardware. Intel CPU corresponds to the non-parallelized model.
Table 4 Test 3: Run times for RiverFlow2D in different GPU hardware. Intel CPU corresponds to the non-parallelized model.
Table 5 Test 4: Run times for RiverFlow2D in different GPU hardware. Intel CPU corresponds to the non-parallelized model.
1 Introduction
RiverFlow2D is a suite of two-dimensional finite-volume models for rivers, floodplains and estuaries that includes flow
hydrodynamics and add-on modules for erosion and deposition simulations, mud and debris flows, and pollutant
dispersion. RiverFlow2D can route floods in rivers and simulate inundation over complex terrain at high resolution
and with remarkable stability, accuracy and speed. The use of adaptive triangular-cell meshes enables the flow field
to be resolved around key features in complex river environments. The GPU version performs hydrodynamic
computations up to more than 680 times faster than non-parallelized models. The RiverFlow2D hydraulic simulation
core has been developed in collaboration with the Computational Hydraulics Group of the University of Zaragoza in
Spain.
This document presents several tests to demonstrate the performance of the RiverFlow2D GPU model on a
variety of real project applications using several meshes with different resolutions and utilizing various NVIDIA
GPU hardware cards (see Table 1).
Table 1 Technical specification summary of NVIDIA GPU hardware.
            Tesla K40  Tesla K80  GTX 1080 Ti  Tesla P100  Tesla V100  RTX 2080 Ti
CUDA cores  2,880      2 x 2,496  3,584        3,584       5,120       4,352
Memory      12 GB      24 GB      11 GB        16 GB       16 GB       11 GB
Note: The sequential version of the code was run on a computer with an Intel Core i7-3820 @ 3.60 GHz CPU.
In the tests described in this document, we report run times for each application and calculate model
speedups with respect to the non-parallelized CPU model (using one core), which is the standard procedure for
computing speedups. For instance, a reported speedup of 100 means that the model runs
100 times faster than the non-parallelized version.
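The speedup definition above can be illustrated with a minimal Python sketch. It assumes that the run times in the results tables are formatted as dd:hh:mm:ss, a reading that is consistent with the speedup values shown in the figures. The function names runtime_to_seconds and speedup are hypothetical and are not part of RiverFlow2D.

```python
def runtime_to_seconds(runtime: str) -> int:
    """Convert a run time string to seconds (assumed format: dd:hh:mm:ss)."""
    days, hours, minutes, seconds = (int(part) for part in runtime.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def speedup(cpu_runtime: str, gpu_runtime: str) -> float:
    """Speedup = single-core CPU run time divided by GPU run time."""
    return runtime_to_seconds(cpu_runtime) / runtime_to_seconds(gpu_runtime)


# Test 1, Mesh 3: Intel CPU versus Tesla V100 (run times taken from Table 2).
example = speedup("08:23:17:47", "00:00:18:49")
print(f"Speedup: {example:.2f}")  # prints "Speedup: 686.51"
```

The printed value agrees with the Tesla V100 data label for Mesh3 in Figure 3.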
Figure 1 RiverFlow2D Plus triangular-cell mesh.
2 Test Cases
We present different tests to illustrate the performance of the RiverFlow2D GPU model in four real
applications using various GPU cards.
2.1 Test 1
The first test case applies the model to a short reach of the Green River (USA) using three mesh
resolutions: 19,079 cells (Mesh 1), 154,880 cells (Mesh 2), and 1,878,607 cells (Mesh 3).
Figure 2 Main window at the end of the simulation for test 1 mesh 3.
Test 1 Results
Table 2 Test 1: Run times for RiverFlow2D in different GPU hardware. Intel CPU corresponds to the non-parallelized
model. Run times are given as dd:hh:mm:ss.
Mesh   No. Cells  Intel CPU    Tesla K80    GTX 1080 Ti  Tesla P100   Tesla V100   RTX 2080 Ti  Max Speedup
Mesh1  19,079     00:00:08:14  00:00:00:18  00:00:00:38  00:00:00:13  00:00:00:11  00:00:00:46  45x
Mesh2  154,880    00:03:23:47  00:00:02:38  00:00:02:44  00:00:01:24  00:00:00:51  00:00:03:07  240x
Mesh3  1,878,607  08:23:17:47  00:01:28:04  00:01:08:28  00:00:33:40  00:00:18:49  00:01:00:39  687x
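As an illustration of how the Max Speedup column relates to the run times, the following Python sketch picks the fastest GPU for each mesh of Test 1 and divides the CPU run time by it. It assumes the dd:hh:mm:ss reading of the run times; the names TEST1_RUNTIMES, to_seconds and max_speedup are hypothetical and introduced only for this example.

```python
def to_seconds(runtime):
    """Convert a run time string to seconds (assumed format: dd:hh:mm:ss)."""
    days, hours, minutes, seconds = (int(part) for part in runtime.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# Run times for Test 1, copied from Table 2.
TEST1_RUNTIMES = {
    "Mesh1": {"Intel CPU": "00:00:08:14", "Tesla K80": "00:00:00:18",
              "GTX 1080 Ti": "00:00:00:38", "Tesla P100": "00:00:00:13",
              "Tesla V100": "00:00:00:11", "RTX 2080 Ti": "00:00:00:46"},
    "Mesh2": {"Intel CPU": "00:03:23:47", "Tesla K80": "00:00:02:38",
              "GTX 1080 Ti": "00:00:02:44", "Tesla P100": "00:00:01:24",
              "Tesla V100": "00:00:00:51", "RTX 2080 Ti": "00:00:03:07"},
    "Mesh3": {"Intel CPU": "08:23:17:47", "Tesla K80": "00:01:28:04",
              "GTX 1080 Ti": "00:01:08:28", "Tesla P100": "00:00:33:40",
              "Tesla V100": "00:00:18:49", "RTX 2080 Ti": "00:01:00:39"},
}


def max_speedup(row):
    """Return (fastest GPU name, CPU time / fastest GPU time) for one mesh."""
    cpu_seconds = to_seconds(row["Intel CPU"])
    gpu_seconds = {name: to_seconds(value)
                   for name, value in row.items() if name != "Intel CPU"}
    fastest = min(gpu_seconds, key=gpu_seconds.get)
    return fastest, cpu_seconds / gpu_seconds[fastest]


results = {mesh: max_speedup(row) for mesh, row in TEST1_RUNTIMES.items()}
for mesh, (gpu, value) in results.items():
    print(f"{mesh}: fastest GPU is {gpu}, speedup {value:.2f}x")
```

In this data set the Tesla V100 is the fastest card for all three meshes, and the maximum speedup grows with the number of cells.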
Figure 3 Test 1: Speed up of the GPU solution compared against the non-parallelized CPU version.
[Bar chart: speedup for each GPU, grouped by mesh. Data labels:]
       Tesla K80  GTX 1080 Ti  Tesla P100  Tesla V100  RTX 2080 Ti
Mesh1  27.44      13.00        38.00       44.91       10.74
Mesh2  77.39      74.55        145.56      239.75      65.39
Mesh3  146.68     188.67       383.70      686.51      212.99
2.2 Test 2
The second test is a high-resolution application involving a hydraulic structure in New Orleans. We present
results for three meshes: 21,001 cells in Mesh 1, 539,177 cells in Mesh 2 and 1,640,606 cells in Mesh 3. The
project was provided by Stantec.
Figure 4 Main window at the end of the simulation for test 2 mesh 3.
Test 2 Results
Table 3 Test 2: Run times for RiverFlow2D in different GPU hardware. Intel CPU corresponds to the non-parallelized
model. Run times are given as dd:hh:mm:ss.
Mesh    No. Cells  Intel CPU    Tesla K80    GTX 1080 Ti  Tesla P100   Tesla V100   RTX 2080 Ti  Max Speedup
Mesh 1  21,001     00:00:37:07  00:00:01:24  00:00:02:42  00:00:00:53  00:00:00:44  00:00:03:15  51x
Mesh 2  539,177    02:22:39:24  00:00:38:12  00:00:35:30  00:00:16:36  00:00:09:49  00:00:32:08  432x
Mesh 3  1,640,606  16:05:34:31  00:03:18:32  00:02:37:37  00:01:15:36  00:00:40:55  00:02:17:59  571x
Figure 5 Test 2: Speed up of the GPU solution compared against the non-parallelized CPU version.
[Bar chart: speedup for each GPU, grouped by mesh. Data labels:]
       Tesla K80  GTX 1080 Ti  Tesla P100  Tesla V100  RTX 2080 Ti
Mesh1  26.51      13.75        42.02       50.61       11.42
Mesh2  110.98     119.42       255.39      431.86      131.93
Mesh3  117.74     148.30       309.19      571.27      169.40
2.3 Test 3
This test case simulates an event on a river in California (USA) using a mesh of 357,611 cells. The
event covers a period of 6 days and 23 hours.
Figure 6 Main window at the end of the simulation for test 3 using Tesla K40 (top) and using Tesla K80 (bottom).
Test 3 Results
Table 4 Test 3: Run times for RiverFlow2D in different GPU hardware. Intel CPU corresponds to the non-parallelized
model. Run times are given as dd:hh:mm:ss.
No. Cells  Intel CPU    Tesla K80    GTX 1080 Ti  Tesla P100   Tesla V100   RTX 2080 Ti  Max Speedup
357,611    06:00:30:01  00:01:51:47  00:01:47:10  00:00:49:49  00:00:34:38  00:01:59:56  250x
Figure 7 Test 3: Speed up of the GPU solution compared against the non-parallelized CPU version.
[Bar chart: speedup for each GPU. Data labels:]
Tesla K80  GTX 1080 Ti  Tesla P100  Tesla V100  RTX 2080 Ti
77.56      80.90        174.04      250.34      72.29
2.4 Test 4
This test reports results of an ongoing collaboration with the National Oceanic and Atmospheric
Administration (NOAA) of the USA. It shows a simulation of a 420-mile reach of the Red River of the North located
in Minnesota (USA). The event involves the routing of 3-month hydrographs.
Figure 8 Main window at the end of the simulation for test 4 using Tesla K40 (top) and using Tesla K80 (bottom).
Test 4 Results
The run time of the non-parallelized CPU model is impractical for this test. Therefore, only the
RiverFlow2D GPU model was used.
Table 5 Test 4: Run times for RiverFlow2D in different GPU hardware. Intel CPU corresponds to the non-parallelized
model. Run times are given as dd:hh:mm:ss.
No. of cells  Tesla K80    Tesla P100   Tesla V100   RTX 2080 Ti
4,616,546     00:20:50:46  01:02:55:42  01:02:21:04  00:12:22:54
Figure 9 Test 4: Computational cost (in seconds) for the Tesla K80, Tesla P100, Tesla V100 and RTX 2080 Ti devices.
[Bar chart: computational cost in seconds for each GPU. Data labels:]
Tesla K80  Tesla P100  Tesla V100  RTX 2080 Ti
75046      33972       22948       44574
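The computational costs in Figure 9 can be related to the run time format of the tables with a short Python sketch. It assumes the dd:hh:mm:ss format and uses the Figure 9 values in seconds as given. The Tesla K80 and RTX 2080 Ti values match the Table 5 run times under that reading; the Tesla P100 and Tesla V100 entries in Table 5 do not, so the figure values are used here. The names FIGURE9_SECONDS and seconds_to_runtime are hypothetical.

```python
# Computational cost in seconds, copied from the Figure 9 data labels.
FIGURE9_SECONDS = {
    "Tesla K80": 75046,
    "Tesla P100": 33972,
    "Tesla V100": 22948,
    "RTX 2080 Ti": 44574,
}


def seconds_to_runtime(total_seconds):
    """Format a number of seconds as dd:hh:mm:ss (assumed table format)."""
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    days, hours = divmod(hours, 24)
    return f"{days:02d}:{hours:02d}:{minutes:02d}:{seconds:02d}"


# With no CPU run available, express each card relative to the Tesla K80.
baseline = FIGURE9_SECONDS["Tesla K80"]
relative_to_k80 = {gpu: baseline / cost for gpu, cost in FIGURE9_SECONDS.items()}

for gpu, cost in FIGURE9_SECONDS.items():
    print(f"{gpu}: {seconds_to_runtime(cost)} "
          f"({relative_to_k80[gpu]:.2f}x relative to the Tesla K80)")
```

Because the CPU model was not run for this test, the ratios are relative to the slowest GPU in the figure and are not speedups in the sense used for Tests 1 to 3.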
3 Comments
This report presents performance results of the RiverFlow2D GPU model on several NVIDIA GPUs, including the
latest-generation RTX, Tesla P100 and V100 cards. While the Tesla V100 is still the clear winner among the tested
devices, the NVIDIA GTX 1080 Ti card costs much less and its acceleration capabilities are also
remarkable. The latest benchmarks include the RTX 2080 Ti, whose performance gain over the GTX 1080 Ti is almost
negligible; it therefore cannot be recommended as the best low-cost solution. This
was a surprise to us, since we usually see a 20-25% increase in speed between generations.
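The comparison between the RTX 2080 Ti and the GTX 1080 Ti can be quantified from the run times in Tables 2 to 4. The following Python sketch is one way to do so, assuming the dd:hh:mm:ss run time format; the names RUNS, to_seconds and gain_percent are hypothetical. A positive gain means the RTX 2080 Ti was faster, a negative gain means it was slower.

```python
def to_seconds(runtime):
    """Convert a run time string to seconds (assumed format: dd:hh:mm:ss)."""
    days, hours, minutes, seconds = (int(part) for part in runtime.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# (case, GTX 1080 Ti run time, RTX 2080 Ti run time), copied from Tables 2 to 4.
RUNS = [
    ("Test 1, Mesh 1", "00:00:00:38", "00:00:00:46"),
    ("Test 1, Mesh 2", "00:00:02:44", "00:00:03:07"),
    ("Test 1, Mesh 3", "00:01:08:28", "00:01:00:39"),
    ("Test 2, Mesh 1", "00:00:02:42", "00:00:03:15"),
    ("Test 2, Mesh 2", "00:00:35:30", "00:00:32:08"),
    ("Test 2, Mesh 3", "00:02:37:37", "00:02:17:59"),
    ("Test 3", "00:01:47:10", "00:01:59:56"),
]


def gain_percent(old_runtime, new_runtime):
    """Speed gain of the newer card over the older one, in percent."""
    return (to_seconds(old_runtime) / to_seconds(new_runtime) - 1.0) * 100.0


gains = {case: gain_percent(gtx, rtx) for case, gtx, rtx in RUNS}
for case, gain in gains.items():
    print(f"{case}: {gain:+.1f}%")
```

For these runs the gain is negative on the smaller meshes and on Test 3, and stays below 20% on the largest meshes.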
As demonstrated in the tests presented in this document, the remarkable performance of the RiverFlow2D GPU
model has several implications, including:
• Computer run times are reduced from days to a few hours, from hours to minutes, and in some cases from
minutes to seconds.
• The RiverFlow2D GPU model allows running flood simulations of large river reaches that were
impractical until recently due to excessive run times.
• The GPU technology developed in the RiverFlow2D code also allows using high-resolution
meshes involving millions of cells.
• The emergence of pay-per-use cloud services such as Google Cloud, where all of the tested cards
are available at very attractive costs, facilitates the use of the RiverFlow2D GPU model for a wide range
of applications.
