Version Control
Introduction
GPU computational power is growing rapidly, and computation with heterogeneous processors has become increasingly popular. We are attempting to adopt GPU techniques in the computation engine and algorithms for modeling ablative thermal protection system surfaces.
By analyzing the source code, we have roughly determined the percentage of time consumed by each part of the program and the reasons behind it. Since GPU techniques require attention to low-level details, in some cases it is also essential to be familiar with the related source code beyond the preconditioner and solver.
Several approaches to adopting GPU-based iterative methods have been attempted, since this is an ongoing project and the direction is not yet clear. Considering that the original program uses PETSc, our first attempt was to adopt ready-to-use libraries such as AMGX to test their potential and anticipate upcoming problems.
Other approaches, such as using new sparse matrix subclasses to perform matrix-vector products on the GPU or adapting half-ready libraries like CUDA_ITSOL, are still on hold.
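To illustrate the sparse-matrix-subclass route that is currently on hold: PETSc can store matrices and vectors in GPU-resident types so that matrix-vector products execute on the device. The sketch below is ours, not code from KATS; it assumes a CUDA-enabled PETSc build, and the problem size is a placeholder.

```c
#include <petscmat.h>

/* Sketch: GPU-resident matrix-vector product in PETSc. Assumes PETSc was
 * configured with CUDA support; the size 10000 is a placeholder. */
int main(int argc, char **argv)
{
  Mat A;
  Vec x, y;

  PetscInitialize(&argc, &argv, NULL, NULL);

  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, 10000, 10000);
  MatSetType(A, MATAIJCUSPARSE);        /* CSR matrix stored on the GPU */
  MatSetUp(A);
  /* ... fill with MatSetValues(), then assemble ... */
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  VecCreate(PETSC_COMM_WORLD, &x);
  VecSetSizes(x, PETSC_DECIDE, 10000);
  VecSetType(x, VECCUDA);               /* GPU-resident vector */
  VecDuplicate(x, &y);
  VecSet(x, 1.0);

  MatMult(A, x, y);                     /* y = A x, computed on the device */

  MatDestroy(&A); VecDestroy(&x); VecDestroy(&y);
  PetscFinalize();
  return 0;
}
```

With this approach the solver code is unchanged; only the storage types differ, which is what makes it attractive as a low-effort experiment.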
2015 Kentucky Academy of Science Conference
Abstract
This project is part of a larger, ongoing multi-group project in which version control has been implemented. Our current research aims to improve computational efficiency in solving the large sparse linear systems arising in the modeling of ablative thermal protection system surfaces by introducing GPU techniques. So far three approaches have been attempted, and the attempt to use AMGX is still in progress.
Our toolchain includes Git, CMake, Doxygen, and Gitolite, among others. We set up these tools not only to increase the speed of the program, but to increase the speed of development as well. Thanks to former researchers, we are developing synchronously at a steady and increasingly fast pace.
Their functionalities range from analyzing and dissecting source code to building and synchronizing work. For instance, with SSH authentication and Gitolite authorization we can host repositories ourselves instead of relying on commercial services such as GitHub or Bitbucket; a configuration sketch is shown below.
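As a hedged illustration of the Gitolite side, access rules live in a small conf file. The repository and user names below are hypothetical, not our actual configuration.

```
# gitolite.conf sketch (hypothetical repo and user names):
# admins get full access, collaborators may push, everyone else may read.
@admins = alice
@devs   = bob carol

repo kats
    RW+ = @admins
    RW  = @devs
    R   = @all
```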
Adopting GPU Accelerated Computing
The source code mainly uses PETSc as the library for solving Ax = b problems. To control variables and allow detailed comparisons, we keep using the block Jacobi preconditioned flexible GMRES solver as well, as sketched below.
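For reference, a minimal PETSc sketch of that block Jacobi preconditioned flexible GMRES setup; the same configuration can equivalently be selected at runtime with -ksp_type fgmres -pc_type bjacobi. The matrix A and vectors b, x are assumed to be assembled elsewhere.

```c
#include <petscksp.h>

/* Minimal sketch: block Jacobi preconditioned FGMRES for an Ax = b system.
 * A, b, x are assumed assembled elsewhere. */
void solve_system(Mat A, Vec b, Vec x)
{
  KSP ksp;
  PC  pc;

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);     /* same matrix for operator and PC    */
  KSPSetType(ksp, KSPFGMRES);     /* flexible GMRES                     */
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCBJACOBI);       /* block Jacobi, one block per rank   */
  KSPSetFromOptions(ksp);         /* allow overrides, e.g. -ksp_rtol    */
  KSPSolve(ksp, b, x);
  KSPDestroy(&ksp);
}
```

Keeping the preconditioner and solver fixed across CPU and GPU runs is what makes the timing comparisons meaningful.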
Employing Version Control and GPU Accelerated Computing Technology to Numerical and Experimental Investigations for Modeling of Ablative Thermal Protection System Surfaces
Longyin Cui, Jens Hannemann and Chi Shen
Division of Computer Science, Kentucky State University
Funded by NASA EPSCoR, Subaward 3049025269-14-032
Source Code and Its Analysis
The source code is the fluid dynamics module known as the Kentucky Aerothermodynamics and Thermal Response System (KATS). The solver uses a second-order finite volume approach to solve the laminar Navier-Stokes equations, species mass conservation equations, and energy balance equations for flow in chemical and thermal non-equilibrium, with a fully implicit first-order backward Euler method for time integration.
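For concreteness, the fully implicit first-order backward Euler step for a semi-discrete system dU/dt = R(U) leads to one large sparse linear solve per time step. The standard linearized form below is a sketch consistent with the description above, not an excerpt from KATS.

```latex
% Backward Euler: U^{n+1} = U^n + \Delta t \, R(U^{n+1}).
% Linearizing R about U^n gives one sparse linear system per time step:
\left( \frac{I}{\Delta t}
     - \left.\frac{\partial \mathbf{R}}{\partial \mathbf{U}}\right|^{n} \right)
\Delta\mathbf{U} = \mathbf{R}(\mathbf{U}^{n}),
\qquad
\mathbf{U}^{n+1} = \mathbf{U}^{n} + \Delta\mathbf{U}.
```

This linear system is the Ax = b problem handed to PETSc at every time step.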
The picture on the left is an example of the open-source distributed version control system Git showing documented changes.
The picture on the left shows the time consumption of different PETSc function calls in our case. The times are averaged due to variation.
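Timings like these can be gathered with PETSc's built-in logging. The sketch below does it programmatically; the stage name is our illustrative choice, and the same summary is available by simply running the unmodified program with the -log_view option.

```c
#include <petscksp.h>

/* Sketch: time one solve phase with PETSc's logging. The stage name
 * "KATS solve" is an illustrative choice; running with -log_view prints
 * the same per-function timing table without code changes. */
static PetscErrorCode timed_solve(KSP ksp, Vec b, Vec x)
{
  PetscLogStage stage;

  PetscLogDefaultBegin();                    /* start collecting events  */
  PetscLogStageRegister("KATS solve", &stage);
  PetscLogStagePush(stage);
  KSPSolve(ksp, b, x);                       /* phase to be timed        */
  PetscLogStagePop();
  PetscLogView(PETSC_VIEWER_STDOUT_WORLD);   /* print per-function times */
  return 0;
}
```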
In this project, the PETSc library is linked to KATS and provides the linear solver and preconditioning for the large sparse linear system. We found that the program requires many rounds of convergence (as many as 3,000,000). During each round, a series of functions related to the convection-diffusion equations and perturbation are called; the preconditioner and solver are then chosen and prepared through runtime options. After solving, the solutions are stored for the next round.
For CFD applications, experience suggests good results with a 10:1 CPU:GPU ratio, and ratios as high as 40:1 have been run. Furthermore, we discovered that another portion of the code, in the preparation process, consumes a large amount of time. However, its root loop calls functions pertaining to basic linear algebra calculations, which means it could be handled by GPUs easily.
Our next step is to fully insert AMGX into the source and test different CPU:GPU ratios. An overall comparison of the time consumed by the solving and preparation phases would then be easily obtained. A sketch of the intended AMGX hookup follows.
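This is a hedged sketch of what handing one system to AMGX could look like, not our final integration: the CSR arrays and the JSON config file name are placeholders, and error checking is omitted for brevity.

```c
#include <amgx_c.h>

/* Sketch: solving one Ax = b system with the AMGX C API. The CSR arrays
 * (n, nnz, row_ptr, cols, vals) and the config file name are placeholders;
 * error checking is omitted for brevity. */
void amgx_solve(int n, int nnz, const int *row_ptr, const int *cols,
                const double *vals, const double *rhs, double *sol)
{
  AMGX_config_handle    cfg;
  AMGX_resources_handle rsrc;
  AMGX_matrix_handle    A;
  AMGX_vector_handle    b, x;
  AMGX_solver_handle    solver;

  AMGX_initialize();
  AMGX_config_create_from_file(&cfg, "solver_config.json"); /* placeholder */
  AMGX_resources_create_simple(&rsrc, cfg);

  AMGX_matrix_create(&A, rsrc, AMGX_mode_dDDI);  /* device, double precision */
  AMGX_vector_create(&b, rsrc, AMGX_mode_dDDI);
  AMGX_vector_create(&x, rsrc, AMGX_mode_dDDI);
  AMGX_solver_create(&solver, rsrc, AMGX_mode_dDDI, cfg);

  AMGX_matrix_upload_all(A, n, nnz, 1, 1, row_ptr, cols, vals, NULL);
  AMGX_vector_upload(b, n, 1, rhs);
  AMGX_vector_set_zero(x, n, 1);

  AMGX_solver_setup(solver, A);          /* build preconditioner on the GPU */
  AMGX_solver_solve(solver, b, x);
  AMGX_vector_download(x, sol);

  AMGX_solver_destroy(solver); AMGX_vector_destroy(x);
  AMGX_vector_destroy(b);      AMGX_matrix_destroy(A);
  AMGX_resources_destroy(rsrc); AMGX_config_destroy(cfg);
  AMGX_finalize();
}
```

Because AMGX consumes plain CSR arrays, the same matrix KATS already assembles for PETSc can be passed through with little restructuring.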
The picture on the left shows the solving time for one time step when the CPU:GPU ratio is 1:1. Although the solving time speeds up, we realize it is impractical to allocate 16 nodes when 32 GPUs are required and each node has only 2.
The picture on the left is another example, showing the build tool CMake. It was chosen for its compatibility across different platforms and its convenience of use.