Accelerating High Performance Applications
Strategic Focus on Applications

 Senior-level relationship and market managers

 Dedicated technical resources

 More than 150 people devoted to libraries, tools, application porting and market development

 Worldwide focus
Reaching a Broad Range of Markets




  Scientific computing   Creative pro   Education / research
Strategic Partners
CAD/CAM/CAID                          CAE/EDA                     Computational chemistry  Computational Finance  Defence & Intelligence  Digital Content creation  Physical Sciences  Seismic processing and visualization
Autodesk                              Ansys                       Amber                    MATLAB                 Ikena                   Adobe                     Quda (L-QCD)       Schlumberger
Dassault Systemes: CATIA, Solidworks  Dassault Systemes: Simulia  NAMD                     Mathematica            Intergraph              Autodesk M&E              WRF                Landmark
PTC                                   Nastran                     Gromacs                  NAG                    ESRI                    Avid                      ACUSA              Paradigm
Siemens                               LSTC                        Lammps                   Murex                  Manifold                MainConcept               HOMME
                                      Synopsys                    GAMESS                                                                  Sony                      HYCOM
Leading MD Applications


Application  Features Supported                    GPU Perf  Release Status                      Notes
AMBER        PMEMD: Explicit & Implicit Solvent    8X        V11 Released                        Single and multi-GPUs. Expect 2x more performance in V11 patch release (shortly)
GROMACS      Implicit (5x), Explicit (2x) Solvent  2x-5x     Single GPU released, Version 4.5.4  Next release: 2H2011. Better Explicit, MPI
LAMMPS       Lennard-Jones, Gay-Berne              6x        Released                            Single and multi-GPU.
NAMD         Non-bond force calculation            2x-7x     Released, v2.8                      Single and multi-GPU.

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Additional MD/MM Applications Ramping

Application  Features Supported                                             GPU Perf                                  Release Status                        Notes
Abalone      TBD, “Simulations”                                             4-29X (on 1060 GPU)                       Released                              Single GPU. Agile Molecule, Inc.
ACEMD        Written for use on GPUs                                        “µ-sec long trajectories on workstation”  Released                              Production bio-molecular dynamics (MD) software specially optimized to run on single and multi-GPUs
DL_POLY      Two-body Forces, Link-cell Pairs, Ewald SPME forces, Shake VV  4x                                        V 4.0 Source only, Results Published  Next release: 2H2011. Multi-GPU, multi-node supported
HOOMD-Blue   Written for use on GPUs                                        2X (32 CPU cores vs. 2 10XX GPUs)         Released, Version 0.9.2               Single and multi-GPU.

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Viz and “Docking” Applications

Related Applications  Features Supported                                GPU Perf             Release Status           Notes
Amira 5®              3D visualization of volumetric data and surfaces  N/A                  Released, Version 5.3.3  Visualization from Visage Imaging. Next release, 5.4, will use GPU for general purpose processing in some functions
Core Hopping          GPU accelerated application                       Up to 5000X          Released, Suite 2011     Single and multi-GPUs. Schrodinger, Inc.
FastROCS              Real-time shape similarity searching/comparison   800-3000X            Released                 Single and multi-GPUs. OpenEye Scientific Software
VMD                   High quality rendering, large structures          100-125X or greater  Released, Version 1.9    Visualization from University of Illinois at Urbana-Champaign
                      (100 million atoms), GPU acceleration for
                      computationally demanding analysis and
                      visualization tasks, multiple GPU support for
                      very fast display of molecular orbitals arising
                      in quantum chemistry calculations

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Quantum Chemistry
Application  Features Supported                                           GPU Perf                     Release Status            Notes
GAMESS-US    Libqc with Rys Quadrature Algorithm, integral evaluation,    2.5X                         Released                  Single GPU supported in 10/1/10 release.
             closed shell Fock matrix construction                                                                               Multi-GPU supported in July 2011 release.
NWChem       Triples part of Reg-CCSD(T), CCSD & EOMCCSD task schedulers  3-8X projected               Date TBA, in development  Development GPGPU benchmarks: www.nwchem-sw.org
Q-CHEM       Various features including RI-MP2                            8-14x projected              Date TBA, in development  Significant porting already
TeraChem     “Full GPU-based solution”                                    44-650X vs. GAMESS CPU ver.  Version 1.45 released     Single and Multi-GPU. Completely redesigned to exploit massive GPU parallelism

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Material Science



Application             Features Supported                                GPU Perf  Release Status        Notes
Abinit                  BigDFT - 50% of the program (short convolutions)  6-30X     Released June 2009    http://inac.cea.fr/L_Sim/BigDFT/news.html
Quantum-Espresso/PWscf  PWscf package: linear algebra (matrix multiply),  TBD       Released May 5, 2011  Created by Irish Centre for High-End Computing
                        explicit computational kernels, 3D FFTs

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Bioinformatics


CUDA-BLASTP                 HEX Protein Docking
CUDA-EC                     Jacket (MATLAB Plugin)
CUDA-MEME                   MUMmerGPU
CUDASW++ (Smith-Waterman)   MUMmerGPU++
DNADist                     SARUMAN
GPU Blast                   SeqNFind
GPU-HMMER                   UGENE


                            Additional details can be found at Tesla Bio Workbench:
                            http://www.nvidia.com/object/tesla_bio_workbench.html
Structural Mechanics
    Application      GPU Features               GPU Perf               Release Status                        Notes
ANSYS Mechanical     Linear eqn solvers           2x Total            Today, release 13 SP2          FE implicit, single-GPU

 Abaqus/Standard     Linear eqn solver            2x Total             Today, release 6.11           FE implicit, single-GPU

  IMPETUS Afea       Explicit solver, SPH   10x SPH, 2x Total           Today, release 1.0           FE explicit, multi-GPU


 LS-DYNA implicit    Linear eqn solver            3x Total              Planned for 2011             FE implicit, multi-GPU


   MD Nastran        Linear eqn solvers          2x Solver              Planned for 2011             FE implicit, multi-GPU


       Marc          Linear eqn solver           1.5x Total             Planned for 2011             FE implicit, single-GPU

 RADIOSS Implicit    Linear eqn solver           1.5x Total               Demonstration              FE implicit, single-GPU

PAM-CRASH implicit   Linear eqn solver           1.5x Total               Demonstration              FE implicit, single-GPU

   NX Nastran        Linear eqn solver           1.4x Total               Demonstration              FE implicit, single-GPU
                                   GPU Perf compared against Multi-core x86 CPU socket.
                                   GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
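
The tables in this deck label some speedups "Solver" and others "Total", and the footnote warns that a figure may be a kernel to kernel comparison. As a rough illustration of why the two numbers differ, here is a minimal sketch using Amdahl's law. The function name and the 60% solver fraction are hypothetical and do not come from the slides.

```python
def total_speedup(solver_fraction: float, solver_speedup: float) -> float:
    """Amdahl's law: whole-run speedup when only part of the runtime is accelerated.

    solver_fraction: share of the original CPU runtime spent in the accelerated part (0..1).
    solver_speedup:  speedup of that part alone (the "Solver" or kernel figure).
    """
    if not 0.0 <= solver_fraction <= 1.0:
        raise ValueError("solver_fraction must be between 0 and 1")
    if solver_speedup <= 0.0:
        raise ValueError("solver_speedup must be positive")
    # The untouched part keeps its runtime; the accelerated part shrinks.
    return 1.0 / ((1.0 - solver_fraction) + solver_fraction / solver_speedup)


# Hypothetical case: the linear solver is 60% of the run and becomes 3x faster on the GPU.
print(round(total_speedup(0.6, 3.0), 2))  # prints 1.67
```

Under these assumed numbers a "3x Solver" result corresponds to roughly a 1.67x "Total" result, which is why the two labels are not interchangeable when comparing rows.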
Fluid Dynamics
Application                       GPU Features        GPU Perf             Release Status       Notes
Altair AcuSolve                   Linear eqn solver   2x Total             Today, release 1.8   FE unstructured NS, multi-GPU
Autodesk Moldflow                 Linear eqn solver   2x Total             Today, release 2011  FE unstructured NS, single-GPU
FluiDyna LBultra                  LBM, particle CFD   20x Total            Today, release 1.0   Structured LBM, multi-GPU
FluiDyna Culises-OpenFOAM Solver  Linear eqn solvers  3x Solver            Today, release 1.0   Unstructured NS, single-GPU
Vratis SpeedIT-OpenFOAM Solver    Linear eqn solvers  3x Solver            Today, release 1.2   Unstructured NS, multi-GPU
Prometech Particleworks           MPS, particle CFD   4x-9x Total          Q3CY11 release 2.5   Particle based, multi-GPU
Sandia NL S3D                     Chemistry kernel    8x SP, 5x DP kernel  Demonstration        Structured grid DNS, multi-GPU
Turbostream                       Explicit solver     19x Total            Today, release 2.0   Structured grid NS, multi-GPU
SD++ (Jameson)                    Explicit solver     16x Total            Planned for 2011     FE unstructured NS, multi-GPU
FEFLO (Lohner)                    Explicit solver     2x Total             Planned for 2011     FE unstructured NS, multi-GPU

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Electromagnetics

Application           Features Supported                                    GPU Perf                       Release Status    Notes
Agilent EMPro         FDTD                                                  6X                             2011.07 Released  Single & multi-GPU; EMPro 2011 PR
CST Microwave Studio  Transient (FIT) solver; Combined MPI & GPU computing  9X on 1 GPU to 20X+ on 4 GPUs  2011 Released     Single & multi-GPU; www.cst.com/perf
Remcom XFdtd          FDTD                                                  30-300X                        XF7 Released      Single and multi-GPU; XStream GPU acceleration
SPEAG SEMCAD X        FDTD; Acceleware                                      100X                           14.4.3 Released   Single and multi-GPU; www.speag.com/perf

GPU Performance compared against quad-core x86 CPU socket;
Remcom XFdtd GPU performance compared against single core CPU
Climate/ Weather/ Ocean
Application  GPU Features                         GPU Perf                 Production Status     Notes
WRF          WSM5, WSM3, Ice Microphysics models  4x-6x Models             Today, release 3.2    single-GPU
ASUCA        Most routines                        12x Total                In production at JMA  multi-GPU
NIM          Most routines                        7x Dynamics              Limited production    multi-GPU
HIRLAM       Dynamical core                       3x Solver                Planned for 2011      multi-GPU
HOMME        Models                               3x Models                Planned for 2011      single-GPU
CAM          Linear eqn solver                    2x Solver                Planned for 2011      single-GPU
GEOS-5       Most routines                        10x Models, 3x Dynamics  Demonstration         multi-GPU
MITgcm       Linear eqn solver                    3x solver                Demonstration         single-GPU
HYCOM        Linear eqn solver                    2x solver                Demonstration         single-GPU

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

Nvidia Cuda Apps Jun27 11
