Adjusting primitives for graph
Graph algorithms, like PageRank ...
Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is ...
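As a rough illustration of the CSR layout mentioned above, the following sketch builds one from a directed edge list. The names `Csr`, `buildCsr`, and `degree` are hypothetical and not taken from the report; the layout (an offsets array plus a flat targets array) is the standard CSR form.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CSR container: `offsets` has one entry per vertex plus one,
// and `targets` stores all adjacency lists back to back.
struct Csr {
  std::vector<std::size_t> offsets;  // size = vertices + 1
  std::vector<int> targets;          // size = number of edges
};

// Build a CSR graph from a directed edge list (u -> v) over `vertices` vertices.
Csr buildCsr(int vertices, const std::vector<std::pair<int, int>>& edges) {
  assert(vertices >= 0);
  Csr g;
  g.offsets.assign(static_cast<std::size_t>(vertices) + 1, 0);
  // Count the out-degree of each vertex, shifted by one slot.
  for (const auto& e : edges) {
    assert(e.first >= 0 && e.first < vertices);
    assert(e.second >= 0 && e.second < vertices);
    ++g.offsets[e.first + 1];
  }
  // Prefix-sum the counts so offsets[u] is the start of u's adjacency list.
  for (int u = 0; u < vertices; ++u) g.offsets[u + 1] += g.offsets[u];
  // Scatter targets into place, tracking the next free slot per vertex.
  std::vector<std::size_t> next(g.offsets.begin(), g.offsets.end() - 1);
  g.targets.resize(edges.size());
  for (const auto& e : edges) g.targets[next[e.first]++] = e.second;
  return g;
}

// Out-degree of vertex u is the width of its slice in `targets`.
std::size_t degree(const Csr& g, int u) {
  return g.offsets[u + 1] - g.offsets[u];
}
```

The neighbours of vertex `u` are then `targets[offsets[u]]` up to, but not including, `targets[offsets[u + 1]]`, which keeps each adjacency list contiguous in memory.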
Multiply with different modes (map)
Modes: Sequential, OpenMP, CUDA
1. Performance of sequential vs OpenMP-based vector multiply.
2. Comparison of launch configs for CUDA-based vector multiply.
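A minimal sketch of the two CPU modes compared in this experiment, assuming element-wise multiplication `a[i] = x[i] * y[i]`. The function names and the `schedule(static)` clause are assumptions made for illustration, not the report's actual code. Without OpenMP enabled at compile time the pragma is ignored and both functions run sequentially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential map: a[i] = x[i] * y[i].
void multiplySeq(std::vector<float>& a, const std::vector<float>& x,
                 const std::vector<float>& y) {
  assert(a.size() == x.size() && x.size() == y.size());
  for (std::size_t i = 0; i < x.size(); ++i) a[i] = x[i] * y[i];
}

// OpenMP map: the same loop, with iterations divided among threads.
// Each iteration writes a distinct element, so no synchronization is needed.
void multiplyOmp(std::vector<float>& a, const std::vector<float>& x,
                 const std::vector<float>& y) {
  assert(a.size() == x.size() && x.size() == y.size());
  const long long n = static_cast<long long>(x.size());
  #pragma omp parallel for schedule(static)
  for (long long i = 0; i < n; ++i) a[i] = x[i] * y[i];
}
```

A signed loop index is used because older OpenMP versions require one in a parallel `for` loop.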
Sum with different storage types (reduce)
Storage types: float, bfloat16
1. Performance of vector element sum using float vs bfloat16 as the storage type.
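To make the storage-type comparison concrete, here is a sketch that stores values as bfloat16 (the upper 16 bits of an IEEE 754 single-precision float) and widens them back to float while summing. It assumes conversion by truncation and a float accumulator; whether the report rounds instead, or accumulates differently, is not stated. All function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");

// Narrow a float to bfloat16 storage by keeping its upper 16 bits (truncation).
inline std::uint16_t toBfloat16(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  return static_cast<std::uint16_t>(bits >> 16);
}

// Widen bfloat16 storage back to float by zero-filling the lower 16 bits.
inline float fromBfloat16(std::uint16_t h) {
  std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  float x;
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Sum with float as the storage type.
float sumFloat(const std::vector<float>& x) {
  float a = 0.0f;
  for (float v : x) a += v;
  return a;
}

// Sum with bfloat16 as the storage type; arithmetic is still done in float.
float sumBfloat16(const std::vector<std::uint16_t>& x) {
  float a = 0.0f;
  for (std::uint16_t h : x) a += fromBfloat16(h);
  return a;
}
```

The bfloat16 vector occupies half the memory of the float vector, so a bandwidth-bound sum reads half as many bytes, at the cost of keeping only 8 significant bits per element.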
Sum with different modes (reduce)
Modes: Sequential, OpenMP, CUDA (memcpy, in-place)
1. Performance of sequential vs OpenMP-based vector element sum.
2. Performance of memcpy-based vs in-place CUDA vector element sum.
3. Comparison of launch configs for CUDA-based vector element sum (memcpy).
4. Comparison of launch configs for CUDA-based vector element sum (in-place).
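The sequential and OpenMP modes of the sum can be sketched as below, assuming the OpenMP version uses a `reduction(+:a)` clause; that choice and the function names are assumptions for illustration. The CUDA variants (memcpy and in-place) are not shown here.

```cpp
#include <cstddef>
#include <vector>

// Sequential reduce: a single running total.
float sumSeq(const std::vector<float>& x) {
  float a = 0.0f;
  for (std::size_t i = 0; i < x.size(); ++i) a += x[i];
  return a;
}

// OpenMP reduce: each thread keeps a private partial sum, and the
// reduction clause combines the partial sums when the loop ends.
float sumOmp(const std::vector<float>& x) {
  float a = 0.0f;
  const long long n = static_cast<long long>(x.size());
  #pragma omp parallel for reduction(+:a)
  for (long long i = 0; i < n; ++i) a += x[i];
  return a;
}
```

Because floating-point addition is not associative, the two versions may differ in the last bits for general inputs; they agree exactly only when every partial sum is representable, as with small integer values.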
Sum with in-place strategies of CUDA mode (reduce)
sum-loop sum-reduce
one-loop atomic-add
block-loop template, next-pow2 launch one-reduce, next-pow2 launch
block-loop template, prev-pow2 launch one-reduce, prev-pow2 launch
grid-loop
1. Comparison of launch configs for CUDA-based vector element sum (in-place).
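The "next-pow2" and "prev-pow2" launch options suggest rounding a launch size up or down to a power of two, which a halving tree reduction requires. The sketch below is one plausible reading, emulated on the CPU in plain C++ rather than written as CUDA kernels: each of `workers` logical threads accumulates a strided partial sum, and the partial sums are then combined by a tree reduction. All names are hypothetical, and this is not the report's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Smallest power of two >= n (assumes n >= 1 and no overflow).
std::size_t nextPow2(std::size_t n) {
  std::size_t p = 1;
  while (p < n) p <<= 1;
  return p;
}

// Largest power of two <= n (assumes n >= 1).
std::size_t prevPow2(std::size_t n) {
  std::size_t p = 1;
  while (p <= n / 2) p <<= 1;
  return p;
}

// Halving tree reduction over a buffer whose length is a power of two.
// On a GPU, each inner-loop iteration would be handled by one thread.
float treeReduce(std::vector<float> buf) {
  assert(!buf.empty() && (buf.size() & (buf.size() - 1)) == 0);
  for (std::size_t s = buf.size() / 2; s > 0; s /= 2)
    for (std::size_t i = 0; i < s; ++i) buf[i] += buf[i + s];
  return buf[0];
}

// Strided partial sums (worker t handles x[t], x[t + workers], ...),
// followed by a tree reduction of the partial sums.
float sumPow2(const std::vector<float>& x, std::size_t workers) {
  assert(workers > 0 && (workers & (workers - 1)) == 0);
  std::vector<float> partial(workers, 0.0f);
  for (std::size_t t = 0; t < workers; ++t)
    for (std::size_t i = t; i < x.size(); i += workers) partial[t] += x[i];
  return treeReduce(partial);
}
```

For example, a requested size of 6 workers becomes 8 with `nextPow2` or 4 with `prevPow2`; either value satisfies the power-of-two precondition of `treeReduce`.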
Adjusting primitives for graph : SHORT REPORT / NOTES

