© Arthur J. Lembo, Jr.
Salisbury University
QGIS Plug-in for Parallel Processing in
Terrain Analysis
Arthur Lembo
Department of Geography and
Geoscience
@artlembo
If you were plowing a field, which would
you rather use? Two strong oxen or 1024
chickens?
- Seymour Cray
Overview
• As part of an NSF REU, we built a QGIS plug-in to perform terrain-based parallel processing with Python.
• This presentation shows the results of our undergraduate student research project:
– Multicore processors
– Massively parallel GPGPUs
– Hardware evolution
– Our QGIS plug-in
– The road ahead
NSF REU
• The NSF REU is a 3-year grant (extended to 5 years) focused on parallel processing.
• The goal is to expose undergraduates
to academic research in computer
science.
• My role has been to mentor students in
the use of parallel processing in
geography.
Microcomputer revolution
• 1971 Intel 4004
• Ted Hoff
• $60,000
• 2,300 transistors
• 582,000,000 transistors (quad core)
• Lithography
• Killed time-sharing
Moore’s Law
• 2x / 18 months
• Design shrink
• Smaller = faster
Trouble in paradise?
Limits of Moore’s Law
• Heat
– Limits on power density
– Package dissipation limit
– Water-cooled overclocking
• Subunit complexity
– Single clock cycle synchronicity
– AMD translation lookaside buffer bug
• RC interconnect delay
Parallel microprocessors
• Parallel dies
• Parallel packages
• Core 2 Duo
• Core 2 Quad
• Repeating history
Limited uptake
• 64-bit just getting traction
• Windows parallelism
• Multithreading difficult to code
• Parallel code even harder
• Scientific computing
• Who cares what I have to say?
• Gaming leads the way
Opening week:
– Grand Theft Auto: $500M
– Halo 3: $300M
– Spider-Man 3: $182M
– Pirates 3: $196M
Options for Large Geographic Computations
• Use a smaller dataset
• Generalize the resolution of your dataset
– Both options compromise the integrity of the data
• Invest in clusters (groups of ordinary PCs joined together with combined power and parallel processing) or time sharing
– Require special programming
– Very costly
Background
• Parallel processing: the program allows multiple computations to occur concurrently
• GPU: graphics processing unit, generally used for video and gaming graphics
– Designed for multithreading, contains hundreds of cores, good at simple math
• CPU: central processing unit, where computation is traditionally done
– Contains a much smaller number of cores, good at complex calculations
Test Environment
• GPU: Nvidia GTX 670
– 1344 CUDA cores
– 2 GB GDDR5 RAM
• CPU: Intel Xeon E5607 processor
– 4 cores, 4 threads
– 2.27 GHz
Why PyCUDA
• Expose CUDA functions in
QGIS
• Easier to program?
• Easier to add functionality?
Terrain functions
• Started with 3 common GIS functions
• Slope (~15 calculations), Aspect (~20 calculations), Hillshade (~45 calculations)
• All are embarrassingly parallel
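To make "embarrassingly parallel" concrete, here is a minimal sketch of a per-cell slope and aspect calculation on a 3x3 window. The slides do not say which formulas the plug-in uses; Horn's finite differences and the compass-aspect conversion below are assumptions chosen because they are common in GIS software, and `slope_aspect` and `cell_size` are hypothetical names. Each output cell depends only on its own 3x3 neighbourhood, so every cell could be handed to a separate GPU thread.

```python
import math


def slope_aspect(window, cell_size=1.0):
    """Return (slope_deg, aspect_deg) for the centre cell of a 3x3 window.

    window holds three rows of three elevations, north row first:
        a b c
        d e f
        g h i
    Assumption: Horn's method; the plug-in's own formulas may differ.
    """
    (a, b, c), (d, _e, f), (g, h, i) = window
    # Weighted finite differences in the west-east and north-south directions.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)

    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

    if dz_dx == 0 and dz_dy == 0:
        return slope, -1.0  # flat cell: aspect is undefined

    # Convert the mathematical angle to a compass bearing (0 = north, clockwise).
    aspect = math.degrees(math.atan2(dz_dy, -dz_dx))
    if aspect < 0:
        aspect = 90.0 - aspect
    elif aspect > 90.0:
        aspect = 360.0 - aspect + 90.0
    else:
        aspect = 90.0 - aspect
    return slope, aspect


# A plane that rises one unit per cell toward the east faces west.
east_rising = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
slope_deg, aspect_deg = slope_aspect(east_rising)
```

Because the function reads nine values and writes two, the work per cell is small, which matters for the I/O discussion later in the talk.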
Terrain visuals
[Figure panels: Altitude, Hillshade, Slope]
Methods, cont.
Scheduler
• Overall manager
• Starts and manages the processes that:
– Load data
– Perform raster calculations on the GPU
– Save data back to disk
GPU Calculator
• Where the actual calculations are performed
• Reads data from the input pipe, performs GPU calculations over all the cores, and sends the result through the output pipe to the data saver
• Designed so any algorithm based on a 3x3 grid of pixels can be used
Saver
• Takes the results of the calculations done in the GPU manager and saves them to disk
• Uses the GDAL libraries to write multiple lines at a time to a GeoTIFF
• Multiple savers can all run in parallel to save the outputs of different functions
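The loader, calculator, and saver stages described above can be sketched as a three-stage pipeline. This is an illustration under stated assumptions, not the plug-in's code: it swaps in threads and `queue.Queue` for the processes and pipes the slides mention, the calculation is an ordinary Python function standing in for the GPU kernel, and results are appended to a list instead of being written through GDAL. All names (`loader`, `calculator`, `saver`, `run_pipeline`) are hypothetical.

```python
import queue
import threading

_DONE = object()  # unique end-of-stream marker passed down the pipeline


def loader(rows, out_q):
    """Stage 1: feed rows (stand-in for raster lines read from disk)."""
    for row in rows:
        out_q.put(row)
    out_q.put(_DONE)


def calculator(in_q, out_q, func):
    """Stage 2: apply func to each row (stand-in for the GPU step)."""
    while True:
        row = in_q.get()
        if row is _DONE:
            out_q.put(_DONE)
            return
        out_q.put(func(row))


def saver(in_q, sink):
    """Stage 3: collect results (stand-in for writing to a GeoTIFF)."""
    while True:
        row = in_q.get()
        if row is _DONE:
            return
        sink.append(row)


def run_pipeline(rows, func, depth=8):
    """Run the three stages concurrently and return the saved rows in order."""
    load_q = queue.Queue(maxsize=depth)  # bounded so the loader cannot run far ahead
    save_q = queue.Queue(maxsize=depth)
    sink = []
    workers = [
        threading.Thread(target=loader, args=(rows, load_q)),
        threading.Thread(target=calculator, args=(load_q, save_q, func)),
        threading.Thread(target=saver, args=(save_q, sink)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return sink


doubled = run_pipeline([[1, 2], [3, 4]], lambda row: [2 * v for v in row])
```

With one worker per stage and first-in, first-out queues, row order is preserved. Running several savers for different functions, as the slide describes, would need one output queue per function.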
Results and Discussion (Stage 1 - out of the box)
• Python by itself is much slower than C++
• PyCUDA is faster both because it uses the GPU and because it is written in C
• Takeaway: when given the option, use PyCUDA libraries
Size     Python      C++       PyCUDA    QGIS
25 MB    50 secs     5 secs    4 secs    5 secs
200 MB   7:30 mins   40 secs   28 secs   15 secs
Results and Discussion (Stage 2 - threading)
• Adding CPU-based parallelism increases gains
• Reduces time waiting for data to be given to the GPU
Size     Threaded Python   Threaded C++   Threaded PyCUDA   QGIS
25 MB    45 secs           5 secs         4 secs            5 secs
200 MB   7:30 mins         40 secs        9 secs            15 secs
1.5 GB   1:30 hrs          18:03 mins     9:04 mins         9 mins
Results and Discussion (Stage 2 - added computation)
• Adding more complex computations allows us to maximize the GPU contribution
• Hillshade requires about 3x more computations than slope
• PyCUDA doesn’t even slow down when switching formulas
• Shows that it can do much more before peaking out
Size     QGIS         Threaded PyCUDA
25 MB    5 secs       4 secs
1.5 GB   11:00 mins   9:04 mins
12 GB    45:00 mins   50:00 mins
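For a sense of what the heavier hillshade formula involves, here is a minimal per-cell sketch. The slides do not give the formula used in the plug-in; the illumination model below, along with the default sun azimuth of 315 degrees and altitude of 45 degrees, is an assumption based on the form commonly used in GIS packages. The function name `hillshade` and its parameters are hypothetical, and the inputs are the west-east and north-south gradients of a 3x3 window.

```python
import math


def hillshade(dz_dx, dz_dy, azimuth_deg=315.0, altitude_deg=45.0):
    """Return a 0-255 illumination value for one cell.

    dz_dx, dz_dy: elevation gradients (west-east, north-south) for the cell.
    Assumption: the common cosine illumination model; the plug-in may differ.
    """
    zenith = math.radians(90.0 - altitude_deg)
    # Convert the compass azimuth to a mathematical angle (0 = east, counter-clockwise).
    azimuth = math.radians((360.0 - azimuth_deg + 90.0) % 360.0)
    slope = math.atan(math.hypot(dz_dx, dz_dy))
    aspect = math.atan2(dz_dy, -dz_dx)
    shade = (math.cos(zenith) * math.cos(slope)
             + math.sin(zenith) * math.sin(slope) * math.cos(azimuth - aspect))
    return max(0.0, 255.0 * shade)  # cells facing away from the sun clamp to 0


flat = hillshade(0.0, 0.0)          # level ground
facing_west = hillshade(1.0, 0.0)   # surface rises to the east, so it faces west
facing_east = hillshade(-1.0, 0.0)  # surface rises to the west, so it faces east
```

With the sun in the north-west, the west-facing cell comes out brighter than the east-facing one. The trigonometric calls per cell are what make this heavier than slope alone.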
Results and Discussion (Stage 3 - further optimization)
• The main bottleneck in the computations is disk I/O
• Total time the GPU is working for the 1.5 GB file is less than 2 seconds
• Increasing the size of reads and writes saves even more time
Stage         QGIS         Threaded PyCUDA
Input         2:00 mins    1:55 mins
Computation   9:00 mins    1:00 mins
Output        2:00 mins    2:20 mins
Total         11:00 mins   3:35 mins
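The ratios implied by this table can be worked out directly. The sketch below uses only the times reported in the table; the helper `to_seconds` is a hypothetical name. Note that the per-stage times in the table do not add up to the reported totals, so the totals are used as given, not recomputed.

```python
def to_seconds(text):
    """Convert a 'minutes:seconds' string such as '3:35' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


# Times copied from the Stage 3 table (minutes:seconds).
qgis = {"Input": "2:00", "Computation": "9:00", "Output": "2:00", "Total": "11:00"}
pycuda = {"Input": "1:55", "Computation": "1:00", "Output": "2:20", "Total": "3:35"}

# Speedup > 1 means threaded PyCUDA was faster than QGIS for that stage.
speedup = {
    stage: to_seconds(qgis[stage]) / to_seconds(pycuda[stage])
    for stage in qgis
}

for stage, ratio in speedup.items():
    print(f"{stage:<12} {ratio:.2f}x")
```

The computation stage comes out at 9x and the total at roughly 3x, while input and output stay close to 1x, which is the I/O bottleneck in numeric form.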
Results and Discussion (Stage 3 - chunking)
12 GB file
Lines read   Time taken
1            50:00
10           39:50
15           28:30
20           33:30
30           37:00
40           40:00
50           48:30
1.5 GB file
Lines read   Time taken
1            5:18
10           3:54
15           3:00
20           3:50
30           4:16
40           4:52
50           5:21
• Reading too much data in one call causes slowdown
• Optimal number is ~15 lines for all sizes
• No apparent ratio between disk read lines and raster column and row sizes
– 1.5 GB is 14400 rows * 28800 cols
– 12 GB is 51187 rows * 60818 cols
• The limitation is how fast we can send data to the GPU
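Reading a fixed number of lines per call can be sketched as a generator of row ranges. This is an illustration only: the slides do not describe how the plug-in handles chunk boundaries, so the one-row overlap above and below each chunk (needed for a 3x3 window to cover every output row) is an assumption, and `iter_chunks` and `lines_per_read` are hypothetical names. The default of 15 lines follows the optimum reported in the tables.

```python
def iter_chunks(n_rows, lines_per_read=15):
    """Yield (read_start, read_stop, out_start, out_stop) row ranges.

    out_start:out_stop are the rows whose results this chunk produces.
    read_start:read_stop additionally include one neighbouring row on each
    side, where one exists, so a 3x3 window is available for every output row.
    All ranges are half-open, like Python slices.
    """
    for out_start in range(0, n_rows, lines_per_read):
        out_stop = min(out_start + lines_per_read, n_rows)
        read_start = max(out_start - 1, 0)
        read_stop = min(out_stop + 1, n_rows)
        yield read_start, read_stop, out_start, out_stop


# A 40-row raster read 15 lines at a time.
chunks = list(iter_chunks(40, 15))

# The 1.5 GB raster from the slides has 14400 rows.
reads_needed = sum(1 for _ in iter_chunks(14400, 15))
```

Each range could be passed to a block read, the results for `out_start:out_stop` written back, and the next chunk fetched while the current one is being processed.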
Results
• The PyCUDA version is consistently faster than QGIS when calculating hillshade for files of various sizes.
• GPU computations, including CPU-based memory management, took one ninth of the time required to do the same thing in QGIS.
• The I/O bottleneck can be seen in the input and output sections of the second table.
• Output takes much longer because it has to wait for the GPU to pass data to the saver before it can start saving to disk.
Other Takeaways
• CUDA is very efficient when you have a smaller number of data elements but massive calculations per element.
• Terrain-based analysis uses massive amounts of data but few calculations per data element.
Earlier work
Next Steps
• Improve the installation process – it is
too arduous at the moment
• Get the plug-in to work in Windows
Conclusion
• Early results show the ability to triple terrain analysis
speed compared to serial methods
• Multithreading can significantly improve GIS
analysis speed
• Try it out for yourself:
https://github.com/aFuerst/PyCUDA-Raster
[Chart labels: GPU, C++, QGIS, Serial]
SO, WHAT IS A GOOD GIS
EXAMPLE OF MASSIVE
CALCULATIONS PER DATA
ELEMENT?
Acknowledgements
Salisbury University
National Science Foundation (Award #
1460900)
Students:
William Hoffman
Charlie Kazer
Alex Fuerst

