Overview
• Locating the bottleneck
• Performance measurements
• Optimizations
• Balancing the pipeline
• Other optimizations: multiprocessing, parallel processing
ITCS 4010/5010:Game Engine Design 1 Pipeline Optimization

Pipeline Optimization
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”
– Donald Knuth
• Make it run first, then optimize
• But only optimize where it makes a difference
• Pipeline optimization: a process that first maximizes rendering speed, then lets the stages that are not bottlenecks consume as much time as the bottleneck

Pipeline Optimization
• Stages execute in parallel
• The slowest stage is always the bottleneck of the pipeline
• The bottleneck determines throughput (i.e., maximum speed)
• The bottleneck we measure is the average bottleneck over a frame
• Intra-frame bottlenecks cannot be measured easily
• Bottlenecks can change over a frame
• Most important: find the bottleneck, then optimize that stage!
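As a rough illustration of "the bottleneck determines throughput", the sketch below takes per-stage times for one frame and reports the slowest stage and the frame rate it implies. The function names and the millisecond units are assumptions made for this example, and it models only the idealized case where stages overlap perfectly.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Time of the slowest stage in milliseconds. With stages running in
// parallel, this is the time per frame of the whole pipeline.
double bottleneck_ms(const std::vector<double>& stage_ms) {
    assert(!stage_ms.empty());
    return *std::max_element(stage_ms.begin(), stage_ms.end());
}

// Frames per second implied by the bottleneck stage (idealized model).
double frames_per_second(const std::vector<double>& stage_ms) {
    const double slowest = bottleneck_ms(stage_ms);
    assert(slowest > 0.0);
    return 1000.0 / slowest;
}
```

With stage times of 5, 20 and 10 ms, the 20 ms stage is the bottleneck and the model gives 50 frames per second; speeding up either of the other two stages changes nothing.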

Locating the Bottleneck
• Two bottleneck location techniques:
• Technique 1:
◦ Make a certain stage work less
◦ If performance improves, then that stage is the bottleneck
• Technique 2:
◦ Make the other two stages work less or (better) not at all
◦ If performance stays the same, then the stage not included above is the bottleneck
• Complication: the bus between the CPU and the graphics card may be the bottleneck (not a typical stage)
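The two techniques can be written down as simple comparisons of measured frame times. This is a minimal sketch: the function names and the 5% tolerance are assumptions for illustration, and real measurements would need averaging over many frames.

```cpp
#include <cassert>
#include <cmath>

// Relative change in frame time between a baseline run and a test run.
double relative_change(double baseline_ms, double test_ms) {
    assert(baseline_ms > 0.0);
    return (test_ms - baseline_ms) / baseline_ms;
}

// Technique 1: the stage under test was given less work.
// If the frame got noticeably faster, that stage was the bottleneck.
bool technique1_is_bottleneck(double baseline_ms, double reduced_stage_ms,
                              double tolerance = 0.05) {
    return relative_change(baseline_ms, reduced_stage_ms) < -tolerance;
}

// Technique 2: the OTHER two stages were given less (or no) work.
// If the frame time barely moved, the untouched stage is the bottleneck.
bool technique2_is_bottleneck(double baseline_ms, double others_reduced_ms,
                              double tolerance = 0.05) {
    return std::fabs(relative_change(baseline_ms, others_reduced_ms)) <= tolerance;
}
```

The tolerance exists only to absorb measurement noise; the right value depends on how stable the frame times are on the system being tested.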

Application (CPU) Stage the Bottleneck?
• Use the top or osview commands on Unix, or Task Manager on Windows
• If the app uses (near) 100% of CPU time, then the application is very likely the bottleneck
• Using a code profiler is safer
• Make the CPU do less work (e.g., turn off collision detection)
• Replace glVertex and glNormal with glColor
• This makes the geometry and rasterizer stages do almost nothing
• No vertices to transform, no normals to compute lighting for, no triangles to rasterize
• If performance does not change, the program is CPU-bound (CPU-limited)

Geometry Stage the Bottleneck?
• Trickiest stage to test
• Why? A change in geometry workload usually also changes the application and rasterizer workloads
• The number of light sources only affects the geometry stage:
◦ Disable light sources (vertex shaders can make this simple)
◦ If performance goes up, then geometry is the bottleneck, and the program is transform-limited
• Alternatively, enable all light sources; if performance stays the same, the geometry stage is NOT the bottleneck
• Alternatively, test the CPU and rasterizer stages instead

Rasterizer Stage the Bottleneck?
• The easiest and fastest stage to test
• Simply decrease the size of the window you render to
◦ Does not change the application or geometry workload
◦ But the rasterizer needs to fill fewer pixels
◦ If performance goes up, then the program is “fill-limited” or “fill-bound”
• Make the rasterizer work less: turn off texturing, fog, blending, depth buffering, etc. (if your architecture has performance penalties for these)

Optimization
• Optimize the bottleneck stage
• Only put in enough effort to make the bottleneck move to another stage
• Did you get enough performance?
◦ Yes! Quit optimizing
◦ No! Continue optimizing the (possibly new) bottleneck
• If close to the maximum speed of the system, you might need to turn to acceleration techniques (spatial data structures, occlusion culling, etc.)

Illustrating Optimization
• Height of bar: the time that stage takes for one frame
• The highest bar is the bottleneck
• After optimization: the bottleneck has moved to APP
• No use in optimizing GEOM further; turn to optimizing APP instead
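The "optimize, re-check, repeat" loop can be simulated on three bar heights. In this sketch the `speedup` factor stands in for whatever real optimization work is done on a stage; it and the function names are hypothetical, chosen only to show how the bottleneck moves between stages.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Per-frame time of each stage in milliseconds: APP, GEOM, RAST.
using StageTimes = std::array<double, 3>;

// Index of the slowest stage, i.e. the current bottleneck.
std::size_t bottleneck_index(const StageTimes& t) {
    return static_cast<std::size_t>(std::max_element(t.begin(), t.end()) - t.begin());
}

// Repeatedly "optimize" whichever stage is currently the bottleneck by
// scaling its time with `speedup` (hypothetical, 0 < speedup < 1), until the
// frame time meets `target_ms` or `max_rounds` is used up.
// Returns the number of optimization rounds performed.
int optimize_until(StageTimes& t, double target_ms, double speedup, int max_rounds) {
    assert(speedup > 0.0 && speedup < 1.0);
    int rounds = 0;
    while (rounds < max_rounds && t[bottleneck_index(t)] > target_ms) {
        t[bottleneck_index(t)] *= speedup;  // only the bottleneck is touched
        ++rounds;
    }
    return rounds;
}
```

Starting from APP = 10, GEOM = 30, RAST = 8 with a 12 ms target and a speedup of 0.5, two rounds on GEOM bring it to 7.5 ms, after which APP is the highest bar and further GEOM work would be wasted.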

Application Stage Optimization
• Initial steps:
◦ Turn on optimization flags in the compiler
◦ Use code profilers, which show where the majority of time is spent
◦ This is time-consuming work
• Strategy 1: Efficient code
◦ Use fewer instructions
◦ Use more efficient instructions
◦ Recode algorithmically
• Strategy 2: Efficient memory access

Application: Code Optimization Tricks
• SIMD instruction sets are perfect for vector ops
◦ 2-4 operations in parallel
◦ SSE, SSE2, 3DNow! are examples
• Division is an expensive operation
◦ Between 4 and 39 times slower than most other instructions
◦ Good usage example: vector normalization
Instead of
v = (vx/d, vy/d, vz/d)
do
d = √(v · v), f = 1/d, v = v ∗ f
• On some CPUs there are low-precision versions of the reciprocal (1/x) and the reciprocal square root (1/√x)
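The normalization trick above trades three divisions for one division and three multiplications. A minimal sketch follows; the `Vec3` type and function names are made up for this example, and an optimizing compiler may already perform this rewrite under relaxed floating-point settings.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Normalize using a single division: compute f = 1/d once,
// then multiply each component by f.
Vec3 normalize(const Vec3& v) {
    const float d = std::sqrt(dot(v, v));  // length of v
    assert(d > 0.0f);                      // the zero vector has no direction
    const float f = 1.0f / d;              // the only division
    return Vec3{v.x * f, v.y * f, v.z * f};
}
```

The result can differ from the three-division version in the last bit or so, which is normally acceptable for graphics work.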

Code Optimization Tricks (contd)
• Conditional branches are generally expensive
◦ Avoid if-then-else if possible
◦ Sometimes branch prediction on CPUs works remarkably well
• Math functions (sin, cos, tan, sqrt, exp, etc.) are expensive
◦ A rough approximation might be sufficient
◦ Can use the first few terms of the Taylor series
• Inline code is good (avoids function calls)
• float (32 bits) is faster than double (64 bits); less data is sent down the pipeline
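As an example of "the first few terms of the Taylor series", here is a three-term approximation of sin(x). The function name and the restriction to |x| ≤ π/2 are assumptions of this sketch; a caller would have to reduce larger angles into that range first, and whether this beats the library sin on a given CPU should be measured, not assumed.

```cpp
#include <cassert>
#include <cmath>

// sin(x) ~ x - x^3/6 + x^5/120, the first three Taylor terms around 0.
// Accuracy degrades as |x| grows; this sketch assumes |x| <= pi/2.
float sin_taylor3(float x) {
    assert(std::fabs(x) <= 1.5708f);
    const float x2 = x * x;
    // Horner form of the polynomial: x * (1 - x2/6 * (1 - x2/20)).
    // The constant reciprocals are folded at compile time, so no
    // division is executed at run time.
    return x * (1.0f - x2 * (1.0f / 6.0f) * (1.0f - x2 * (1.0f / 20.0f)));
}
```

The error is tiny near zero and grows to a few thousandths near π/2, which is the kind of rough approximation the slide has in mind.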

Code Optimization Tricks (contd)
• Compiler optimization is hard to predict: --counter vs. counter--
• Use const in C and C++ to help the compiler with optimization
• The following often incur overhead:
◦ Dynamic casting (C++)
◦ Virtual methods
◦ Inherited constructors
◦ Passing structs by value

Memory Optimization
• Modern computers have memory hierarchies (caches): primary and secondary caches
• A bad memory access pattern can ruin performance
• Not really about using less memory, though that can help

Memory Optimization Tricks
• Sequential access: store data in memory in the order it is accessed:
◦ Tex Coords #0, Position #0, Tex Coords #1, Position #1, Tex Coords #2, Position #2, etc.
• Cache prefetching is good, but hard to control
• malloc() and free() may be slow: consider using a custom storage allocator that allocates memory to a pool at startup
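One possible shape for such a pool allocator is sketched below. The `Pool` class and its member names are invented for this example; it assumes a fixed capacity chosen at startup and a default-constructible element type, and it is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-capacity pool of T. All storage is allocated once in the
// constructor; acquire/release only push and pop indices on a free list,
// so no malloc()/free() happens while the game is running.
template <typename T>
class Pool {
public:
    explicit Pool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        // Push indices in reverse so slot 0 is handed out first.
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(i - 1);
    }

    // Returns an unused slot, or nullptr when the pool is exhausted.
    T* acquire() {
        if (free_.empty()) return nullptr;
        const std::size_t i = free_.back();
        free_.pop_back();
        return &slots_[i];
    }

    // Gives a slot previously obtained from acquire() back to the pool.
    void release(T* p) {
        assert(p >= slots_.data() && p < slots_.data() + slots_.size());
        free_.push_back(static_cast<std::size_t>(p - slots_.data()));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<T> slots_;           // contiguous storage, allocated at startup
    std::vector<std::size_t> free_;  // indices of unused slots
};
```

Because the slots are contiguous, objects acquired one after another also sit next to each other in memory, which fits the sequential-access advice above.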

Memory Optimization Tricks (contd)
• Align data with the size of a cache line
◦ Example: on most Pentiums, the cache line size is 32 bytes
◦ Now, assume that it takes 30 bytes to store a vertex
◦ Padding with another 2 bytes, to 32 bytes, will likely perform better
• Following pointers (linked lists) is expensive (if memory is allocated arbitrarily)
◦ Does not make good use of the coherence that caches usually exploit
◦ That is, the address after the one we just used is likely to be used soon
◦ A paper by Smits on ray tracing shows this
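The 30-byte versus 32-byte example can be checked with a little arithmetic on byte offsets. In this sketch the vertex layout (fifteen 16-bit fields) is hypothetical, chosen only so that the unpadded struct is exactly 30 bytes, and the array is assumed to start on a cache-line boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 32;  // line size from the Pentium example

// Hypothetical 30-byte vertex: fifteen 16-bit fields.
struct Vertex30 { std::int16_t fields[15]; };

// Same payload, padded and aligned to exactly one cache line.
struct alignas(kCacheLine) Vertex32 { std::int16_t fields[15]; };

static_assert(sizeof(Vertex30) == 30, "unpadded vertex is 30 bytes");
static_assert(sizeof(Vertex32) == 32, "padded vertex fills one cache line");

// Number of cache lines touched when reading element `index` of an array
// whose elements are `stride` bytes apart and whose base is line-aligned.
std::size_t lines_touched(std::size_t index, std::size_t stride) {
    assert(stride > 0);
    const std::size_t first = (index * stride) / kCacheLine;
    const std::size_t last = (index * stride + stride - 1) / kCacheLine;
    return last - first + 1;
}
```

With 30-byte vertices, element 1 occupies bytes 30 to 59 and straddles two cache lines; with 32-byte vertices every element sits in exactly one line.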

Geometry Stage: Optimization
• The geometry stage does per-vertex ops
◦ Best way to optimize: use triangle strips!
• Lighting optimization:
◦ Spot lights are expensive, point lights cheaper, directional lights cheapest
◦ Disable lighting if possible
◦ Use as few light sources as possible
◦ If you use 1/d² falloff, then if d > 10 (for example), disable the light
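The distance cutoff for lights with inverse-square falloff might look like the sketch below. The function names and the choice of passing precomputed distances are assumptions for illustration; the cutoff of 10 is just the example value used above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Intensity scale of a light at distance d with inverse-square falloff.
double falloff(double d) {
    assert(d > 0.0);
    return 1.0 / (d * d);
}

// Indices of the lights that stay enabled for an object: those whose
// distance to the object is at most `cutoff`. Lights farther away
// contribute very little and are treated as disabled.
std::vector<std::size_t> enabled_lights(const std::vector<double>& distances,
                                        double cutoff) {
    std::vector<std::size_t> enabled;
    for (std::size_t i = 0; i < distances.size(); ++i) {
        if (distances[i] <= cutoff) enabled.push_back(i);
    }
    return enabled;
}
```

At d = 10 the falloff factor is already 0.01, so dropping lights beyond that distance removes per-vertex work while changing the image only slightly.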

Geometry Stage: Optimization
• Normals must be normalized to get correct lighting
◦ Normalize them as a preprocess, and disable normalizing if possible
• Lighting can be computed for both sides of a triangle; disable if not needed
• If light sources are static with respect to geometry, and the material is only diffuse:
◦ Precompute lighting on the CPU
◦ Send only precomputed colors (not normals)
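Precomputing diffuse lighting on the CPU could be sketched as below, assuming a single directional light, a Lambertian material, and unit-length vectors. The types and names are hypothetical, and a single float intensity per vertex stands in for a full RGB color.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Lambertian diffuse term per vertex for one directional light.
// `normals` and `to_light` are assumed to be unit length already
// (normalized in a preprocess, as suggested above).
std::vector<float> precompute_diffuse(const std::vector<Vec3>& normals,
                                      const Vec3& to_light,
                                      float diffuse_reflectance) {
    assert(std::fabs(dot(to_light, to_light) - 1.0f) < 1e-3f);
    std::vector<float> intensity;
    intensity.reserve(normals.size());
    for (const Vec3& n : normals) {
        // Surfaces facing away from the light receive nothing.
        intensity.push_back(diffuse_reflectance * std::max(0.0f, dot(n, to_light)));
    }
    return intensity;
}
```

The resulting values are computed once and sent as vertex colors, so the geometry stage no longer needs normals or lighting for that mesh.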

Raster Stage: Optimization
• The rasterizer stage does per-pixel ops
• Simple optimization: turn on backface culling if possible
• Turn off Z-buffering if possible:
◦ Example: after screen clear, draw large background polygon
◦ Using polygon-aligned BSP trees
• Draw in front-to-back order
• Try disabling features: texture filtering mode, fog, blending, multisampling
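Front-to-back ordering can be as simple as sorting the opaque draw list by view depth before submitting it. The `DrawItem` struct and function names below are hypothetical; the sketch assumes a depth per object has already been computed and ignores transparent objects, which usually need the opposite order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct DrawItem {
    int id;            // hypothetical object identifier
    float view_depth;  // distance from the camera along the view direction
};

// Ordering predicate: nearer items come first.
bool nearer(const DrawItem& a, const DrawItem& b) {
    return a.view_depth < b.view_depth;
}

// Sort opaque items nearest-first, so that fragments of farther objects
// are more likely to fail the depth test instead of being written.
void sort_front_to_back(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(), nearer);
    assert(std::is_sorted(items.begin(), items.end(), nearer));
}
```

How much this helps depends on the hardware and on how much the objects overlap on screen, so it is worth measuring with the fill-limited test first.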

Raster Stage: Optimization
• To make rasterization faster, rasterize fewer (or cheaper) pixels:
◦ Make the window smaller
◦ Render to a smaller texture, and then enlarge the texture onto the screen
• Depth complexity is the number of times a pixel has been written to
◦ Good for understanding the behaviour of the application

Depth Complexity
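Depth complexity, defined earlier as the number of times a pixel has been written to, can be modelled on the CPU with one counter per pixel. The class below is a hypothetical sketch for reasoning about overdraw; it is not how a graphics API measures it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-pixel write counter for a width x height framebuffer.
class DepthComplexity {
public:
    DepthComplexity(std::size_t width, std::size_t height)
        : width_(width), writes_(width * height, 0u) {}

    // Record that pixel (x, y) was written once more.
    void record_write(std::size_t x, std::size_t y) {
        assert(x < width_ && y * width_ + x < writes_.size());
        ++writes_[y * width_ + x];
    }

    // Number of writes recorded for pixel (x, y).
    unsigned at(std::size_t x, std::size_t y) const {
        assert(x < width_ && y * width_ + x < writes_.size());
        return writes_[y * width_ + x];
    }

    // Average number of writes per pixel over the whole buffer.
    double average() const {
        if (writes_.empty()) return 0.0;
        unsigned long long total = 0;
        for (unsigned w : writes_) total += w;
        return static_cast<double>(total) / static_cast<double>(writes_.size());
    }

private:
    std::size_t width_;
    std::vector<unsigned> writes_;
};
```

An average well above 1 means many pixels are filled several times per frame, which points toward overdraw as a cost in the rasterizer stage.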

Overall Optimization: General Techniques
• Reduce the number of primitives, e.g. using polygon simplification algorithms
• Preprocess geometry and data for the particular architecture
• Turn off features not in use, such as:
◦ Depth buffering, blending, fog, texturing

Overall Optimization (contd)
• Minimize state changes by grouping objects
◦ Example: objects with the same texture should be rendered together
• If all pixels are always drawn, avoid the color buffer clear
• Frame buffer reads are expensive
• Display lists may work faster
• Precompile a list of primitives for faster rendering
• The OpenGL API supports this
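Grouping objects by texture can be illustrated by counting how many texture binds a draw order needs before and after sorting. The `Object` struct and function names are made up for this sketch, and the texture is the only piece of state it considers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Object {
    int id;          // hypothetical object identifier
    int texture_id;  // hypothetical texture handle
};

// Ordering predicate: group by texture handle.
bool texture_less(const Object& a, const Object& b) {
    return a.texture_id < b.texture_id;
}

// Number of texture binds needed to draw `objects` in the given order:
// one for the first object, plus one every time the texture changes.
std::size_t count_texture_binds(const std::vector<Object>& objects) {
    std::size_t binds = 0;
    for (std::size_t i = 0; i < objects.size(); ++i) {
        if (i == 0 || objects[i].texture_id != objects[i - 1].texture_id) ++binds;
    }
    return binds;
}

// Reorder so that objects sharing a texture are drawn together.
// stable_sort keeps the original relative order within each group.
void group_by_texture(std::vector<Object>& objects) {
    std::stable_sort(objects.begin(), objects.end(), texture_less);
    assert(std::is_sorted(objects.begin(), objects.end(), texture_less));
}
```

For five objects alternating between two textures, the original order needs five binds and the grouped order needs two.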

Balancing the Pipeline
• The bottleneck stage sets the frame rate
• The other two stages will be idle for some time
• Also, to sync with the monitor, there might be idle time for all stages
• Exploit this time to improve image quality if possible

Balancing the Pipeline
• Increase the number of triangles (affects all stages)
• More lights, more expensive (geometry)
• More realistic animation, more accurate collision detection (application)
• More expensive texture filtering, blending, etc. (rasterizer)
• If not fill-limited, increase the window size
• Note: there are FIFOs between stages (and at many other places too) to smooth out idleness of stages
• More techniques in the text
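The slack available for balancing can be estimated per stage as the frame time minus that stage's busy time. This is a sketch under the idealized assumption that stages overlap perfectly; `frame_ms` is the actual time per frame, which equals the bottleneck time or, when syncing with the monitor, the longer refresh interval. All names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Per-frame busy time of each stage in milliseconds: APP, GEOM, RAST.
using StageTimes = std::array<double, 3>;

// Idle time per stage: how much extra work each stage could absorb per
// frame without lowering the frame rate.
StageTimes idle_ms(const StageTimes& busy_ms, double frame_ms) {
    const double slowest = *std::max_element(busy_ms.begin(), busy_ms.end());
    assert(frame_ms >= slowest);  // a frame cannot be shorter than its bottleneck
    StageTimes idle{};
    for (std::size_t i = 0; i < busy_ms.size(); ++i) {
        idle[i] = frame_ms - busy_ms[i];
    }
    return idle;
}
```

With busy times of 4, 10 and 6 ms and a 10 ms frame, APP has 6 ms and RAST has 4 ms to spend on quality; if the frame is held to 16 ms by monitor sync, every stage has slack, including the bottleneck.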

Multiprocessing
• Use this if the application is the bottleneck and it is affordable
• Two major ways: (1) multiprocessor pipelining, (2) parallel processing

Summary
• Pipeline optimization is no substitute for good algorithms!
• Do optimization as a last step
• Primarily for products that will be shipped
• Most often it is good to use triangle strips!
