Dissecting the Rendering of The Surge
Philip Hammer
DECK13 Interactive GmbH
Quo Vadis, Berlin, 24.4.2018
Dissecting the Rendering of The Surge
Quo Vadis Berlin 2018
Introduction
● DECK13 Interactive released “The Surge” in 2017
○ New IP, new publisher
○ Overhauled tech (Fledge Engine / 3rd Generation)
○ Award Winning: Best German Game, Best Graphics, Best PC-/Console-Game (Deutscher Entwicklerpreis 2017)
○ DECK13’s most ambitious game to date
● Team of around 70 people in Frankfurt
○ Tech Department: ~11 people (engine + game code)
○ Art- & Sound-Outsourcing
● Myself
○ Since 2006 @ DECK13
○ Working on rendering / engine / graphics / shaders
○ Worked on The Surge, Lords of the Fallen, Venetica, Ankh, Jack Keane, etc.
● The results and techniques presented in this article are the work of many people in the DECK13 tech department.
Tech Evolution
● Fledge Gen 1 (2009: Blood Knights, Tiger & Chicken)
○ PS3, Xbox 360, PC / D3D9, iOS (iPad 2 and up)
○ Deferred Rendering, Direct Lighting only, Minimal Multithreading
● Fledge Gen 2 (2012: Lords of the Fallen)
○ PS4, Xbox One, PC / D3D11
○ Volumetric Lighting, Direct & Indirect Lighting, Task-based multithreaded rendering
● Fledge Gen 3 (2014: The Surge)
○ PS4 (+Pro), Xbox One (+X), PC / D3D11
○ Physically-based rendering, Clustered Deferred Rendering, GPU Particles
● Fledge Gen 4 (2017/2018: The Surge 2)
○ Vulkan, D3D12
○ Currently in the making ... stay tuned
Tech Evolution
● The Surge Tech (Gen 3)
○ Stable Framerate across all platforms
■ PS4: 1080p @ 30 FPS
■ PS4 Pro: 1620p @ 30 FPS or 1080p @ 60 FPS
■ Xbox One: 900p @ 30 FPS
■ Xbox One X: 1800p @ 30 FPS or 1080p @ 60 FPS
○ Physically Based Rendering
○ Clustered Deferred Rendering
○ Volumetric Lighting
○ GPU Particles
○ Screen-space Reflections
○ etc.
● New things in the making (Gen 4): a short peek into the future towards the end
Physically Based Rendering
● Switched from (non-PBR) Blinn-Phong to GGX Cook-Torrance BRDF [1]
○ De-facto industry standard.
○ More material data to drive the BRDF
● Artists needed to adapt (Workflow, Tools, Mindset)
○ Lots of pitfalls (no arbitrary texture data)
○ Adoption process was rather unproblematic: most tools (Substance, Marmoset) already provide a PBR workflow
● We use the “Metalness Workflow”
○ Artists provide Albedo, Normal, Roughness and Metalness textures
○ Metalness is a mask to treat the albedo differently in specular lighting
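As a rough illustration of how the metalness mask steers a GGX Cook-Torrance BRDF, here is a minimal CPU-side sketch. It is not the Fledge shader: the 0.04 dielectric reflectance, the `roughness * roughness` remapping, the Schlick-GGX geometry term and the function name `cook_torrance_ggx` are all assumptions borrowed from common practice (e.g. [2]).

```python
import math

def cook_torrance_ggx(albedo, roughness, metalness, n_dot_l, n_dot_v, n_dot_h, v_dot_h):
    """Return (diffuse_rgb, specular_rgb) BRDF values for one light direction."""
    # Metalness workflow: metals take their specular color (F0) from albedo and have
    # no diffuse term. 0.04 is an assumed reflectance for non-metals.
    dielectric_f0 = 0.04
    f0 = [dielectric_f0 * (1.0 - metalness) + c * metalness for c in albedo]
    diffuse_color = [c * (1.0 - metalness) for c in albedo]

    alpha = roughness * roughness  # assumed perceptual-roughness remapping
    a2 = alpha * alpha
    # GGX (Trowbridge-Reitz) normal distribution term D
    denom = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    d_term = a2 / (math.pi * denom * denom)
    # Smith geometry term G with the Schlick-GGX approximation (assumed variant)
    k = (roughness + 1.0) ** 2 / 8.0
    g_term = (n_dot_l / (n_dot_l * (1.0 - k) + k)) * (n_dot_v / (n_dot_v * (1.0 - k) + k))
    # Schlick Fresnel term F, per color channel
    fresnel = [f + (1.0 - f) * (1.0 - v_dot_h) ** 5 for f in f0]

    scale = d_term * g_term / max(4.0 * n_dot_l * n_dot_v, 1e-6)
    specular = [f * scale for f in fresnel]
    diffuse = [c / math.pi for c in diffuse_color]  # Lambert
    return diffuse, specular
```

With `metalness = 1` the diffuse term vanishes and the specular highlight is tinted by the albedo; with `metalness = 0` the highlight stays neutral.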
● Direct Lighting: 100% dynamic lights
○ 16 shadowmaps rendered into an atlas (4k x 4k to 8k x 8k, D16_FLOAT)
○ If the cap is reached, a shadowmap isn’t updated anymore and effectively becomes static
● Image-based lighting
○ Precomputed, parallax-corrected environment probes (artist placed)
○ Specular probe is a 256x256 cubemap (BC6_UFLOAT) with a GGX-filtered, importance-sampled mip chain [2]
○ Diffuse lighting is simply the 6th mip level of the probe (“incorrect”, but visually equivalent to proper irradiance)
○ IBL pass can be modified with simple, multiplicative “ambient lights” [3]
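The slides do not say which proxy shape the probes use for parallax correction. The sketch below assumes an axis-aligned box proxy, which is one common way to do it: the reflection ray is intersected with the box, and the cubemap is then sampled along the direction from the probe center to the hit point. The function name and parameters are hypothetical.

```python
def parallax_corrected_direction(position, reflection, box_min, box_max, probe_center):
    """Cubemap lookup direction for a shaded point inside an axis-aligned proxy box.

    Assumes `position` lies inside the box and `reflection` is not the zero vector.
    """
    # Distance along the ray to the first box plane it leaves through.
    t_exit = float("inf")
    for p, r, lo, hi in zip(position, reflection, box_min, box_max):
        if r > 0.0:
            t_exit = min(t_exit, (hi - p) / r)
        elif r < 0.0:
            t_exit = min(t_exit, (lo - p) / r)
    hit = [p + r * t_exit for p, r in zip(position, reflection)]
    # Re-aim from the probe capture position instead of the shaded point.
    return [h - c for h, c in zip(hit, probe_center)]
```

For a point away from the probe center the corrected direction differs from the raw reflection vector, which is what keeps reflections anchored to the room geometry.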
● G-Buffer breakdown
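Target (bits) | X | Y | Z | W
RT0 (8:8:8:8) | Albedo R | Albedo G | Albedo B | Material-ID
RT1 (10:10:10:2) | VS Normal X | VS Normal Y | VS Normal Z | -
RT2 (8:8:8:8) | Roughness | Metalness | Occlusion | [shared]
RT3 (16:16) | Motion Vector X | Motion Vector Y | - | -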
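To make the bit budget of this layout concrete, the following sketch packs RT0 and RT1 into 32-bit words on the CPU. The bit order (X in the lowest bits), the `[-1,1]` to `[0,1]` normal remapping and all function names are assumptions; on the GPU the render target format does this conversion.

```python
def quantize(value, max_code):
    """Clamp value to [0,1] and convert it to an integer code in [0, max_code]."""
    return int(round(min(max(value, 0.0), 1.0) * max_code))

def pack_rgba8(r, g, b, a):
    """Pack four [0,1] values at 8 bits each (layout of RT0 and RT2)."""
    return (quantize(r, 255) | (quantize(g, 255) << 8)
            | (quantize(b, 255) << 16) | (quantize(a, 255) << 24))

def pack_rt0(albedo, material_id):
    """Albedo RGB plus an 8-bit material ID, so at most 256 materials per frame."""
    return pack_rgba8(albedo[0], albedo[1], albedo[2], material_id / 255.0)

def pack_normal_10_10_10_2(normal):
    """Remap a unit normal from [-1,1] to [0,1] and store 10 bits per component (RT1)."""
    codes = [quantize(c * 0.5 + 0.5, 1023) for c in normal]
    return codes[0] | (codes[1] << 10) | (codes[2] << 20)

def unpack_normal_10_10_10_2(packed):
    """Inverse of pack_normal_10_10_10_2, up to quantization error."""
    return [((packed >> shift) & 0x3FF) / 1023.0 * 2.0 - 1.0 for shift in (0, 10, 20)]
```

The round trip through 10 bits per component loses at most about one part in a thousand, which is why normals get the wider target while material parameters fit in 8 bits.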
● Material-ID indexes directly into a StructuredBuffer to query per-material data
○ Saves G-Buffer space
● [shared]: meaning depends on per-pixel context
○ Holds mutually exclusive material data
○ Interpretation is selected by per-material data
○ Emissive Mask
■ Defines whether or not to interpret albedo as emissive
■ Emissive combined in final “combine” pass
■ Effectively saves dedicated emissive channel
○ Translucency
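The shared channel can be pictured as follows: the material ID selects a record, and that record decides how the fourth channel of RT2 is read. This is only a sketch; the `MaterialData` fields, the `emissive_intensity` factor and the function name are hypothetical and not taken from the engine.

```python
from dataclasses import dataclass
from enum import Enum

class SharedChannel(Enum):
    EMISSIVE_MASK = 0
    TRANSLUCENCY = 1

@dataclass
class MaterialData:
    """One entry of the per-material buffer (fields are illustrative assumptions)."""
    shared_channel: SharedChannel
    emissive_intensity: float = 0.0

def decode_shared(albedo, material_id, shared_value, materials):
    """Interpret the [shared] G-Buffer channel according to the pixel's material."""
    material = materials[material_id]
    emissive = [0.0, 0.0, 0.0]
    translucency = 0.0
    if material.shared_channel is SharedChannel.EMISSIVE_MASK:
        # Albedo doubles as emissive color, scaled by the per-pixel mask.
        emissive = [c * shared_value * material.emissive_intensity for c in albedo]
    else:
        translucency = shared_value
    return emissive, translucency
```

Because the two interpretations never apply to the same material, one 8-bit channel serves both.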
Clustered Deferred Rendering
● Switched from rasterization-based light volume rendering to a fully compute-based (async) approach
○ Low CPU overhead
■ Light culling runs entirely on GPU
■ Filling a buffer with light info instead of issuing thousands of drawcalls
○ Advantages on GPU
■ No need to fetch G-Buffer for every light
○ Async Compute: Lighting runs in parallel to shadow rendering (at least on consoles)
○ But: many more optimizations are necessary to get better perf
● Environment probes could be rendered in the same pass
○ Currently they are clustered too, but rendered in a separate (pixel shader) pass together with SSR
● Divide view frustum into a 3D grid
○ In our case: 16 x 8 x 24
● Culling: Assign lights to grid cells
○ Upload light culling info to GPU (StructuredBuffer with Position, AABB, etc.)
○ Create list of light indices for each cell (single large uint buffer)
● Dispatch lighting compute shader
○ In fact we dispatch twice: unshadowed and shadowed lights
○ Unshadowed can run in parallel with shadowmap generation
● Cluster information can also be used for forward rendering
○ We do this for our lit transparent objects
○ Simply compute grid cell index for a position and query light list
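The lookup described above can be sketched on the CPU as below. The 16 x 8 x 24 grid comes from the slide; the exponential depth slicing, the offset/count layout of the index buffer and the function names are assumptions, since the talk does not specify them.

```python
import math

GRID = (16, 8, 24)  # cluster counts in x, y, z as stated on the slide

def cluster_index(screen_uv, view_depth, near, far):
    """Flat cluster index for a screen position in [0,1]^2 and a view-space depth."""
    cx = min(int(screen_uv[0] * GRID[0]), GRID[0] - 1)
    cy = min(int(screen_uv[1] * GRID[1]), GRID[1] - 1)
    # Assumed: exponential depth slices between the near and far plane.
    slice_f = math.log(view_depth / near) / math.log(far / near) * GRID[2]
    cz = min(max(int(slice_f), 0), GRID[2] - 1)
    return (cz * GRID[1] + cy) * GRID[0] + cx

def lights_for_cluster(index, offsets, counts, light_indices):
    """Read one cell's light list from the single large index buffer."""
    start = offsets[index]
    return light_indices[start:start + counts[index]]
```

A forward-rendered transparent surface would call `cluster_index` with its own screen position and depth and then loop over the returned light indices, exactly like the deferred compute pass does per pixel.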
Deferred Decals
● Decals play a major role in our environment art
○ Static: Logos/Signs, Material Layers (Sand, Water Puddles, Rust, etc.), Color Variations
○ Dynamic: Blood, Explosion Marks, etc.
● Extremely flexible
● Break up the uniform look of heavily instanced scenes
● Add a lot of large- and small-scale detail
● Modifies G-Buffer by alpha-blending onto it
○ Therefore, lighting is “free” since it’s done afterwards
● Two methods for tangent-space reconstruction
○ Surface Normal (use G-Buffer normal)
○ Planar (use decal projection direction)
● Full PBR support + many per-decal features (additional mask, UV modifiers, etc.)
● Implementation is rasterization-based
○ Rasterize geometry (boxes) for each decal
○ CPU bottleneck with a large number of decals
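What alpha-blending onto the G-Buffer amounts to per texel can be sketched like this. It is a simplified model, assuming a dict of attributes per texel and a single alpha for all attributes; in the real pipeline the blend unit does the interpolation, and the normal can only be renormalized when the lighting pass reads it.

```python
import math

def lerp(a, b, t):
    return a + (b - a) * t

def blend_decal(gbuffer, decal, alpha):
    """Alpha-blend decal attributes over one G-Buffer texel (attribute dicts)."""
    out = {
        "albedo": [lerp(g, d, alpha) for g, d in zip(gbuffer["albedo"], decal["albedo"])],
        "roughness": lerp(gbuffer["roughness"], decal["roughness"], alpha),
        "metalness": lerp(gbuffer["metalness"], decal["metalness"], alpha),
    }
    # Linear interpolation shortens the normal; renormalize before lighting uses it.
    n = [lerp(g, d, alpha) for g, d in zip(gbuffer["normal"], decal["normal"])]
    length = math.sqrt(sum(c * c for c in n)) or 1.0
    out["normal"] = [c / length for c in n]
    return out
```

Since the lighting pass only ever sees the blended result, a decal costs nothing extra at lighting time.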
● Common issue with Deferred Decals: Wrong Mip Selection due to screenspace gradients
○ Problem: Texture leaks around depth discontinuities
○ Common solution: Always use the highest-resolution mip (mip0)
■ Causes flickering in the distance because the texture is undersampled (no mips)
■ Hurts texture cache efficiency
○ Our solution: Use mip0 only at large depth discontinuities
// Sample two 2x2 depth quads around the current pixel
const float4 d0 = depthSampler.Gather(sampler_point_clamp, screenUV, int2(-1, -1));
const float4 d1 = depthSampler.Gather(sampler_point_clamp, screenUV, int2(0, 0));
// Texel naming as drawn on the slide:
//   d0.x d0.y
//   d0.z d0.w
//   d1.x d1.y
//   d1.z d1.w
// dCross holds the four direct neighbors of the center texel dC
const float4 dCross = float4(d0.z, d0.y, d1.y, d1.z);
const float dC = d0.w;
// A depth difference above the threshold marks a discontinuity,
// where screenspace gradients are unreliable
const bool useFirstMip = any(abs(dC.xxxx - dCross) > 0.001);
if (useFirstMip)
    albedoTex.SampleLevel(..); // explicit mip0, no gradients needed
else
    albedoTex.Sample(..);      // regular gradient-based mip selection
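Why the gradients go wrong can be shown with a small CPU-side model of mip selection. The formula below is the usual textbook approximation (log2 of the largest texel footprint per pixel), not the exact hardware rule, and the function name is hypothetical.

```python
import math

def mip_level(uv, uv_right, uv_down, texture_size):
    """Approximate mip level from the UV differences to the right and lower neighbor pixel."""
    ddx = [(a - b) * texture_size for a, b in zip(uv_right, uv)]
    ddy = [(a - b) * texture_size for a, b in zip(uv_down, uv)]
    rho = max(math.hypot(*ddx), math.hypot(*ddy))  # texels covered per pixel step
    return max(0.0, math.log2(rho)) if rho > 0.0 else 0.0

# Smooth surface: neighboring pixels map to neighboring texels, so mip 0 is chosen.
smooth = mip_level((0.5, 0.5), (0.5 + 1 / 1024, 0.5), (0.5, 0.5 + 1 / 1024), 1024)

# Depth discontinuity: the right neighbor projects to a far-away decal UV,
# so the estimated footprint explodes and a very blurry mip is chosen.
edge = mip_level((0.5, 0.5), (0.75, 0.5), (0.5, 0.5 + 1 / 1024), 1024)
```

Because decal UVs are reconstructed from depth, a jump in depth between two pixels of the same quad becomes a jump in UV, and the sampler picks a low-resolution mip along the edge. That is the texture leak the depth test above avoids.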
“Object Decals”
● Alternative Term: “Blend Meshes”
● Alpha-Blend arbitrary meshes on the G-Buffer
○ Artists can create simple plane “decals” with custom UV setup
○ Efficiently add small, high-res details like panels, rivets, LEDs, etc.
○ Also works on skinned objects (e.g. logos on Exo-Gear)
● Render order
○ 1. Base G-Buffer Pass (solid)
○ 2. Object Decal Pass (alpha-blend)
○ 3. Deferred Decal Pass (alpha-blend)
Next Decals / Fledge Gen 4
● “Bindless” Decals
○ Analogous to clustered deferred lighting
■ Culling & rendering happens entirely on the GPU
■ Collect info about all visible decals in a buffer
■ Render all decals before lighting in the same compute shader
○ Decal info stores texture IDs (UINT32) to index directly into DescriptorSet / DescriptorTable
○ Blending not restricted to alpha-blending anymore (linear interpolation)
■ “Geometric” normal blending possible [4]
■ Replacing layered materials with decals is now feasible
○ Availability of interpolated vertex normals in the G-Buffer improves tangent-space reconstruction
○ Currently in active development
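As an example of blending that goes beyond linear interpolation, here is a sketch of reoriented normal mapping, the main technique described in [4]. Whether Fledge Gen 4 uses exactly this variant is not stated in the slides; the sketch assumes unpacked tangent-space unit normals with z pointing away from the surface.

```python
import math

def blend_normals_rnm(base, detail):
    """Reorient the detail normal so that it follows the base normal (after [4])."""
    t = (base[0], base[1], base[2] + 1.0)
    u = (-detail[0], -detail[1], detail[2])
    scale = sum(a * b for a, b in zip(t, u)) / t[2]
    r = [a * scale - b for a, b in zip(t, u)]
    length = math.sqrt(sum(c * c for c in r))
    return [c / length for c in r]
```

A flat base normal `(0, 0, 1)` returns the detail normal unchanged, and a flat detail normal returns the base normal, which a plain lerp of the two vectors does not guarantee.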
Optimizing for Occupancy / GCN
● GCN hardware wants saturated compute units (CUs)
○ A huge lighting shader uses a lot of general-purpose registers if not structured carefully
● Reducing register usage (VGPR/SGPR) can be a huge win
○ Especially for long, ALU heavy shaders such as lighting
○ Minimize register lifetime
○ Look at the data and iterate
■ runtime profilers
■ static shader code analysis statistics
● Goal: at least 40% GCN wave occupancy on PS4 and Xbox One (for the lighting compute shader)
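The link between register count and occupancy can be estimated with a few lines of arithmetic. The constants are the commonly published GCN figures (up to 10 waves per SIMD, 256 VGPRs per lane, allocation in steps of 4); the sketch ignores SGPR and LDS limits, which can lower occupancy further, and the function names are hypothetical.

```python
def gcn_waves_per_simd(vgprs_per_thread, max_waves=10, vgpr_budget=256, granularity=4):
    """Waves that fit on one SIMD when only VGPR usage is the limiting factor."""
    allocated = -(-vgprs_per_thread // granularity) * granularity  # round up to step size
    return min(max_waves, vgpr_budget // allocated)

def gcn_occupancy(vgprs_per_thread):
    """Occupancy as a fraction of the 10-wave maximum."""
    return gcn_waves_per_simd(vgprs_per_thread) / 10
```

Under these assumptions the 40% goal corresponds to 4 waves per SIMD, i.e. a lighting shader that stays at or below 64 VGPRs; a single register more drops it to 3 waves.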
● Example: Split light type loops
○ Different light types use different data
■ Shadowed lights use shadow projection matrices, shadowmaps, etc.
■ Image projectors use image projection matrices, images, etc.
■ Boxlights must check bounds differently
■ etc.
○ Shader can free up register usage if structured well
// Before: one loop, branching on the light type
for each light in lightbuffer
    if light.type == POINT
        // do pointlight calculations
    else if light.type == SPOT
        // do spotlight calculations
    else if light.type == SPOT_SHADOWED
        // do shadowed spotlight calculations
end
// After: one loop per light type
for each light in lightbuffer_point
    // do pointlight calculations
end
for each light in lightbuffer_spot
    // do spotlight calculations
end
for each light in lightbuffer_spot_shadowed
    // do shadowed spotlight calculations
end
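The per-type buffers have to be produced somewhere before the shader runs. A minimal sketch of that binning step is shown below; the dict-based light records, the type strings and the function names are hypothetical, and Python cannot show the register savings, only the data layout that makes the split loops possible.

```python
from collections import defaultdict

def split_by_type(lights):
    """Group light records into one homogeneous buffer per light type."""
    buffers = defaultdict(list)
    for light in lights:
        buffers[light["type"]].append(light)
    return buffers

def shade(lights, shade_fns):
    """Run one tight loop per light type instead of branching inside a single loop."""
    buffers = split_by_type(lights)
    total = 0.0
    for light_type, shade_fn in shade_fns.items():
        for light in buffers.get(light_type, []):
            total += shade_fn(light)
    return total
```

Each inner loop only touches the data its light type needs, which is the property that lets a shader compiler keep fewer registers live at once.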
What’s next?
● Currently working on Fledge Gen 4
○ Always improving tech iteratively
○ Always keep existing systems “alive”
○ Parallel Development of new systems / breaking changes
● Spread knowledge
○ Weekly presentation meeting (tech internal)
● Leap to new APIs
○ Vulkan, DirectX 12
● New low-level renderer design
○ Better match the new APIs (no more state-driven)
○ More low-level control such as explicit resource syncs, GPU memory management, etc.
○ Async-Compute also on PC
○ More C-style, more data-oriented
○ “Do as little as possible during render-loop” aka “prebake as much as we can”
■ Setting DescriptorSets, Map/Unmap GPU memory, etc.
○ Goal: Rendering must not be a CPU performance bottleneck
● Better ingame-profiling for content creators
● Better tools for artists and game designers
● Improving specific rendering subsystems
○ Switch to physically based inverse square falloff (lumen units)
○ Improved IBL system (e.g. split irradiance and specular probes)
○ Unified volumetric fog / lighting (“lit fog” vs. “volumetric lighting”)
○ Bindless Decals
○ New material system
■ More flexibility for custom shaders / FX Materials
■ Better fit for the new rendering backend interface
○ Improved postprocessing, Antialiasing, HDR tonemapping / color correction
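For the planned inverse square falloff, the underlying photometric relation is short enough to write out. The sketch assumes an isotropic point light specified in lumens; the function name and parameters are hypothetical and say nothing about how Gen 4 will expose this.

```python
import math

def point_light_illuminance(luminous_flux_lm, distance_m, n_dot_l=1.0):
    """Illuminance in lux received from an isotropic point light of the given flux."""
    intensity_cd = luminous_flux_lm / (4.0 * math.pi)  # lumen to candela
    return intensity_cd * max(n_dot_l, 0.0) / (distance_m * distance_m)
```

Doubling the distance quarters the received light, with no artist-tuned radius involved.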
Thank you for your attention!
DECK13 is hiring!
● Tools Programmer
● Concept Environment Artist
● VFX Artist
● etc.
@philiphammer0
phammer@deck13.com
linkedin.com/in/philip-hammer-430baa6
Questions?
References
[1] Walter et al., “Microfacet Models for Refraction through Rough Surfaces”
[2] Karis, “Real Shading in Unreal Engine 4”, Siggraph 2013
[3] Schulz, Mader, “Rendering Techniques in Ryse: Son of Rome”, Siggraph 2014
[4] Barré-Brisebois, Hill, “Blending in Detail”, http://blog.selfshadow.com/publications/blending-in-detail/
