Vu Phuong Hoang
ECS Part 1: Introduction to Data-Oriented Design
2018
Have you ever ...
▪ found it too hard to reduce lag?
▪ tried to improve core functions?
▪ been defeated by a heavy loop?
Check out the following experiments.
[Figure: A “simple” loop vs. Frame Rate]
TEST #1
TestCacheMissByOrder.cs
Find the minimum value in a 1000x1000 table
Test #1 - Result
Iterate by row, then by column: 4ms
for (int r = 0; r < ROWS_COUNT; ++r) {
    for (int c = 0; c < COLUMNS_COUNT; ++c) {
        minValue = Math.Min(minValue, table[r][c]);
    }
}
Just swap the loop order: 8ms ???
for (int c = 0; c < COLUMNS_COUNT; ++c) {
    for (int r = 0; r < ROWS_COUNT; ++r) {
        minValue = Math.Min(minValue, table[r][c]);
    }
}
CPU Cache
▪ Loading from cache is faster than loading from RAM
▪ Both data and instructions are loaded into cache
▪ References:
▪ Dogged Determination
▪ Fuzzy Reflection
▪ codeburst.io
[Figure: PS4 data loading latency]
CPU Cache
▪ When a value is read from memory, the values next to it are read too
  → Data is loaded in batches (batch size = one cache line)
▪ A cache line is typically 64 bytes
▪ Data already in cache → cache hit
▪ Data not in cache → cache miss → it must be loaded from slower memory
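Test #1 - Result explain
H: Cache-Hit, M: Cache-Miss
Row by row: a miss loads a whole cache line (16 ints in 64 bytes), so the next reads in that row hit.
for (int r = 0; r < ROWS_COUNT; ++r) {
    for (int c = 0; c < COLUMNS_COUNT; ++c) {
        minValue = Math.Min(minValue, table[r][c]);
    }
}
r1: M H H H ...
r2: M H H H ...
r3: M ...
Column by column: each read lands in a different row, so in this simplified picture every read misses.
for (int c = 0; c < COLUMNS_COUNT; ++c) {
    for (int r = 0; r < ROWS_COUNT; ++r) {
        minValue = Math.Min(minValue, table[r][c]);
    }
}
c1 c2 c3
M  M  M
M  M
M  M
... ...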
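The two traversal orders can be compared with a small harness. This is only a sketch of the idea behind TestCacheMissByOrder.cs, not the author's file: the class name CacheOrderSketch, its method names, and the fixed random seed are assumptions made for illustration. Timings depend on the machine and runtime, so none are quoted here.

```csharp
using System;
using System.Diagnostics;

// Hypothetical harness; names are illustrative, not taken from the slides.
public static class CacheOrderSketch
{
    public const int ROWS_COUNT = 1000;
    public const int COLUMNS_COUNT = 1000;

    // Builds a jagged table filled with pseudo-random values (fixed seed).
    public static int[][] CreateTable()
    {
        var random = new Random(12345);
        var table = new int[ROWS_COUNT][];
        for (int r = 0; r < ROWS_COUNT; ++r)
        {
            table[r] = new int[COLUMNS_COUNT];
            for (int c = 0; c < COLUMNS_COUNT; ++c)
                table[r][c] = random.Next();
        }
        return table;
    }

    // Row by row: consecutive reads touch neighboring ints of one row array.
    public static int MinByRows(int[][] table)
    {
        int minValue = int.MaxValue;
        for (int r = 0; r < ROWS_COUNT; ++r)
            for (int c = 0; c < COLUMNS_COUNT; ++c)
                minValue = Math.Min(minValue, table[r][c]);
        return minValue;
    }

    // Column by column: every read jumps to a different row array.
    public static int MinByColumns(int[][] table)
    {
        int minValue = int.MaxValue;
        for (int c = 0; c < COLUMNS_COUNT; ++c)
            for (int r = 0; r < ROWS_COUNT; ++r)
                minValue = Math.Min(minValue, table[r][c]);
        return minValue;
    }

    // Returns the elapsed milliseconds for one run of the given traversal.
    public static double TimeMs(Func<int[][], int> traversal, int[][] table)
    {
        var watch = Stopwatch.StartNew();
        traversal(table);
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }
}
```

Calling CacheOrderSketch.TimeMs(CacheOrderSketch.MinByRows, table) and then the same with MinByColumns, a few times each so the JIT has warmed up, gives numbers to compare. Both traversals return the same minimum; only the memory access pattern differs.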
Test #1 - Take it further
Iterate 2D (jagged) array: 4ms ???
int[][] table = new int[ROWS_COUNT][];
// table[i] = new int[COLUMNS_COUNT];
Iterate 1D array: 2ms
int CELLS_COUNT = ROWS_COUNT * COLUMNS_COUNT;
int[] flatTable = new int[CELLS_COUNT];
Why? Fragmentation
Swiss-Cheese Memory
▪ Contiguous data is faster to load
▪ The memory allocator places each block wherever it fits
▪ Fragmented memory looks like Swiss cheese
▪ Scattered data leads to cache misses
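One way to keep the table contiguous is to store it in a single 1D array and compute the index by hand. The sketch below assumes every row has the same length; FlatTableSketch and its methods are hypothetical names, not part of the original test.

```csharp
using System;

// Hypothetical helper; names are illustrative, not taken from the slides.
public static class FlatTableSketch
{
    // Converts a (row, column) pair into an index of the flat array.
    public static int IndexOf(int row, int column, int columnsCount)
    {
        return row * columnsCount + column;
    }

    // Copies a jagged table into one contiguous block, row after row.
    // Assumes each row holds exactly columnsCount elements.
    public static int[] Flatten(int[][] table, int columnsCount)
    {
        var flatTable = new int[table.Length * columnsCount];
        for (int r = 0; r < table.Length; ++r)
            Array.Copy(table[r], 0, flatTable, r * columnsCount, columnsCount);
        return flatTable;
    }

    // A single loop over one array: no per-row array to chase.
    public static int Min(int[] flatTable)
    {
        int minValue = int.MaxValue;
        for (int i = 0; i < flatTable.Length; ++i)
            minValue = Math.Min(minValue, flatTable[i]);
        return minValue;
    }
}
```

With this layout, the cell at row r and column c is flatTable[FlatTableSketch.IndexOf(r, c, COLUMNS_COUNT)], and a full scan never leaves the one allocation.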
TEST #2
TestCacheMissByDataSize.cs
Read values from arrays of different element types (10M elements)
Test #2 - Result
Iterate an array of int (4 bytes): 35ms
Iterate an array of struct (32 bytes): 58ms
A bigger struct (36 bytes) is even worse: 60ms
Why?
Answer: CPU cache, again. A 64-byte cache line holds 16 ints but only 2 of the 32-byte structs.
Test #2 - Result explain
Cache Pollution
Unused data is still loaded → less space in the cache line → more cache misses
GameObject in OOP style?
Test #2 - Take it further
Add 1 byte of data to the struct.
Its size goes from 32 to 36 bytes (33 expected).
Why?
Answer: Data alignment
More:
▪ Append 1 more byte: the size stays at 36.
▪ Prepend 1 more byte: the size goes to 40.
Data alignment
▪ Data is placed into 4-byte “buckets” for fast access
▪ When added data doesn’t fit in the current bucket
▪ The next (empty) bucket is used
▪ The wasted, unused bytes = padding
▪ References:
▪ Stdio.vn
▪ Wikipedia
▪ Song Ho Ahn
[Figure: memory layout without data alignment]
TEST #3
TestDataAlignment.cs
Change the order of the data in a struct
Test #3 - Result
Original order: 12 bytes ???
Just re-order the data from biggest to smallest size: 8 bytes
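The 12-byte versus 8-byte result can be reproduced with two small structs. The fields used in TestDataAlignment.cs are not shown in the slides, so the structs below (SmallBigSmall, BigToSmall) are hypothetical stand-ins. Marshal.SizeOf reports the unmanaged size, which for simple sequential structs like these is assumed to match the in-memory layout.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct: a 4-byte field squeezed between two 1-byte fields.
[StructLayout(LayoutKind.Sequential)]
public struct SmallBigSmall
{
    public byte A;   // offset 0, followed by 3 bytes of padding
    public int B;    // offset 4 (an int must start on a 4-byte boundary)
    public byte C;   // offset 8, followed by 3 bytes of trailing padding
}

// Same fields, re-ordered from biggest to smallest.
[StructLayout(LayoutKind.Sequential)]
public struct BigToSmall
{
    public int B;    // offset 0
    public byte A;   // offset 4
    public byte C;   // offset 5, followed by 2 bytes of trailing padding
}

public static class AlignmentSketch
{
    // 12 bytes: 6 bytes of data plus 6 bytes of padding.
    public static int SizeOfSmallBigSmall()
    {
        return Marshal.SizeOf<SmallBigSmall>();
    }

    // 8 bytes: 6 bytes of data plus 2 bytes of padding.
    public static int SizeOfBigToSmall()
    {
        return Marshal.SizeOf<BigToSmall>();
    }
}
```

Both structs carry the same 6 bytes of data; only the field order changes how much padding is inserted.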
Cache-miss
▪ Fastest way to load data: NOT LOADING IT :)
▪ Second-best ways?
▪ Keep data small (if you can't, at least mind data alignment)
▪ Keep data contiguous
▪ Separate data by function
▪ Relational databases sometimes de-normalize for performance, too!
◆ Problem #1: Encapsulation makes this hard to do
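"Separate data by function" can be sketched as pulling the one field a loop needs out of a fat struct into its own array. The Monster struct and HotColdSketch class are hypothetical, and the per-cache-line counts in the comments assume a 32-byte struct and 64-byte cache lines.

```csharp
using System;

// Hypothetical "fat" object: 8 fields x 4 bytes = 32 bytes.
// The loops below only need Health.
public struct Monster
{
    public int Health;
    public float X, Y, Z;
    public float VelocityX, VelocityY, VelocityZ;
    public int Level;
}

public static class HotColdSketch
{
    // Array of structs: a 64-byte cache line carries only 2 Health values,
    // the rest of the line is data this loop never uses.
    public static long TotalHealth(Monster[] monsters)
    {
        long total = 0;
        for (int i = 0; i < monsters.Length; ++i)
            total += monsters[i].Health;
        return total;
    }

    // Separate array: a 64-byte cache line carries 16 Health values.
    public static long TotalHealth(int[] healths)
    {
        long total = 0;
        for (int i = 0; i < healths.Length; ++i)
            total += healths[i];
        return total;
    }

    // Splits the frequently read field out into its own contiguous array.
    public static int[] ExtractHealths(Monster[] monsters)
    {
        var healths = new int[monsters.Length];
        for (int i = 0; i < monsters.Length; ++i)
            healths[i] = monsters[i].Health;
        return healths;
    }
}
```

The trade-off is the one the slide names: the health values now live outside the object that "owns" them, which is exactly what encapsulation discourages.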
Functions are also data
Function call
▪ A function is split into instruction blocks
▪ The CPU looks up these blocks from a table
▪ The CPU loads these blocks into the instruction cache (I$)
▪ Function calls suffer from cache misses, too!
▪ References:
▪ Wikipedia (Instruction Cycle)
▪ Wikipedia (Branch Misprediction)
TEST #4
TestVirtualFunctions.cs
How do overridden functions affect performance?
Test #4 - Result
Direct call: 35ms
1-level indirect call: 61ms
10-level indirect call: 411ms
Function call
▪ Fastest way to call a function: NOT CALLING IT :)
▪ Second-best ways:
▪ Keep high-performance functions small (so they fit in cache)
▪ Keep the class hierarchy narrow
▪ Use 1 function to process multiple instances, not 1 function call for each instance
◆ Problem #2: Encapsulation / Polymorphism makes this hard to do
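The "1 function to process multiple instances" advice can be illustrated by contrasting a per-instance virtual Update with a single batch function over plain arrays. Mover, ConstantMover, and MovementSystem are hypothetical names, and the movement rule is an assumption chosen only to keep the example short.

```csharp
using System;

// Per-instance style: each object is updated through its own virtual call.
public abstract class Mover
{
    public float Position;
    public abstract void Update(float deltaTime);
}

public sealed class ConstantMover : Mover
{
    public float Velocity;

    public override void Update(float deltaTime)
    {
        Position += Velocity * deltaTime;
    }
}

// Batch style: one direct call processes every instance in a tight loop.
public static class MovementSystem
{
    // positions and velocities are parallel arrays of the same length.
    public static void UpdateAll(float[] positions, float[] velocities, float deltaTime)
    {
        for (int i = 0; i < positions.Length; ++i)
            positions[i] += velocities[i] * deltaTime;
    }
}
```

Both versions compute the same result. The batch version makes one call per frame instead of one indirect call per object, and it reads its inputs from contiguous arrays.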
Wait, these are the core of OOP!
Encapsulation + Inheritance + Polymorphism
Other OOP problems
▪ Multiple inheritance
▪ Useful for game development, but bad for architecture
▪ “Diamond of death”
◆ Problem #3: There is no easy way to implement multiple inheritance properly
▪ Unit test
▪ My test uses only some members, but I need to initialize them all!
◆ Problem #4: Unit tests involve unrelated constraints
▪ Jobify, false sharing, ...
Data-Oriented Design
▪ Focus on how data is laid out in memory
▪ Focus on how data is read / processed
▪ Build functions around data
▪ References:
▪ DICE
▪ Mike Acton (Insomniac Games, Unity)
▪ Richard Fabian
▪ Keith O’Connor (Ubisoft Montreal)
“The purpose of all programs, and all parts of those programs, is to transform data from one form to another”
- Mike Acton -
“When there is one, there are many”
- Mike Acton -
“Designing the code around the data, not the other way around”
- Linus Torvalds -
TEST #5
TestGoodEnoughAlgorithms.cs
Find closest object
Test #5 - Result
Iterate Array of “GameObjects”: 209ms
for (int i = 0; i < ELEMENTS_COUNT; ++i) {
    d = GetDistance(center, objects[i].position);
    if (minDistance > d) {
        minDistance = d;
        closestId = i;
    }
}
Iterate Array of positions: 128ms
for (int i = 0; i < ELEMENTS_COUNT; ++i) {
    d = GetDistance(center, positions[i]);
    if (minDistance > d) {
        minDistance = d;
        closestId = i;
    }
}
They’re almost identical, except for line #2
Test #5 - Take it further
▪ You already know DOD is faster (from the previous test results)
▪ Let’s improve the algorithm (current: 209ms)
▪ Use GetSqDistance (squared distance) instead of GetDistance → 137ms
▪ *Eliminate objects that are too far & pick the 1st close-enough object → 36ms
▪ Reduce branch misprediction → 34ms
*Humans need a good-enough choice, not the optimal one.
Test #5 - Take it further
Iterate Array of “GameObjects”: 36ms
for (int i = 0; i < ELEMENTS_COUNT; ++i) {
    d = GetSqDistance(center, objects[i].position);
    if (d > MAX_SQ_DST) continue;
    if (d < MIN_SQ_DST) { closestId = i; break; }
    // ... original comparison here
}
Iterate Array of positions: 25ms
for (int i = 0; i < ELEMENTS_COUNT; ++i) {
    d = GetSqDistance(center, positions[i]);
    if (d > MAX_SQ_DST) continue;
    if (d < MIN_SQ_DST) { closestId = i; break; }
    // ... original comparison here
}
Your smart algorithm + DOD = AWESOME
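The fragments above leave out the declarations and the "original comparison". A self-contained version might look like the sketch below. Vec2, ClosestSketch, and FindCloseEnough are hypothetical names, a 2D position is assumed for brevity, and the thresholds are passed as parameters instead of the MIN_SQ_DST / MAX_SQ_DST constants.

```csharp
using System;

// Hypothetical 2D position type; the slides do not show the real one.
public struct Vec2
{
    public float X;
    public float Y;

    public Vec2(float x, float y)
    {
        X = x;
        Y = y;
    }
}

public static class ClosestSketch
{
    // Squared distance keeps the ordering of distances without a square root.
    public static float GetSqDistance(Vec2 a, Vec2 b)
    {
        float dx = a.X - b.X;
        float dy = a.Y - b.Y;
        return dx * dx + dy * dy;
    }

    // Returns the index of a good-enough close position,
    // or -1 when nothing lies within maxSqDst of the center.
    public static int FindCloseEnough(Vec2 center, Vec2[] positions, float minSqDst, float maxSqDst)
    {
        int closestId = -1;
        float minSqDistance = float.MaxValue;
        for (int i = 0; i < positions.Length; ++i)
        {
            float d = GetSqDistance(center, positions[i]);
            if (d > maxSqDst) continue;                   // too far: ignore
            if (d < minSqDst) { closestId = i; break; }   // close enough: stop early
            if (minSqDistance > d)                        // the original comparison
            {
                minSqDistance = d;
                closestId = i;
            }
        }
        return closestId;
    }
}
```

Because of the early exit, the result is the first position inside the close-enough radius, which is not necessarily the nearest one. That is the deliberate "good enough" trade-off described above.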
Data-Oriented Design
▪ Reduce data cache misses (Problem #1)
▪ Reduce function cache misses and indirect function calls (Problem #2)
▪ Components over inheritance (Problem #3)
▪ Unit test = Feed input & Assert the output (Problem #4)
▪ References:
▪ Games From Within
▪ Tencent
ECS
Entity Component System
Smart DOD Architecture?
Why should we care?
▪ Performance & flexibility
▪ It’s the FUTURE
▪ The top companies mentioned earlier (Insomniac Games, Ubisoft, EA/DICE, ...)
▪ Sony
▪ Intel
▪ Apple
▪ Riot Games
▪ Unity !!!
▪ More ...
These masterpieces also use ECS
Q&A
End of Part 1

ECS (Part 1/3) - Introduction to Data-Oriented Design
