This slide deck was presented at Unite Berlin 2018.
This offline version includes numerous additional slides, cut from the original presentation for brevity and/or time.
These extra slides contain more examples and data, but are not essential for understanding the presentation.
Optimization & Best Practices:
Through The Ages
Ian Dundore
Unity Technologies
This guy again?
Spoiler Alert
• Scripting Performance
• Transforms
• Audio
• Animations
First:
An important message.
Even me. Especially me.
Profile everything.
Remember this?
oops.
• In the specific case of String.Equals, that advice is wrong!
• From a performance perspective, at least.
• For all other string comparisons, it’s right!
• Compare, StartsWith, EndsWith, IndexOf, etc.
• Again, from a performance perspective.
• (Psst! This is documented!)
https://docs.microsoft.com/en-us/dotnet/standard/base-types/best-practices-strings#common-string-comparison-methods-in-net
Let’s test it.
Testing Considerations
• How does the code path differ with different inputs?
• What is the environment around the executing code?
• Runtime
• IL2CPP/Mono? .NET version?
• Hardware
• Pipeline depth, cache size, cache-line length
• # of cores, core affinity settings on threads, throttling
• What exactly is your test measuring?
Your Test Harness Matters!
// Test A: more loop overhead (one call per loop iteration)
Profiler.BeginSample("Test A");
for (int i = 0; i < NUM_TESTS; ++i) {
    DoAThing(i);
}
Profiler.EndSample();

// Test B: less loop overhead (many calls per loop iteration)
int j = 0;
Profiler.BeginSample("Test B");
DoAThing(0);
while (j < NUM_TESTS) {
    DoAThing(++j);
    DoAThing(++j);
    DoAThing(++j);
    // … repeat a lot …
    DoAThing(++j);
}
Profiler.EndSample();
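The same comparison can be sketched outside Unity with System.Diagnostics.Stopwatch. This is only a minimal sketch: HarnessSketch, Sink, TimeRolled, TimeUnrolled and Run are hypothetical names, DoAThing is a trivial stand-in for the call on the slide, and the unrolled variant assumes the test count is a multiple of 4.

```csharp
using System;
using System.Diagnostics;

public static class HarnessSketch
{
    // Accumulator so that the calls have an observable side effect.
    public static long Sink;

    // Hypothetical stand-in for the DoAThing(i) call on the slide.
    public static void DoAThing(int i) { Sink += i; }

    // "Test A" shape: one call per iteration, so the loop's
    // compare/increment/branch cost is paid on every call.
    public static TimeSpan TimeRolled(int numTests)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < numTests; ++i)
            DoAThing(i);
        sw.Stop();
        return sw.Elapsed;
    }

    // "Test B" shape: four calls per iteration, so the loop cost is
    // paid once per four calls. Assumes numTests is a multiple of 4.
    public static TimeSpan TimeUnrolled(int numTests)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < numTests; i += 4)
        {
            DoAThing(i);
            DoAThing(i + 1);
            DoAThing(i + 2);
            DoAThing(i + 3);
        }
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Run()
    {
        const int NumTests = 1000000;
        // Warm-up pass so JIT compilation is not counted in the timings.
        TimeRolled(NumTests);
        TimeUnrolled(NumTests);
        Console.WriteLine("Rolled:   {0:F3} ms", TimeRolled(NumTests).TotalMilliseconds);
        Console.WriteLine("Unrolled: {0:F3} ms", TimeUnrolled(NumTests).TotalMilliseconds);
    }
}
```

Both variants do the same work, so any gap between the two numbers is harness overhead rather than the cost of DoAThing. An optimizing compiler may inline or unroll on its own, so treat the output as an illustration, not a measurement.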
public bool Equals(String value) {
    if (this == null)
        throw new NullReferenceException();
    if (value == null)
        return false;
    if (Object.ReferenceEquals(this, value))
        return true;
    if (this.Length != value.Length)
        return false;
    return EqualsHelper(this, value);
}
Mono’s String.cs (1)
What does EqualsHelper do?
• Uses unsafe code to pin strings to memory addresses.
• C-style integer comparison of raw bytes of the strings.
• Core is a special cache-friendly loop.
• 64-bit: Step through strings with a stride of 12 characters (24 bytes).
while (length >= 12)
{
    if (*(long*)a != *(long*)b) return false;
    if (*(long*)(a+4) != *(long*)(b+4)) return false;
    if (*(long*)(a+8) != *(long*)(b+8)) return false;
    a += 12; b += 12; length -= 12;
}
public bool Equals(String value, StringComparison comparisonType) {
if (comparisonType < StringComparison.CurrentCulture ||
comparisonType > StringComparison.OrdinalIgnoreCase)
throw new ArgumentException(…);
Contract.EndContractBlock();
if ((Object)this == (Object)value) {
return true;
}
if ((Object)value == null) {
return false;
}
Mono’s String.cs (2)
switch (comparisonType) {
case StringComparison.CurrentCulture:
return (CultureInfo.CurrentCulture.CompareInfo.Compare(this,
value, CompareOptions.None) == 0);
case StringComparison.CurrentCultureIgnoreCase:
return (CultureInfo.CurrentCulture.CompareInfo.Compare(this,
value, CompareOptions.IgnoreCase) == 0);
case StringComparison.InvariantCulture:
return (CultureInfo.InvariantCulture.CompareInfo.Compare(this,
value, CompareOptions.None) == 0);
case StringComparison.InvariantCultureIgnoreCase:
return (CultureInfo.InvariantCulture.CompareInfo.Compare(this,
value, CompareOptions.IgnoreCase) == 0);
Mono’s String.cs (3)
case StringComparison.Ordinal:
if (this.Length != value.Length)
return false;
return EqualsHelper(this, value);
Mono’s String.cs (4)
But wait!
• For non-matching strings, length will often differ.
• But for length-invariant strings, first character usually differs.
• This optimization is found in CompareOrdinal, but not Equals.
public static int CompareOrdinal(String strA, String strB) {
    if ((Object)strA == (Object)strB)
        return 0;
    if (strA == null)
        return -1;
    if (strB == null)
        return 1;
    // Most common case, first character is different.
    if ((strA.m_firstChar - strB.m_firstChar) != 0)
        return strA.m_firstChar - strB.m_firstChar;
    return CompareOrdinalHelper(strA, strB);
}
This is getting silly.
public static int CompareOrdinal(String strA, int indexA,
                                 String strB, int indexB, int length) {
    if (strA == null || strB == null) {
        if ((Object)strA == (Object)strB) { // they're both null
            return 0;
        }
        return (strA == null) ? -1 : 1; // -1 if A is null, 1 if B is null.
    }
    return nativeCompareOrdinalEx(strA, indexA, strB, indexB, length);
}
An overload that goes almost directly to native code!
Test Design: 4 cases
• Case 1: Two identical strings.
• Case 2: Two strings of random characters of same length.
• Case 3: Two strings of random characters of same length.
• First characters identical, to bypass the first-character check in CompareOrdinal.
• Case 4: Two strings of random characters, different lengths.
• Comparison’s worst case is bounded by the shorter string.
• Constrained range to 15-25 characters to be similar to above tests.
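The four cases above could be generated along these lines. This is a sketch under assumptions: StringCaseSketch and its method names are hypothetical, the alphabet is arbitrary, and Case 1 is assumed to use two distinct string objects with equal content, so that the reference-equality shortcut in Equals is not what gets measured.

```csharp
using System;

public static class StringCaseSketch
{
    const string Alphabet = "abcdefghijklmnopqrstuvwxyz";

    public static string RandomString(Random rng, int length)
    {
        var chars = new char[length];
        for (int i = 0; i < length; ++i)
            chars[i] = Alphabet[rng.Next(Alphabet.Length)];
        return new string(chars);
    }

    // Case 1: identical content in two distinct string objects.
    public static string[] IdenticalPair(Random rng, int length)
    {
        string a = RandomString(rng, length);
        return new[] { a, new string(a.ToCharArray()) };
    }

    // Case 2: random content, same length.
    public static string[] RandomPair(Random rng, int length)
    {
        return new[] { RandomString(rng, length), RandomString(rng, length) };
    }

    // Case 3: random content, same length, first character forced equal.
    public static string[] SameFirstCharPair(Random rng, int length)
    {
        string a = RandomString(rng, length);
        string b = a[0] + RandomString(rng, length - 1);
        return new[] { a, b };
    }

    // Case 4: random content, two different lengths in the 15-25 range.
    public static string[] DifferentLengthPair(Random rng)
    {
        int lenA = rng.Next(15, 26); // upper bound is exclusive
        int lenB;
        do { lenB = rng.Next(15, 26); } while (lenB == lenA);
        return new[] { RandomString(rng, lenA), RandomString(rng, lenB) };
    }
}
```

Passing a seeded Random (for example new Random(1)) keeps the inputs repeatable between runs, which matters when comparing runtimes against each other.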
Mono 3.5
Method                            Identical Content  Random Content     First Char Equal   Random Content
                                  Identical Length   Identical Length   Identical Length   Random Length
String.Equals                     2.97               1.75               1.73               1.30
String.Equals with Ordinal type   5.87               3.46               3.56               3.39
String.Compare                    37.52              33.29              64.66              31.35
String.Compare with Ordinal type  6.23               3.35               3.35               3.26
CompareOrdinal                    5.68               3.10               3.18               2.99
CompareOrdinal with Indices       5.53               3.30               3.42               3.95
Simple Hand-Coded                 5.46               1.75               2.18               1.40
100,000 comparisons. Timings in milliseconds.
Unity 2018.1.0f2, Windows Standalone, Mono 3.5, Core i7-3500K
Mono 4.6
Method                            Identical Content  Random Content     First Char Equal   Random Content
                                  Identical Length   Identical Length   Identical Length   Random Length
String.Equals                     3.23               1.80               1.82               1.21
String.Equals with Ordinal type   3.84               2.13               2.03               1.38
String.Compare                    34.72              28.70              63.03              29.74
String.Compare with Ordinal type  5.16               1.75               2.68               1.65
CompareOrdinal                    4.93               1.55               2.21               1.40
CompareOrdinal with Indices       4.77               3.59               3.59               4.41
Simple Hand-Coded                 4.40               1.66               1.95               1.28
100,000 comparisons. Timings in milliseconds.
Unity 2018.1.0f2, Windows Standalone, Mono 4.6, Core i7-3500K
IL2CPP
Method                            Identical Content  Random Content     First Char Equal   Random Content
                                  Identical Length   Identical Length   Identical Length   Random Length
String.Equals                     2.61               1.26               1.27               0.95
String.Equals with Ordinal type   5.38               3.80               3.84               3.66
String.Compare                    39.12              29.32              60.56              28.01
String.Compare with Ordinal type  4.84               3.58               3.62               3.52
CompareOrdinal                    4.78               3.55               3.58               3.51
CompareOrdinal with Indices       4.93               3.71               3.72               4.17
Simple Hand-Coded                 13.83              3.52               3.93               2.16
100,000 comparisons. Timings in milliseconds.
Unity 2018.1.0f2, Windows Standalone, IL2CPP 3.5, Core i7-6700K
IL2CPP
Method                            Identical Content  Random Content     First Char Equal   Random Content
                                  Identical Length   Identical Length   Identical Length   Random Length
String.Equals                     2.64               1.92               1.93               0.96
String.Equals with Ordinal type   2.94               2.26               2.73               1.49
String.Compare                    40.98              30.61              60.82              29.26
String.Compare with Ordinal type  3.18               1.46               2.29               1.32
CompareOrdinal                    2.99               1.18               2.06               1.12
CompareOrdinal with Indices       5.56               3.93               4.08               4.41
Simple Hand-Coded                 14.14              3.78               4.14               2.35
100,000 comparisons. Timings in milliseconds.
Unity 2018.1.0f2, Windows Standalone, IL2CPP 4.6, Core i7-6700K
[Chart: raw timing data.]
[Chart: timings normalized so that String.Equals/Random on Mono 3.5 = 1.]
Conclusions & more questions
• String.Equals clearly wins for plain string comparison.
• .NET 4.6 has improvements for String.Compare variants.
• Ordinal comparisons clearly win on culture-sensitive APIs.
• Use String.CompareOrdinal instead of String.Compare.
• Use StringComparison.Ordinal on other String APIs.
• How does this map across platforms?
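In code, the conclusions above come down to passing StringComparison.Ordinal to the culture-sensitive string APIs and leaving plain String.Equals alone. A minimal sketch; OrdinalSketch and its method names are hypothetical wrappers, the calls inside them are standard .NET string APIs.

```csharp
using System;

public static class OrdinalSketch
{
    // Plain equality: the parameterless String.Equals path is already ordinal.
    public static bool Same(string a, string b)
    {
        return string.Equals(a, b);
    }

    // Ordering: CompareOrdinal instead of the culture-sensitive String.Compare.
    public static int Order(string a, string b)
    {
        return string.CompareOrdinal(a, b);
    }

    // Other string APIs: pass StringComparison.Ordinal explicitly.
    public static bool HasPrefix(string s, string prefix)
    {
        return s.StartsWith(prefix, StringComparison.Ordinal);
    }

    public static bool HasSuffix(string s, string suffix)
    {
        return s.EndsWith(suffix, StringComparison.Ordinal);
    }

    public static int Find(string s, string value)
    {
        return s.IndexOf(value, StringComparison.Ordinal);
    }
}
```

Ordinal comparison is only a drop-in replacement when the strings are identifiers, paths, keys and similar machine-readable data. Text that is sorted or matched for display to users may still need culture-sensitive rules.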
IL2CPP
Method                            Identical Content  Random Content     First Char Equal   Random Content
                                  Identical Length   Identical Length   Identical Length   Random Length
String.Equals                     13.48              5.08               5.01               5.26
String.Equals with Ordinal type   25.42              19.46              19.85              14.16
String.Compare                    118.80             128.69             254.30             124.81
String.Compare with Ordinal type  24.23              11.49              11.57              10.95
CompareOrdinal                    23.92              11.09              11.54              10.75
CompareOrdinal with Indices       23.79              14.76              18.62              15.05
Simple Hand-Coded                 58.02              12.04              21.86              8.13
100,000 comparisons. Timings in milliseconds.
Unity 2018.1.0f2, iOS, IL2CPP 3.5, iPad Mini 3
[Chart: timings normalized so that String.Equals/Random on given platform = 1.]
Very similar results, in this case.
Another tip!
• See a lot of time going to NullCheck in IL2CPP builds?
• Disable these checks in release builds!
• Works on types, methods & properties.
• Code is in IL2CppSetOptionAttribute.cs, under Unity install folder
[Il2CppSetOption(Option.NullChecks, false)]
public bool MyEquals(String strA, String strB) {
// …
}
IL2CPP
                    Identical Content  Random Content
                    Identical Length   Random Length
Normal              58.02              8.13
NullCheck Disabled  53.02              7.03
100,000 comparisons. Timings in milliseconds.
Unity 2018.1.0f2, iOS, IL2CPP 3.5, iPad Mini 3
Small, but helpful.
Transforms
* no, not this kind of transform
5.3: Discrete Objects
Hierarchy: A, B, C, D
Memory: A, D, C, B
OnTransformChanged
• Internal message, broadcast each time a Transform changes
• Position, rotation, scale, parent, sibling order, etc.
• Tells other components to update their internal state
• PhysX/Box2D, UnityUI, Renderers (AABBs), etc.
• Repeated messages can cause performance problems
• Use Transform.SetPositionAndRotation (5.6+)
5.4+: Contiguous buffers
Hierarchy: A, B, C, D
Memory: A, B, C, D (TransformHierarchy structure)
Enter the Dispatch
• TransformChangeDispatch was first introduced in 5.4
• Other systems migrated to use it, slowly.
• Renderers in 5.6
• Animators in 2017.1
• Physics in 2017.2
• RectTransforms in 2017.3
• OnTransformChanged was removed entirely in 2018.1
How Transforms are structured
• 1 TransformHierarchy structure represents a root Transform
• Contains buffers tracking data of all transforms in hierarchy
• TRS, indices for parents & siblings
• Interest bitmask & dirty bitmask
• Internal systems register interest & track state via specific bits
• Physics is one bit, renderer is another bit, etc.
• System walks affected parts of TransformHierarchy structure
• dirtyMask |= -1 & interestMask
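The interest/dirty bookkeeping can be illustrated with a toy model. This is a sketch, not Unity's internal code: the bit assignments, the TransformState struct and the method names are hypothetical, and clearing a system's bit once it has consumed the change is an assumption made for the example. Only the dirtyMask |= -1 & interestMask step comes from the slide.

```csharp
using System;

public static class DirtyMaskSketch
{
    // Hypothetical bit assignments, one bit per interested system.
    public const int PhysicsBit  = 1 << 0;
    public const int RendererBit = 1 << 1;
    public const int AnimatorBit = 1 << 2;

    public struct TransformState
    {
        public int InterestMask; // systems that registered interest
        public int DirtyMask;    // systems that have not yet seen the latest change
    }

    // On a change, every interested system's bit is set dirty.
    // -1 has all bits set, so the AND keeps exactly the interested bits.
    public static void MarkChanged(ref TransformState t)
    {
        t.DirtyMask |= -1 & t.InterestMask;
    }

    // A system asks whether its bit is dirty; the bit is cleared as it
    // consumes the change (an assumption for this sketch).
    public static bool ConsumeDirty(ref TransformState t, int systemBit)
    {
        bool wasDirty = (t.DirtyMask & systemBit) != 0;
        t.DirtyMask &= ~systemBit;
        return wasDirty;
    }
}
```

With InterestMask = PhysicsBit | RendererBit, one MarkChanged call leaves both bits dirty. Physics consuming its bit does not affect what the renderer sees later, and a system that never registered interest never sees a dirty bit.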
When are async changes applied?
• TCD keeps a list of dirty TransformHierarchy pointers
• Systems request list of changed Transforms before running.
• e.g. Before FixedUpdate, before rendering, before animating.
• Use list to update internal system state.
• TCD iterates over list of dirty TransformHierarchies.
• Iterates over all Transforms to check each dirty bit.
Unity's Evolving Best Practices
Quick Reminder
• Buffer size: Transform.hierarchyCapacity
• Set before mass reparenting operations!
• Reparent & reposition during instantiate!
• GameObject.Instantiate( prefab, parent );
• GameObject.Instantiate( prefab, position, rotation, parent );
Split your hierarchies.
• Changing any Transform marks the whole Hierarchy dirty.
• Dirty hierarchies must be fully examined for change bits.
• Smaller hierarchies = more granular Hierarchy tracking.
• Smaller hierarchies = fewer Transforms to check.
• Fewer roots = more Transforms to check for changes.
• Change checks are jobified, but operate on roots.
Extreme cases.
Parented: 100 Root GameObjects, each with 1000 empty GameObjects + 1 Cube w/ Rotation script
Unparented: 100,000 empty GameObjects + 100 Cubes w/ Rotation script
A welcome effect.
                Parented  Unparented
Main Thread     553 ms    32 ms
Worker Threads  139 ms    14 ms
100 Rotating Cubes, 100k Empty GameObjects.
iPad Mini 3. CPU time used over 10 seconds.
This is just checking hierarchies!
             Parented  Unparented
Main Thread  1.77 ms   0.11 ms
100 Rotating Cubes, 100k Empty GameObjects.
iPad Mini 3. CPU time used over 10 seconds.
“PostLateUpdate.UpdateAllRenderers”
Transforms & Physics: 2017.2+
• 2017.1/older: Physics components were synced to Transforms.
• Each Transform change = expensive update of Physics scene.
• 2017.2/newer: Updates can be delayed to next FixedUpdate.
• Update Physics entities from set of changed Transforms.
• Re-indexing computations are batched.
• This could have side-effects!
• Move a Collider + immediately Raycast towards it? No bueno.
Physics.AutoSyncTransforms
• When true, forces legacy behavior.
• Colliders/Rigidbodies check for syncs on every Physics call.
• Yes, every Raycast, Spherecast, etc.
• Huge performance regression, if you’re not batching updates.
• When false, uses delayed-update behavior.
• Can force updates: Physics.SyncTransforms
• Default value is true in 2017.2 through 2018.2
• 2018.3 is the first version where the default is false.
void Update()
{
    float rotAmt = 2f * Time.deltaTime;
    Vector3 up = Vector3.up;
    if (batched)
    {
        // "Batched": apply every Transform change first, then issue every Raycast.
        for (int i = 0; i < NUM_PARENTS; ++i)
            rotators[i].Rotate(up, rotAmt);
        for (int i = 0; i < NUM_PARENTS; ++i)
            Physics.Raycast(Vector3.zero, Random.insideUnitSphere);
    }
    else
    {
        // "Immediate": follow each Transform change directly with a Raycast.
        for (int i = 0; i < NUM_PARENTS; ++i)
        {
            rotators[i].Rotate(up, rotAmt);
            Physics.Raycast(Vector3.zero, Random.insideUnitSphere);
        }
    }
}
A test.
Seriously, a big effect.
         Parented   Unparented  Parented  Unparented
         Immediate  Immediate   Batched   Batched
Script   4450 ms    4270 ms     1980 ms   882 ms
Physics  1410 ms    1820 ms     1840 ms   1770 ms
100 Rotating Cubes, Rigidbodies, Trigger Box Colliders. 100k Empty GameObjects.
App Framerate: 30. Physics Timestep 0.04 sec.
iPad Mini 3. CPU time used over 10 seconds.
Audio
The Basics
• Unity uses FMOD internally.
• Audio decoding & playback occurs on separate threads.
• Unity supports a handful of codecs.
• PCM
• ADPCM
• Vorbis
• MP3
Audio “Load Type” Setting
• Decompress On Load
• Decoding & file I/O happen at load time only.
• Compressed In Memory
• Decoding happens during playback.
• Streamed
• File I/O & decoding happen during playback.
Every frame…
• Unity iterates over all active Audio Sources.
• Calculates distance to Listener(s).
• FMOD mixes active Audio Sources (“voices”).
• True volume = Volume setting * distance to listener * clip.
• If the clip is compressed, FMOD must decode audio data
• Chooses X loudest voices to mix together.
• X = “Real Voices” audio setting.
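The "X loudest voices" selection can be pictured with a toy model. This is only an illustration of the idea, not FMOD's or Unity's actual algorithm: the Voice struct, the Attenuation field (standing in for the distance term) and PickRealVoices are all hypothetical, and the clip's own amplitude is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VoiceSketch
{
    public struct Voice
    {
        public string Name;
        public float Volume;      // the Volume setting, 0..1
        public float Attenuation; // hypothetical distance falloff factor, 0..1
    }

    // Rank voices by effective loudness and keep the loudest `realVoices`.
    // Everything past the cut is not mixed.
    public static List<string> PickRealVoices(IEnumerable<Voice> voices, int realVoices)
    {
        return voices
            .OrderByDescending(v => v.Volume * v.Attenuation)
            .Take(realVoices)
            .Select(v => v.Name)
            .ToList();
    }
}
```

Note that the ranking has to look at every active voice before any of them can be discarded, which is why the number of active Audio Sources matters even when only a few voices are finally mixed.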
Everything is done in software.
• Decoding & mixing are done entirely in software.
• Mixing occurs on the FMOD thread.
• Decoding occurs at loading time or on the FMOD thread.
• All playing voices are evaluated and mixed.
• Max number of voices is controlled by Audio settings.
A trap.
This voice is Muted.
This voice is Active.
This voice will not be heard,
but the Clip must be processed.
A warning.
• AudioSystem.Update is Unity updating the AudioSources which are submitted to FMOD for playback.
• Audio decoding does not show up in the Unity CPU Profiler.
Check both places!
• Decoding & mixing audio is in the details of the Audio profiler.
Audio CPU usage by codec.
        10 Voices  100 Voices  500 Voices
PCM     1.5%       5.0%        5.7%
ADPCM   5.2%       16.6%       11.6%
MP3     13.3%      35.0%       23.3%
Vorbis  12.5%      30.3%       21.2%
Identical AudioClip, multiple AudioSources. MP3 & Vorbis Quality = 100.
WTF?
Oh. Profiler interference.
~Test time~ <(^^<) (>^^)>
• Identical 4 minute audio clip, copied 4 times.
• Once per codec under test.
• Varying number of AudioSources.
• Captured CPU time on main & FMOD threads
• Sum of CPU time consumed over 10 seconds real-time
Again.
        10 Clips  100 Clips  500 Clips
PCM     95 ms     467 ms     2040 ms
ADPCM   89 ms     474 ms     2070 ms
MP3     84 ms     469 ms     2030 ms
Vorbis  93 ms     473 ms     1990 ms
CPU time on main thread, 10 seconds real-time.
With intensity.
        10 Voices  100 Voices  500 Voices
PCM     214 ms     451 ms      634 ms
ADPCM   485 ms     1391 ms     1591 ms
MP3     1058 ms    4061 ms     4167 ms
Vorbis  1161 ms    3408 ms     3629 ms
CPU time on all FMOD threads, 10 seconds real-time.
Principles.
• Avoid having many audio sources set to Mute.
• Disable/Stop instead of Mute, if possible.
• If you can afford the memory overhead, Decompress on Load.
• Best for short clips that are frequently played.
• Avoid playing lots of compressed Clips, especially on mobile.
Or clamp the voice count.
        10 Playing Clips  100 Playing Clips  500 Playing Clips
512 VV  318 ms            923 ms             2708 ms
100 VV  304 ms            905 ms             1087 ms
10 VV   315 ms            350 ms             495 ms
1 VV    173 ms            210 ms             361 ms
PCM. CPU time on FMOD + Main threads, 10 seconds real-time.
How, you ask?
public void SetNumVoices(int nv) {
    var config = AudioSettings.GetConfiguration();
    if (config.numVirtualVoices == nv)
        return;
    config.numVirtualVoices = nv;
    config.numRealVoices = Mathf.Clamp(config.numRealVoices,
                                       1, config.numVirtualVoices);
    AudioSettings.Reset(config);
}
Just an example! Probably too simple for real use.
Animation & Animator
Animator
• Formerly called Mecanim.
• Graph of logical states.
• Blends between states.
• States contain animation clips and/or blend trees.
• Animator component attached to GameObject
• AnimatorController referenced by Animator component.
Playables
• Technology underlying Animator & Timeline.
• Generic framework for “stuff that can be played back”.
• Animation clips, audio clips, video clips, etc.
• Docs: https://docs.unity3d.com/Manual/Playables.html
Animation
• Unity’s original animation system.
• Custom code
• Not based on Playables.
• Very simple: plays an animation clip.
• Can crossfade, loop.
Let’s test it.
The Test
• 100 GameObjects with Animator or Animation component
• Animator uses simple AnimatorController: 1 state, looping
• Animation plays back 1 AnimationClip, looping
[Chart: Time per Frame (0-30 ms) vs. curve count (1-800), Animation vs. Animator. 100 Components, Variable Curve Count, iPad Mini 3.]
How do cores affect it?
[Chart: Time per Frame (0-3 ms) vs. curve count (1-800), Animation vs. Animator, with the crossover on iPad Mini 3 marked. 100 Components, Variable Curve Count, Win10/Core i7.]
[Chart: Time per Frame (0-3 ms) vs. component count (1-500), Animation vs. Animator. 100 Curves, Variable Component Count, Win10/Core i7.]
Scaling Factors
• Performance is heavily dependent on curve & core count.
• Fewer cores: Animation retains advantage longer.
• More cores: Animator rapidly outperforms Animation.
• Both systems scale linearly as number of Components rises.
• “Best” system determined by target hardware vs curve count.
• Use Animation for simple animations.
• Use Animators for high curve counts or complex scenarios.
[Chart: Time per Frame (0-40 ms) vs. component count (1-500), Animation vs. Animator. 100 Curves, Variable Component Count, iPad Mini 3.]
What about “constant” curves?
• Still interpolated at runtime.
• No measurable impact on CPU usage.
• Significant memory/file savings.
• Example: 11kb vs. 3.7kb for 100 position curves (XYZ)
What about Animator’s cool features?
Be careful with Layers!
• The active state on each layer will be evaluated once per frame.
• Layer Weight does not matter.
• Weight=0? Still evaluated!
• This is to ensure that state is correct.
• Zero-weight layers = wasted work
• Use layers sparingly!
(Yes, the docs are wrong.)
The Cost of Layering
           1 Layer   2 Layers  3 Layers  4 Layers  5 Layers
Aggregate  1966 ms   2260 ms   2510 ms   2690 ms   2890 ms
Per Frame  10.08 ms  11.77 ms  12.86 ms  14.31 ms  17.65 ms
50 x “Ellen” from 3D Gamekit. Unity 2018.1.0f2.
Main Thread CPU time consumed during 10 Seconds Realtime.
iPad Mini 3.
What about Layer Masks?
Nope.
50 x “Ellen” from 3D Gamekit. Layers 2-5 Masked.
Main Thread CPU time consumed during 10 Seconds Realtime.
Unity 2018.1.0f2. iPad Mini 3.
          1 Layer  2 Layers  3 Layers  4 Layers  5 Layers
Unmasked  1966 ms  2260 ms   2510 ms   2690 ms   2890 ms
Masked    1992 ms  2230 ms   2530 ms   2740 ms   2920 ms
Use the right rig!
• The Humanoid rig runs IK & retargeting calculations.
• The Generic rig does not.
          1 Layer  2 Layers  3 Layers  4 Layers  5 Layers
Generic   1966 ms  2260 ms   2510 ms   2690 ms   2890 ms
Humanoid  2775 ms  3210 ms   3510 ms   3730 ms   4020 ms
Identical test to previous slide, different Rig import settings.
The pooling problem
• Animators reset their state when their GameObject is disabled.
• The only workaround? Disable Animator component, not GameObject.
• Leads to messy side effects, like having to manage other components (e.g. Colliders/Rigidbodies) manually.
• This made Animator-driven objects difficult to pool.
There’s an API to fix it, now!
• Animator.keepControllerStateOnDisable
• Available in 2018.1+
• If true, Animators do not discard data buffers when their GameObject is disabled.
• Awesome for pooling!
• Careful of the higher memory usage of disabled Animators!
One Last Thing.
Thank these people!
Danke Schön!
Fragen?
Thank you!
Questions?

More Related Content

PDF
Unreal Open Day 2017 Optimize in Mobile UI
PDF
【UE4.25 新機能】ロードの高速化機能「IOStore」について
PPTX
大規模ゲーム開発における build 高速化と安定化
PDF
【Unite Tokyo 2018】『崩壊3rd』開発者が語るアニメ風レンダリングの極意
PDF
【CEDEC2017】Unityを使ったNintendo Switch™向けのタイトル開発・移植テクニック!!
PDF
目指せ脱UE4初心者!?知ってると開発が楽になる便利機能を紹介 - DataAsset, Subsystem, GameplayAbility編 -
PPTX
Built for performance: the UIElements Renderer – Unite Copenhagen 2019
Unreal Open Day 2017 Optimize in Mobile UI
【UE4.25 新機能】ロードの高速化機能「IOStore」について
大規模ゲーム開発における build 高速化と安定化
【Unite Tokyo 2018】『崩壊3rd』開発者が語るアニメ風レンダリングの極意
【CEDEC2017】Unityを使ったNintendo Switch™向けのタイトル開発・移植テクニック!!
目指せ脱UE4初心者!?知ってると開発が楽になる便利機能を紹介 - DataAsset, Subsystem, GameplayAbility編 -
Built for performance: the UIElements Renderer – Unite Copenhagen 2019

What's hot (20)

PPTX
Best practices: Async vs. coroutines - Unite Copenhagen 2019
PPTX
UE4におけるLoadingとGCのProfilingと最適化手法
PDF
레퍼런스만 알면 언리얼 엔진이 제대로 보인다
PDF
Epic Online Services でできること
PPTX
【Unity道場スペシャル 2017博多】クォータニオン完全マスター
PDF
徹底解説!UE4を使ったモバイルゲーム開発におけるコンテンツアップデートの極意!
PDF
Unity2018/2019における最適化事情
PDF
わからないまま使っている?UE4 の AI の基本的なこと
PPTX
OpenVRやOpenXRの基本的なことを調べてみた
PDF
나만의 엔진 개발하기
PDF
そう、UE4ならね。あなたのモバイルゲームをより快適にする沢山の冴えたやり方について Part 2 <Texture Streaming, メモリプロ...
PDF
UE4 Performance and Profiling | Unreal Dev Day Montreal 2017 (日本語訳)
PDF
Unity エディタ拡張
PPTX
빌드 속도를 올려보자
PPTX
マテリアルとマテリアルインスタンスの仕組みと問題点の共有 (Epic Games Japan: 篠山範明) #UE4DD
PDF
【UE4.25 新機能】新しいシリアライゼーション機能「Unversioned Property Serialization」について
PDF
【CEDEC2018】CPUを使い切れ! Entity Component System(通称ECS) が切り開く新しいプログラミング
PDF
【Unite 2018 Tokyo】そろそろ楽がしたい!新アセットバンドルワークフロー&リソースマネージャー詳細解説
Best practices: Async vs. coroutines - Unite Copenhagen 2019
UE4におけるLoadingとGCのProfilingと最適化手法
레퍼런스만 알면 언리얼 엔진이 제대로 보인다
Epic Online Services でできること
【Unity道場スペシャル 2017博多】クォータニオン完全マスター
徹底解説!UE4を使ったモバイルゲーム開発におけるコンテンツアップデートの極意!
Unity2018/2019における最適化事情
わからないまま使っている?UE4 の AI の基本的なこと
OpenVRやOpenXRの基本的なことを調べてみた
나만의 엔진 개발하기
そう、UE4ならね。あなたのモバイルゲームをより快適にする沢山の冴えたやり方について Part 2 <Texture Streaming, メモリプロ...
UE4 Performance and Profiling | Unreal Dev Day Montreal 2017 (日本語訳)
Unity エディタ拡張
빌드 속도를 올려보자
マテリアルとマテリアルインスタンスの仕組みと問題点の共有 (Epic Games Japan: 篠山範明) #UE4DD
【UE4.25 新機能】新しいシリアライゼーション機能「Unversioned Property Serialization」について
【CEDEC2018】CPUを使い切れ! Entity Component System(通称ECS) が切り開く新しいプログラミング
【Unite 2018 Tokyo】そろそろ楽がしたい!新アセットバンドルワークフロー&リソースマネージャー詳細解説
Ad

Similar to Unity's Evolving Best Practices (20)

PPTX
String in .net
PPTX
Core C# Programming Constructs, Part 1
PPTX
Effective C#
PPT
Csphtp1 15
PPTX
13string in c#
PPT
13 Strings and text processing
PPT
Strings Arrays
PDF
Functions for nothing, and your tests for free
PDF
IL2CPP: Debugging and Profiling
PPT
Introduction To C#
PPTX
PPT
Csharp4 operators and_casts
PDF
"Optimization of a .NET application- is it simple ! / ?", Yevhen Tatarynov
PPTX
Puzles C#
PPTX
C# Today and Tomorrow
PDF
Week 9 IUB c#
PDF
Building .NET-based Applications with C#
PPTX
F# Eye For The C# Guy - Seattle 2013
PDF
CallSharp: Automatic Input/Output Matching in .NET
PDF
Highly Strung
String in .net
Core C# Programming Constructs, Part 1
Effective C#
Csphtp1 15
13string in c#
13 Strings and text processing
Strings Arrays
Functions for nothing, and your tests for free
IL2CPP: Debugging and Profiling
Introduction To C#
Csharp4 operators and_casts
"Optimization of a .NET application- is it simple ! / ?", Yevhen Tatarynov
Puzles C#
C# Today and Tomorrow
Week 9 IUB c#
Building .NET-based Applications with C#
F# Eye For The C# Guy - Seattle 2013
CallSharp: Automatic Input/Output Matching in .NET
Highly Strung
Ad

More from Unity Technologies (20)

PDF
Build Immersive Worlds in Virtual Reality
PDF
Augmenting reality: Bring digital objects into the real world
PDF
Let’s get real: An introduction to AR, VR, MR, XR and more
PDF
Using synthetic data for computer vision model training
PDF
The Tipping Point: How Virtual Experiences Are Transforming Global Industries
PDF
Unity Roadmap 2020: Live games
PDF
Unity Roadmap 2020: Core Engine & Creator Tools
PDF
How ABB shapes the future of industry with Microsoft HoloLens and Unity - Uni...
PPTX
Unity XR platform has a new architecture – Unite Copenhagen 2019
PDF
Turn Revit Models into real-time 3D experiences
PDF
How Daimler uses mobile mixed realities for training and sales - Unite Copenh...
PDF
How Volvo embraced real-time 3D and shook up the auto industry- Unite Copenha...
PDF
QA your code: The new Unity Test Framework – Unite Copenhagen 2019
PDF
Engineering.com webinar: Real-time 3D and digital twins: The power of a virtu...
PDF
Supplying scalable VR training applications with Innoactive - Unite Copenhage...
PDF
XR and real-time 3D in automotive digital marketing strategies | Visionaries ...
PDF
Real-time CG animation in Unity: unpacking the Sherman project - Unite Copenh...
PDF
Creating next-gen VR and MR experiences using Varjo VR-1 and XR-1 - Unite Cop...
PDF
What's ahead for film and animation with Unity 2020 - Unite Copenhagen 2019
PDF
How to Improve Visual Rendering Quality in VR - Unite Copenhagen 2019
Build Immersive Worlds in Virtual Reality
Augmenting reality: Bring digital objects into the real world
Let’s get real: An introduction to AR, VR, MR, XR and more
Using synthetic data for computer vision model training
The Tipping Point: How Virtual Experiences Are Transforming Global Industries
Unity Roadmap 2020: Live games
Unity Roadmap 2020: Core Engine & Creator Tools
How ABB shapes the future of industry with Microsoft HoloLens and Unity - Uni...
Unity XR platform has a new architecture – Unite Copenhagen 2019
Turn Revit Models into real-time 3D experiences
How Daimler uses mobile mixed realities for training and sales - Unite Copenh...
How Volvo embraced real-time 3D and shook up the auto industry- Unite Copenha...
QA your code: The new Unity Test Framework – Unite Copenhagen 2019
Engineering.com webinar: Real-time 3D and digital twins: The power of a virtu...
Supplying scalable VR training applications with Innoactive - Unite Copenhage...
XR and real-time 3D in automotive digital marketing strategies | Visionaries ...
Real-time CG animation in Unity: unpacking the Sherman project - Unite Copenh...
Creating next-gen VR and MR experiences using Varjo VR-1 and XR-1 - Unite Cop...
What's ahead for film and animation with Unity 2020 - Unite Copenhagen 2019
How to Improve Visual Rendering Quality in VR - Unite Copenhagen 2019

Recently uploaded (20)

PDF
Why TechBuilder is the Future of Pickup and Delivery App Development (1).pdf
PPTX
Reimagine Home Health with the Power of Agentic AI​
PDF
Understanding Forklifts - TECH EHS Solution
PDF
top salesforce developer skills in 2025.pdf
PDF
Design an Analysis of Algorithms II-SECS-1021-03
PDF
Audit Checklist Design Aligning with ISO, IATF, and Industry Standards — Omne...
PPTX
history of c programming in notes for students .pptx
PDF
How to Choose the Right IT Partner for Your Business in Malaysia
PDF
How Creative Agencies Leverage Project Management Software.pdf
PDF
Which alternative to Crystal Reports is best for small or large businesses.pdf
PDF
Flood Susceptibility Mapping Using Image-Based 2D-CNN Deep Learnin. Overview ...
PDF
Design an Analysis of Algorithms I-SECS-1021-03
PDF
wealthsignaloriginal-com-DS-text-... (1).pdf
PPTX
Operating system designcfffgfgggggggvggggggggg
PPTX
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
PPTX
L1 - Introduction to python Backend.pptx
PDF
T3DD25 TYPO3 Content Blocks - Deep Dive by André Kraus
PDF
Digital Strategies for Manufacturing Companies
PDF
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
PDF
Upgrade and Innovation Strategies for SAP ERP Customers
Why TechBuilder is the Future of Pickup and Delivery App Development (1).pdf
Reimagine Home Health with the Power of Agentic AI​
Understanding Forklifts - TECH EHS Solution
top salesforce developer skills in 2025.pdf
Design an Analysis of Algorithms II-SECS-1021-03
Audit Checklist Design Aligning with ISO, IATF, and Industry Standards — Omne...
history of c programming in notes for students .pptx
How to Choose the Right IT Partner for Your Business in Malaysia
How Creative Agencies Leverage Project Management Software.pdf
Which alternative to Crystal Reports is best for small or large businesses.pdf
Flood Susceptibility Mapping Using Image-Based 2D-CNN Deep Learnin. Overview ...
Design an Analysis of Algorithms I-SECS-1021-03
wealthsignaloriginal-com-DS-text-... (1).pdf
Operating system designcfffgfgggggggvggggggggg
Agentic AI Use Case- Contract Lifecycle Management (CLM).pptx
L1 - Introduction to python Backend.pptx
T3DD25 TYPO3 Content Blocks - Deep Dive by André Kraus
Digital Strategies for Manufacturing Companies
Claude Code: Everyone is a 10x Developer - A Comprehensive AI-Powered CLI Tool
Upgrade and Innovation Strategies for SAP ERP Customers

Unity's Evolving Best Practices

  • 1. This slide deck was presented at Unite Berlin 2018. This offline version includes numerous additional slides, cut from the original presentation for brevity and/or time. These extra slides contains more examples and data, but are not essential for understanding the presentation.
  • 2. Optimization & Best Practices: Through The Ages Ian Dundore Unity Technologies
  • 4. Spoiler Alert • Scripting Performance • Transforms • Audio • Animations
  • 9. oops. • In the specific case of String.Equals, that advice is wrong! • From a performance perspective, at least. • For all other string comparisons, it’s right! • Compare, StartsWith, EndsWith, IndexOf, etc. • Again, from a performance perspective. • (Psst! This is documented!) https://docs.microsoft.com/en-us/dotnet/standard/base-types/best-practices-strings#common-string-comparison-methods-in-net
  • 11. Testing Considerations • How does the code path differ with different inputs? • What is the environment around the executing code? • Runtime • IL2CPP/Mono? .Net version? • Hardware • Pipeline depth, cache size, cache-line length • # of cores, core affinity settings on threads, throttling • What exactly is your test measuring?
  • 12. Your Test Harness Matters! Profiler.BeginSample(“Test A”); for (int i=0; i<NUM_TESTS; ++i) { DoAThing(i); } Profiler.EndSample(); int i = 0; Profiler.BeginSample(“Test B”); DoAThing(0); while (i<NUM_TESTS) { DoAThing(++i); DoAThing(++i); DoAThing(++i); // … repeat a lot … DoAThing(++i); } Profiler.EndSample(); Less Loop OverheadMore Loop Overhead
  • 13. public bool Equals(String value) { if (this == null) throw new NullReferenceException(); if (value == null) return false; if (Object.ReferenceEquals(this, value)) return true; if (this.Length != value.Length) return false; return EqualsHelper(this, value); } Mono’s String.cs (1)
  • 14. What does EqualsHelper do? • Uses unsafe code to pin strings to memory addresses. • C-style integer comparison of raw bytes of the strings. • Core is a special cache-friendly loop. • 64-bit: Step through strings with a stride of 12 bytes. while (length >= 12) { if (*(long*)a != *(long*)b) return false; if (*(long*)(a+4) != *(long*)(b+4)) return false; if (*(long*)(a+8) != *(long*)(b+8)) return false; a += 12; b += 12; length -= 12; }
  • 15. Mono’s String.cs (2)
      public bool Equals(String value, StringComparison comparisonType) {
          if (comparisonType < StringComparison.CurrentCulture ||
              comparisonType > StringComparison.OrdinalIgnoreCase)
              throw new ArgumentException(…);
          Contract.EndContractBlock();
          if ((Object)this == (Object)value) {
              return true;
          }
          if ((Object)value == null) {
              return false;
          }
  • 16. Mono’s String.cs (3)
          switch (comparisonType) {
              case StringComparison.CurrentCulture:
                  return (CultureInfo.CurrentCulture.CompareInfo.Compare(this, value, CompareOptions.None) == 0);
              case StringComparison.CurrentCultureIgnoreCase:
                  return (CultureInfo.CurrentCulture.CompareInfo.Compare(this, value, CompareOptions.IgnoreCase) == 0);
              case StringComparison.InvariantCulture:
                  return (CultureInfo.InvariantCulture.CompareInfo.Compare(this, value, CompareOptions.None) == 0);
              case StringComparison.InvariantCultureIgnoreCase:
                  return (CultureInfo.InvariantCulture.CompareInfo.Compare(this, value, CompareOptions.IgnoreCase) == 0);
  • 17. Mono’s String.cs (4)
              case StringComparison.Ordinal:
                  if (this.Length != value.Length)
                      return false;
                  return EqualsHelper(this, value);
  • 18. But wait! • For non-matching strings, length will often differ. • But for length-invariant strings, first character usually differs. • This optimization is found in CompareOrdinal, but not Equals.
      public static int CompareOrdinal(String strA, String strB) {
          if ((Object)strA == (Object)strB)
              return 0;
          if (strA == null)
              return -1;
          if (strB == null)
              return 1;
          // Most common case, first character is different.
          if ((strA.m_firstChar - strB.m_firstChar) != 0)
              return strA.m_firstChar - strB.m_firstChar;
          return CompareOrdinalHelper(strA, strB);
      }
  • 19. This is getting silly.
      public static int CompareOrdinal(String strA, int indexA, String strB, int indexB, int length) {
          if (strA == null || strB == null) {
              if ((Object)strA == (Object)strB) { // they're both null
                  return 0;
              }
              return (strA == null) ? -1 : 1; // -1 if A is null, 1 if B is null.
          }
          return nativeCompareOrdinalEx(strA, indexA, strB, indexB, length);
      }
      An overload that goes almost directly to native code!
  • 20. Test Design: 4 cases • Case 1: Two identical strings. • Case 2: Two strings of random characters of same length. • Case 3: Two strings of random characters of same length. • First characters identical, to bypass check in Compare. • Case 4: Two strings of random characters, different lengths. • Comparison’s worst case is bounded by the shorter string. • Constrained range to 15-25 characters to be similar to above tests.
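      The four cases above could be generated roughly like this. This is a minimal sketch, not the harness used for the talk: the class name `StringTestCases`, the lowercase alphabet, and the choice to build Case 1 from two distinct string objects (so a reference-equality shortcut cannot answer the comparison) are all assumptions.

```csharp
using System;

// Hypothetical helper; names and alphabet are illustrative assumptions.
public static class StringTestCases
{
    const string Alphabet = "abcdefghijklmnopqrstuvwxyz";

    // Builds a random lowercase string of the requested length.
    public static string RandomString(Random rng, int length)
    {
        var chars = new char[length];
        for (int i = 0; i < length; ++i)
            chars[i] = Alphabet[rng.Next(Alphabet.Length)];
        return new string(chars);
    }

    // Case 1: identical content held in two distinct string objects.
    public static string[] Identical(Random rng, int length)
    {
        string a = RandomString(rng, length);
        return new[] { a, new string(a.ToCharArray()) };
    }

    // Case 2: random content, same length.
    public static string[] SameLength(Random rng, int length)
    {
        return new[] { RandomString(rng, length), RandomString(rng, length) };
    }

    // Case 3: random content, same length, first character forced equal.
    public static string[] SameLengthSameFirstChar(Random rng, int length)
    {
        string a = RandomString(rng, length);
        string b = a[0] + RandomString(rng, length - 1);
        return new[] { a, b };
    }

    // Case 4: random content, two different lengths in the 15-25 range.
    public static string[] DifferentLengths(Random rng)
    {
        int lenA = rng.Next(15, 26); // upper bound is exclusive
        int lenB;
        do { lenB = rng.Next(15, 26); } while (lenB == lenA);
        return new[] { RandomString(rng, lenA), RandomString(rng, lenB) };
    }
}
```

      Generating the inputs before the profiled region keeps allocation and random-number costs out of the measurement.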
  • 21. Mono 3.5
      Method | Identical Content, Identical Length | Random Content, Identical Length | First Char Equal, Identical Length | Random Content, Random Length
      String.Equals | 2.97 | 1.75 | 1.73 | 1.30
      String.Equals with Ordinal type | 5.87 | 3.46 | 3.56 | 3.39
      String.Compare | 37.52 | 33.29 | 64.66 | 31.35
      String.Compare with Ordinal type | 6.23 | 3.35 | 3.35 | 3.26
      CompareOrdinal | 5.68 | 3.10 | 3.18 | 2.99
      CompareOrdinal with Indices | 5.53 | 3.30 | 3.42 | 3.95
      Simple Hand-Coded | 5.46 | 1.75 | 2.18 | 1.40
      100,000 comparisons. Timings in milliseconds. Unity 2018.1.0f2, Windows Standalone, Mono 3.5, Core i7-3500K
  • 22. Mono 4.6
      Method | Identical Content, Identical Length | Random Content, Identical Length | First Char Equal, Identical Length | Random Content, Random Length
      String.Equals | 3.23 | 1.80 | 1.82 | 1.21
      String.Equals with Ordinal type | 3.84 | 2.13 | 2.03 | 1.38
      String.Compare | 34.72 | 28.70 | 63.03 | 29.74
      String.Compare with Ordinal type | 5.16 | 1.75 | 2.68 | 1.65
      CompareOrdinal | 4.93 | 1.55 | 2.21 | 1.40
      CompareOrdinal with Indices | 4.77 | 3.59 | 3.59 | 4.41
      Simple Hand-Coded | 4.40 | 1.66 | 1.95 | 1.28
      100,000 comparisons. Timings in milliseconds. Unity 2018.1.0f2, Windows Standalone, Mono 4.6, Core i7-3500K
  • 23. IL2CPP
      Method | Identical Content, Identical Length | Random Content, Identical Length | First Char Equal, Identical Length | Random Content, Random Length
      String.Equals | 2.61 | 1.26 | 1.27 | 0.95
      String.Equals with Ordinal type | 5.38 | 3.80 | 3.84 | 3.66
      String.Compare | 39.12 | 29.32 | 60.56 | 28.01
      String.Compare with Ordinal type | 4.84 | 3.58 | 3.62 | 3.52
      CompareOrdinal | 4.78 | 3.55 | 3.58 | 3.51
      CompareOrdinal with Indices | 4.93 | 3.71 | 3.72 | 4.17
      Simple Hand-Coded | 13.83 | 3.52 | 3.93 | 2.16
      100,000 comparisons. Timings in milliseconds. Unity 2018.1.0f2, Windows Standalone, IL2CPP 3.5, Core i7-6700K
  • 24. IL2CPP
      Method | Identical Content, Identical Length | Random Content, Identical Length | First Char Equal, Identical Length | Random Content, Random Length
      String.Equals | 2.64 | 1.92 | 1.93 | 0.96
      String.Equals with Ordinal type | 2.94 | 2.26 | 2.73 | 1.49
      String.Compare | 40.98 | 30.61 | 60.82 | 29.26
      String.Compare with Ordinal type | 3.18 | 1.46 | 2.29 | 1.32
      CompareOrdinal | 2.99 | 1.18 | 2.06 | 1.12
      CompareOrdinal with Indices | 5.56 | 3.93 | 4.08 | 4.41
      Simple Hand-Coded | 14.14 | 3.78 | 4.14 | 2.35
      100,000 comparisons. Timings in milliseconds. Unity 2018.1.0f2, Windows Standalone, IL2CPP 4.6, Core i7-6700K
  • 27. Conclusions & more questions • String.Equals clearly wins for plain string comparison. • .NET 4.6 has improvements for String.Compare variants. • Ordinal comparisons clearly win on culture-sensitive APIs. • Use String.CompareOrdinal instead of String.Compare. • Use StringComparison.Ordinal on other String APIs. • How does this map across platforms?
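      In code, those recommendations come down to a handful of standard .NET calls. A minimal sketch follows; the wrapper class `OrdinalExamples` and its method names are hypothetical, only the `String` APIs inside them are real.

```csharp
using System;

// Hypothetical wrappers that only exist to show the call shapes.
public static class OrdinalExamples
{
    // Plain equality: String.Equals with no comparison type is already ordinal.
    public static bool SameTag(string a, string b)
    {
        return string.Equals(a, b);
    }

    // Ordering: CompareOrdinal instead of the culture-sensitive String.Compare.
    public static int Order(string a, string b)
    {
        return string.CompareOrdinal(a, b);
    }

    // Other String APIs: pass StringComparison.Ordinal explicitly.
    public static bool HasPrefix(string s, string prefix)
    {
        return s.StartsWith(prefix, StringComparison.Ordinal);
    }

    public static int Find(string s, string needle)
    {
        return s.IndexOf(needle, StringComparison.Ordinal);
    }
}
```

      Ordinal comparison is only appropriate when the strings are identifiers, keys, paths and the like. Text that is sorted or matched for display to users may still need culture-aware rules.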
  • 28. IL2CPP
      Method | Identical Content, Identical Length | Random Content, Identical Length | First Char Equal, Identical Length | Random Content, Random Length
      String.Equals | 13.48 | 5.08 | 5.01 | 5.26
      String.Equals with Ordinal type | 25.42 | 19.46 | 19.85 | 14.16
      String.Compare | 118.80 | 128.69 | 254.30 | 124.81
      String.Compare with Ordinal type | 24.23 | 11.49 | 11.57 | 10.95
      CompareOrdinal | 23.92 | 11.09 | 11.54 | 10.75
      CompareOrdinal with Indices | 23.79 | 14.76 | 18.62 | 15.05
      Simple Hand-Coded | 58.02 | 12.04 | 21.86 | 8.13
      100,000 comparisons. Timings in milliseconds. Unity 2018.1.0f2, iOS, IL2CPP 3.5, iPad Mini 3
  • 29. [Chart: timings per platform, normalized so that String.Equals/Random on the given platform = 1.] Very similar results, in this case.
  • 30. Another tip! • See a lot of time going to NullCheck in IL2CPP builds? • Disable these checks in release builds! • Works on types, methods & properties. • Code is in IL2CppSetOptionAttribute.cs, under Unity install folder
      [Il2CppSetOption(Option.NullChecks, false)]
      public bool MyEquals(String strA, String strB) {
          // …
      }
  • 31. IL2CPP
      Configuration | Identical Content, Identical Length | Random Content, Random Length
      Normal | 58.02 | 8.13
      NullCheck Disabled | 53.02 | 7.03
      100,000 comparisons. Timings in milliseconds. Unity 2018.1.0f2, iOS, IL2CPP 3.5, iPad Mini 3
      Small, but helpful.
  • 32. Transforms * no, not this kind of transform
  • 34. OnTransformChanged • Internal message, broadcast each time a Transform changes • Position, rotation, scale, parent, sibling order, etc. • Tells other components to update their internal state • PhysX/Box2D, UnityUI, Renderers (AABBs), etc. • Repeated messages can cause performance problems • Use Transform.SetPositionAndRotation (5.6+)
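      As a rough illustration of the last bullet: setting position and rotation separately touches the Transform twice, while `Transform.SetPositionAndRotation` applies both in one call. The component name `FollowTarget` and its `target` field are hypothetical.

```csharp
using UnityEngine;

// Hypothetical component: copies another Transform's pose once per frame.
public class FollowTarget : MonoBehaviour
{
    public Transform target; // assigned in the Inspector (assumption)

    void LateUpdate()
    {
        if (target == null)
            return;

        // Two separate change notifications:
        //   transform.position = target.position;
        //   transform.rotation = target.rotation;

        // One combined change (Unity 5.6+):
        transform.SetPositionAndRotation(target.position, target.rotation);
    }
}
```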
  • 36. Enter the Dispatch • TransformChangeDispatch was first introduced in 5.4 • Other systems migrated to use it, slowly. • Renderers in 5.6 • Animators in 2017.1 • Physics in 2017.2 • RectTransforms in 2017.3 • OnTransformChanged was removed entirely in 2018.1
  • 37. How Transforms are structured • 1 TransformHierarchy structure represents a root Transform • Contains buffers tracking data of all transforms in hierarchy • TRS, indices for parents & siblings • Interest bitmask & dirty bitmask • Internal systems register interest & track state via specific bits • Physics is one bit, renderer is another bit, etc. • System walks affected parts of TransformHierarchy structure • dirtyMask |= -1 & interestMask
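      The interest/dirty bitmask idea can be sketched in plain C#. This is a toy model, not Unity's internal code: `HierarchySketch`, `SystemBits` and the bit assignments are invented for illustration, and only the `dirtyMask |= -1 & interestMask` step mirrors the slide.

```csharp
using System;

// Hypothetical bit assignments; the real ones are internal to Unity.
[Flags]
public enum SystemBits : uint
{
    None = 0,
    Physics = 1 << 0,
    Renderer = 1 << 1,
    Animation = 1 << 2
}

// Toy stand-in for one TransformHierarchy: parallel mask buffers per transform.
public sealed class HierarchySketch
{
    readonly uint[] interestMasks;
    readonly uint[] dirtyMasks;

    public HierarchySketch(int transformCount)
    {
        interestMasks = new uint[transformCount];
        dirtyMasks = new uint[transformCount];
    }

    // A system declares that it cares about changes to this transform.
    public void RegisterInterest(int index, SystemBits systems)
    {
        interestMasks[index] |= (uint)systems;
    }

    // Mirrors "dirtyMask |= -1 & interestMask": flag the change for every
    // interested system, and for nobody else.
    public void MarkChanged(int index)
    {
        dirtyMasks[index] |= 0xFFFFFFFFu & interestMasks[index];
    }

    // A system asks whether the transform changed since it last looked,
    // clearing only its own bit.
    public bool ConsumeDirty(int index, SystemBits system)
    {
        uint bit = (uint)system;
        bool wasDirty = (dirtyMasks[index] & bit) != 0;
        dirtyMasks[index] &= ~bit;
        return wasDirty;
    }
}
```

      Because each system clears only its own bit, physics and rendering can pick up the same change at different points in the frame.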
  • 38. When are async changes applied? • TCD keeps a list of dirty TransformHierarchy pointers • Systems request list of changed Transforms before running. • e.g. Before FixedUpdate, before rendering, before animating. • Use list to update internal system state. • TCD iterates over list of dirty TransformHierarchies. • Iterates over all Transforms to check each dirty bit.
  • 40. Quick Reminder • Buffer size: Transform.hierarchyCapacity • Set before mass reparenting operations! • Reparent & reposition during instantiate! • GameObject.Instantiate( prefab, parent ); • GameObject.Instantiate( prefab, position, rotation, parent );
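      Put together, the reminder might look like the sketch below. The component `SpawnUnderParent` and its fields are hypothetical, and the capacity estimate assumes each prefab instance adds a single Transform; scale it up for prefabs with children.

```csharp
using UnityEngine;

// Hypothetical spawner: reserves hierarchy capacity, then instantiates
// directly under the parent instead of reparenting afterwards.
public class SpawnUnderParent : MonoBehaviour
{
    public GameObject prefab; // assigned in the Inspector (assumption)
    public int count = 500;

    void Start()
    {
        Transform parent = transform;

        // Grow the hierarchy buffers once, before the mass operation.
        // Assumes one Transform per prefab instance.
        int needed = parent.hierarchyCount + count;
        if (parent.hierarchyCapacity < needed)
            parent.hierarchyCapacity = needed;

        for (int i = 0; i < count; ++i)
        {
            // Position, rotation and parent are all applied at creation time.
            Vector3 position = new Vector3(i, 0f, 0f);
            Instantiate(prefab, position, Quaternion.identity, parent);
        }
    }
}
```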
  • 41. Split your hierarchies. • Changing any Transform marks the whole Hierarchy dirty. • Dirty hierarchies must be fully examined for change bits. • Smaller hierarchies = more granular Hierarchy tracking. • Smaller hierarchies = fewer Transforms to check. • Fewer roots = more Transforms to check for changes. • Change checks are jobified, but operate on roots.
  • 42. Extreme cases. • Parented: 100 Root GameObjects, each + 1000 empty GameObjects + 1 Cube w/ Rotation script. • Unparented: 100,000 empty GameObjects + 100 Cubes w/ Rotation script.
  • 43. A welcome effect.
      Thread | Parented | Unparented
      Main Thread | 553 ms | 32 ms
      Worker Threads | 139 ms | 14 ms
      100 Rotating Cubes, 100k Empty GameObjects. iPad Mini 3. CPU time used over 10 seconds.
  • 44. This is just checking hierarchies!
      Thread | Parented | Unparented
      Main Thread | 1.77 ms | 0.11 ms
      100 Rotating Cubes, 100k Empty GameObjects. iPad Mini 3. CPU time used over 10 seconds. “PostLateUpdate.UpdateAllRenderers”
  • 45. Transforms & Physics: 2017.2+ • 2017.1/older: Physics components were synced to Transforms. • Each Transform change = expensive update of Physics scene. • 2017.2/newer: Updates can be delayed to next FixedUpdate. • Update Physics entities from set of changed Transforms. • Re-indexing computations are batched. • This could have side-effects! • Move a Collider + immediately Raycast towards it? No bueno.
  • 46. Physics.autoSyncTransforms • When true, forces legacy behavior. • Colliders/Rigidbodies check for syncs on every Physics call. • Yes, every Raycast, Spherecast, etc. • Huge performance regression, if you’re not batching updates. • When false, uses delayed-update behavior. • Can force updates: Physics.SyncTransforms • Default value is true in 2017.2 through 2018.2 • 2018.3 is the first version where the default is false.
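      With auto-sync off, a query that must see a Transform change made earlier in the same frame needs an explicit sync first. A minimal sketch, assuming a hypothetical `MoveThenRaycast` component whose `target` has a Collider:

```csharp
using UnityEngine;

// Hypothetical component: moves a collider, then raycasts at it in the same frame.
public class MoveThenRaycast : MonoBehaviour
{
    public Transform target; // object with a Collider (assumption)

    void Update()
    {
        if (target == null)
            return;

        target.position += Vector3.right * Time.deltaTime;

        // With delayed updates, the physics scene has not seen the move yet.
        // Flush all pending Transform changes once, then run the queries.
        if (!Physics.autoSyncTransforms)
            Physics.SyncTransforms();

        Vector3 origin = transform.position;
        Vector3 direction = target.position - origin;
        RaycastHit hit;
        if (Physics.Raycast(origin, direction, out hit))
            Debug.Log("Hit " + hit.collider.name);
    }
}
```

      Calling `Physics.SyncTransforms` once after a batch of moves keeps the cost closer to the batched case than syncing before every query would.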
  • 47. A test.
      void Update() {
          float rotAmt = 2f * Time.deltaTime;
          Vector3 up = Vector3.up;
          if (batched) {
              // "Batched"
              for (int i = 0; i < NUM_PARENTS; ++i)
                  rotators[i].Rotate(up, rotAmt);
              for (int i = 0; i < NUM_PARENTS; ++i)
                  Physics.Raycast(Vector3.zero, Random.insideUnitSphere);
          } else {
              // "Immediate"
              for (int i = 0; i < NUM_PARENTS; ++i) {
                  rotators[i].Rotate(up, rotAmt);
                  Physics.Raycast(Vector3.zero, Random.insideUnitSphere);
              }
          }
      }
  • 48. Seriously, a big effect.
      Category | Parented Immediate | Unparented Immediate | Parented Batched | Unparented Batched
      Script | 4450 ms | 4270 ms | 1980 ms | 882 ms
      Physics | 1410 ms | 1820 ms | 1840 ms | 1770 ms
      100 Rotating Cubes, Rigidbodies, Trigger Box Colliders. 100k Empty GameObjects. App Framerate: 30. Physics Timestep 0.04 sec. iPad Mini 3. CPU time used over 10 seconds.
  • 49. Audio
  • 50. The Basics • Unity uses FMOD internally. • Audio decoding & playback occurs on separate threads. • Unity supports a handful of codecs. • PCM • ADPCM • Vorbis • MP3
  • 51. Audio “Load Type” Setting • Decompress On Load • Decoding & file I/O happen at load time only. • Compressed In Memory • Decoding happens during playback. • Streamed • File I/O & decoding happen during playback.
  • 52. Every frame… • Unity iterates over all active Audio Sources. • Calculates distance to Listener(s). • FMOD mixes active Audio Sources (“voices”). • True volume = Volume setting * distance to listener * clip. • If the clip is compressed, FMOD must decode audio data • Chooses X loudest voices to mix together. • X = “Real Voices” audio setting.
  • 53. Everything is done in software. • Decoding & mixing are done entirely in software. • Mixing occurs on the FMOD thread. • Decoding occurs at loading time or on the FMOD thread. • All playing voices are evaluated and mixed. • Max number of voices is controlled by Audio settings.
  • 54. A trap. This voice is Muted. This voice is Active. This voice will not be heard, but the Clip must be processed.
  • 55. A warning. • AudioSystem.Update is Unity updating the AudioSources which are submitted to FMOD for playback. • Audio decoding does not show up in the Unity CPU Profiler.
  • 56. Check both places! • Decoding & mixing audio is in the details of the Audio profiler.
  • 57. Audio CPU usage by codec.
      Codec | 10 Voices | 100 Voices | 500 Voices
      PCM | 1.5% | 5.0% | 5.7%
      ADPCM | 5.2% | 16.6% | 11.6%
      MP3 | 13.3% | 35.0% | 23.3%
      Vorbis | 12.5% | 30.3% | 21.2%
      Identical AudioClip, multiple AudioSources. MP3 & Vorbis Quality = 100.
  • 58. WTF?
      Codec | 10 Voices | 100 Voices | 500 Voices
      PCM | 1.5% | 5.0% | 5.7%
      ADPCM | 5.2% | 16.6% | 11.6%
      MP3 | 13.3% | 35.0% | 23.3%
      Vorbis | 12.5% | 30.3% | 21.2%
      Identical AudioClip, multiple AudioSources. MP3 & Vorbis Quality = 100.
  • 60. ~Test time~ <(^^<) (>^^)> • Identical 4 minute audio clip, copied 4 times. • Once per codec under test. • Varying number of AudioSources. • Captured CPU time on main & FMOD threads • Sum of CPU time consumed over 10 seconds real-time
  • 61. Again.
      Codec | 10 Clips | 100 Clips | 500 Clips
      PCM | 95 ms | 467 ms | 2040 ms
      ADPCM | 89 ms | 474 ms | 2070 ms
      MP3 | 84 ms | 469 ms | 2030 ms
      Vorbis | 93 ms | 473 ms | 1990 ms
      CPU time on main thread, 10 seconds real-time.
  • 62. With intensity.
      Codec | 10 Voices | 100 Voices | 500 Voices
      PCM | 214 ms | 451 ms | 634 ms
      ADPCM | 485 ms | 1391 ms | 1591 ms
      MP3 | 1058 ms | 4061 ms | 4167 ms
      Vorbis | 1161 ms | 3408 ms | 3629 ms
      CPU time on all FMOD threads, 10 seconds real-time.
  • 63. Principles. • Avoid having many audio sources set to Mute. • Disable/Stop instead of Mute, if possible. • If you can afford the memory overhead, Decompress on Load. • Best for short clips that are frequently played. • Avoid playing lots of compressed Clips, especially on mobile.
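      The first principle might be applied along these lines. A sketch only: the helper class `AudioSourceSilencer` is hypothetical, and whether restarting the clip from the beginning on resume is acceptable depends on the sound.

```csharp
using UnityEngine;

// Hypothetical helper: silences a source without leaving a muted voice playing.
public static class AudioSourceSilencer
{
    // Instead of source.mute = true, which keeps the voice active
    // and the clip being processed:
    public static void Silence(AudioSource source)
    {
        source.Stop();
        source.enabled = false;
    }

    // Re-enable and restart playback from the beginning of the clip.
    public static void Resume(AudioSource source)
    {
        source.enabled = true;
        source.Play();
    }
}
```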
  • 64. Or clamp the voice count.
      Virtual Voices (VV) | 10 Playing Clips | 100 Playing Clips | 500 Playing Clips
      512 VV | 318 ms | 923 ms | 2708 ms
      100 VV | 304 ms | 905 ms | 1087 ms
      10 VV | 315 ms | 350 ms | 495 ms
      1 VV | 173 ms | 210 ms | 361 ms
      PCM. CPU time on FMOD + Main threads, 10 seconds real-time.
  • 65. How, you ask?
      public void SetNumVoices(int nv) {
          var config = AudioSettings.GetConfiguration();
          if (config.numVirtualVoices == nv)
              return;
          config.numVirtualVoices = nv;
          config.numRealVoices = Mathf.Clamp(config.numRealVoices, 1, config.numVirtualVoices);
          AudioSettings.Reset(config);
      }
      Just an example! Probably too simple for real use.
  • 67. Animator • Formerly called Mecanim. • Graph of logical states. • Blends between states. • States contain animation clips and/or blend trees. • Animator component attached to GameObject • AnimatorController referenced by Animator component.
  • 68. Playables • Technology underlying Animator & Timeline. • Generic framework for “stuff that can be played back”. • Animation clips, audio clips, video clips, etc. • Docs: https://docs.unity3d.com/Manual/Playables.html
  • 69. Animation • Unity’s original animation system. • Custom code • Not based on Playables. • Very simple: plays an animation clip. • Can crossfade, loop.
  • 71. The Test • 100 GameObjects with Animator or Animation component • Animator uses simple AnimatorController: 1 state, looping • Animation plays back 1 AnimationClip, looping
  • 72. [Chart: time per frame (0 to 30 ms) vs. curve count (1 to 800), Animation vs. Animator. 100 Components, Variable Curve Count, iPad Mini 3.]
  • 73. How do cores affect it?
  • 74. [Chart: time per frame (0 to 3 ms) vs. curve count (1 to 800), Animation vs. Animator. 100 Components, Variable Curve Count, Win10/Core i7. Annotation: Crossover on iPad Mini 3.]
  • 75. [Chart: time per frame (0 to 3 ms) vs. component count (1 to 500), Animation vs. Animator. 100 Curves, Variable Component Count, Win10/Core i7.]
  • 76. Scaling Factors • Performance is heavily dependent on curve & core count. • Fewer cores: Animation retains advantage longer. • More cores: Animator rapidly outperforms Animation. • Both systems scale linearly as number of Components rises. • “Best” system determined by target hardware vs curve count. • Use Animation for simple animations. • Use Animators for high curve counts or complex scenarios.
  • 77. [Chart: time per frame (0 to 40 ms) vs. component count (1 to 500), Animation vs. Animator. 100 Curves, Variable Component Count, iPad Mini 3.]
  • 78. What about “constant” curves? • Still interpolated at runtime. • No measurable impact on CPU usage. • Significant memory/file savings. • Example: 11 KB vs. 3.7 KB for 100 position curves (XYZ)
  • 80. Be careful with Layers! • The active state on each layer will be evaluated once per frame. • Layer Weight does not matter. • Weight=0? Still evaluated! • This is to ensure that state is correct. • Zero-weight layers = wasted work • Use layers sparingly! (Yes, the docs are wrong.)
  • 81. The Cost of Layering
      Measure | 1 Layer | 2 Layers | 3 Layers | 4 Layers | 5 Layers
      Aggregate | 1966 ms | 2260 ms | 2510 ms | 2690 ms | 2890 ms
      Per Frame | 10.08 ms | 11.77 ms | 12.86 ms | 14.31 ms | 17.65 ms
      50 x “Ellen” from 3D Gamekit. Unity 2018.1.0f2. Main Thread CPU time consumed during 10 Seconds Realtime. iPad Mini 3.
  • 83. Nope.
      Configuration | 1 Layer | 2 Layers | 3 Layers | 4 Layers | 5 Layers
      Unmasked | 1966 ms | 2260 ms | 2510 ms | 2690 ms | 2890 ms
      60/108 Masked | 1992 ms | 2230 ms | 2530 ms | 2740 ms | 2920 ms
      50 x “Ellen” from 3D Gamekit. Layers 2-5 Masked. Main Thread CPU time consumed during 10 Seconds Realtime. Unity 2018.1.0f2. iPad Mini 3.
  • 84. Use the right rig! • The Humanoid rig runs IK & retargeting calculations. • The Generic rig does not.
      Rig | 1 Layer | 2 Layers | 3 Layers | 4 Layers | 5 Layers
      Generic | 1966 ms | 2260 ms | 2510 ms | 2690 ms | 2890 ms
      Humanoid | 2775 ms | 3210 ms | 3510 ms | 3730 ms | 4020 ms
      Identical test to previous slide, different Rig import settings.
  • 85. The pooling problem • Animators reset their state when their GameObject is disabled. • The only workaround? Disable Animator component, not GameObject. • Leads to messy side effects, like having to manage other components (e.g. Colliders/Rigidbodies) manually. • This made Animator-driven objects difficult to pool.
  • 86. There’s an API to fix it, now! • Animator.keepAnimatorControllerStateOnDisable • Available in 2018.1+ • If true, Animators do not discard data buffers when their GameObject is disabled. • Awesome for pooling! • Careful of the higher memory usage of disabled Animators!