Understanding Users’ Behaviours in User-Centric Immersive Communications
Laura Toni
UCL - University College London
TEWI Colloquium
26 June 2020
A massive thanks to
Silvia Rossi (UCL), the PhD student behind this work
Cagri Ozcinar (TCD)
Our collaborators
Aljosa Smolic (TCD), Pascal Frossard (EPFL), Francesca De Simone (CWI)
Main Motivation
A virtual - rather than physical - world in which
any user can be fully immersed and interactive
Virtual Reality (VR)
360º video streaming: main challenges
• New spherical geometry
• Large volume of data to store, deliver and display
• Ultra-low-delay constraints over bandwidth-limited resources
• Uncertainty on the portion of content that will be displayed by the user
Toward a personalised streaming
• S. Rossi, and L. Toni. “Navigation-aware adaptive streaming strategies for omnidirectional video”, IEEE MMSP 2017.
• Serhan Gül et al., "Low-latency Cloud-based Volumetric Video Streaming Using Head Motion Prediction”, ACM NOSSDAV 2020
• V Swaminathan, M Hosseini, "Prioritizing tile-based virtual reality video streaming using adaptive rate allocation”, US Patent App. 16/784,100
What can we do?
VR systems need to operate at scale, in a personalized manner, remaining
bandwidth-tolerant whilst meeting quality and latency criteria
• Viewport-Aware adaptation logic
• User-centric coding strategies
• …
One key challenge to reach this goal is to fully understand and anticipate user
behaviours in these mixed reality settings.
360 Immersion Communication
But how do users interact in this virtual environment?
Can we predict users’ behaviour?
Focus of Today
• Can we identify navigation patterns?
• Can we quantify users’ similarity in their navigation?
• Can some users be more predictable than others?
• Can the navigation pattern be representative of the single user?
• How much is navigation affected by external factors (e.g., video
content features)?
How do users actually navigate in VR environments?
Focus of Today’s Talk
How do users actually navigate in VR environments?
Coding-streaming optimisation
VR therapists Live performance
Mu Mu et al, “User attention and behaviour in virtual reality encounter”, 2020
WHIST, AoE 2019
Main Contribution
Main Goal
To design metrics and methodologies to analyse users’
behaviour in 360-degree videos aiming at
• identifying dominant behaviours of immersive navigation
• quantifying similarities across contents and across users
• analysing and quantifying the level of interaction of the user with the content
Outline
Users’ navigation pattern analysis:
• a clustering approach
• a device-based study & a use case application
• an information-theory approach
Current Analysis
Traditional metrics
• Mean exploration angles
• Heat map
• Angular velocity
• Frequency of fixation
• X. Corbillon, F. De Simone, and G. Simon, “360-degree video head movement dataset”, ACM MMSys 2017.
• A. Nguyen and Z. Yan, “A saliency dataset for 360-degree videos”, ACM MMSys 2019.
• V. Sitzmann, A. Serrano, A. Pavel, M. Agrawala, D. Gutierrez, B. Masia, and G. Wetzstein, “Saliency in VR: How Do People Explore Virtual Environments?”, IEEE Transactions on Visualization and Computer Graphics, 2018.
• Xu, M., Li, C., Zhang, S., & P. Le Callet, “State-of-the-art in 360 video/image processing: Perception, assessment and compression”, IEEE Journal of Selected Topics in Signal Processing, 14(1), 5-26, 2020.
User Behaviour Analysis in VR system
Traditional metrics
Scenario A Scenario B
But do these metrics capture all the actual trajectory behaviour ?
• Mean exploration angles
• Heat map
• Angular velocity
• Frequency of fixation
FAIL
Outline
Users’ navigation pattern analysis:
• a clustering approach
• a device-based study & a use case application
• an information-theory approach
Rossi, S., De Simone, F., Frossard, P., & Toni, L., “Spherical clustering of users navigating 360 content”, IEEE ICASSP 2019.
Overall Goal
Let’s consider each user’s trajectory by looking at the viewport centres
[Figure: users’ viewport centres on the sphere at times t0, t1, t2, …]
Given all users’ trajectories, we cluster them
Our Goal: To propose a clustering method able to cluster users based on their navigation patterns on the sphere.
Why Clustering in VR?
From clustering…
… to a more reliable heat map
… to predicted navigation paths
… to a practical dataset analysis
… to identifying key behaviours for optimal coding/QoE evaluation, etc.
What Are the Main Challenges?
• To take into account the spherical geometry of the content
[Paper excerpt and Figure 3 (map projection from planar to sphere): blocks of the planar panorama are mapped onto the sphere with variable area, which changes with the latitude and introduces distortion in the projected image.]
➡ Geodesic distance as distance metric
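As a rough illustration of the distance metric named above, the sketch below computes the great-circle (geodesic) distance between two viewport centres on the unit sphere. The function name `geodesic_distance` and the angle convention (theta as longitude in [-π, π], phi as polar angle in [0, π]) are assumptions made for this example, not something the slides specify.

```python
import math

def geodesic_distance(centre_a, centre_b):
    """Great-circle distance in radians between two viewport centres.

    Each centre is a (theta, phi) pair: theta is the longitude in [-pi, pi]
    and phi the polar angle in [0, pi] (an assumed convention).
    """
    def to_unit_vector(theta, phi):
        return (math.sin(phi) * math.cos(theta),
                math.sin(phi) * math.sin(theta),
                math.cos(phi))

    ax, ay, az = to_unit_vector(*centre_a)
    bx, by, bz = to_unit_vector(*centre_b)
    dot = ax * bx + ay * by + az * bz
    # Clamp against rounding so acos never sees a value outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot)))

# Two centres on the equator, a quarter turn apart.
quarter_turn = geodesic_distance((0.0, math.pi / 2), (math.pi / 2, math.pi / 2))
# Two centres on either side of the theta = +/- pi seam of the planar map.
across_seam = geodesic_distance((-math.pi + 0.05, math.pi / 2),
                                (math.pi - 0.05, math.pi / 2))
print(quarter_turn, across_seam)
```

The second pair shows why a planar Euclidean distance would be misleading here: the two centres sit at opposite edges of the equirectangular map but are close neighbours on the sphere.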
What Are the Main Challenges?
• To identify clusters that are meaningful in the VR domain!
To adopt a metric that reflects the actual viewport overlap
The geodesic distance approximates the actual viewport overlap
[Paper excerpt] … is π/8 in both cases, Figure 1(a) and (b). However, the green viewport in the second figure is rotated by π/2. Even if it is an extreme situation, this rotation reduces the overlap from 87% to 58% of the total area. It follows that the geodesic distance is an approximation (and not an exact reflection) of the viewport overlap. The closer the viewport centres (i.e., the smaller the distance), the smaller the approximation error incurred by using the geodesic distance rather than the viewport overlap. At the same time, for large distances, the approximation error can be substantial. Therefore, in this paper we aim at finding a threshold value Gth that minimises the discrepancy between these two metrics.
Fig. 1. Comparison of viewport overlap between viewports with centre distance π/10 but different rotation angles. (a) Green and blue viewports with the same rotation: overlap 87%. (b) Green viewport rotated by π/2: overlap 58%.
What Are the Main Challenges?
• To identify clusters that are meaningful in the VR domain!
To adopt a metric that reflects the actual viewport overlap
To identify users that are actually looking at the same portion of the sphere
Classical clustering methods do not guarantee this joint overlap
State-of-the-art
• L. Xie, X. Zhang, and Z. Guo, “CLS: A Cross-user Learning based System for Improving QoE in 360-degree Video Adaptive Streaming”, ACM Multimedia Conference 2018
✓ Clustering for VR users
✗ Euclidean distance as distance metric
• S. Petrangeli, G. Simon, and V. Swaminathan, “Trajectory-Based Viewport Prediction for 360-Degree Virtual Reality Videos”, IEEE Conference on Artificial Intelligence and Virtual Reality 2018
✓ Clustering of trajectories + prediction
✓ Spherical geometry taken into account
✗ “Classical” clustering method
Our Proposed Approach
Step 1: To evaluate users’ similarity as a threshold-based geodesic distance
Step 2: To propose a clique-based clustering method based on the metric derived in Step 1
Threshold-based geodesic distance
Users are similar if they share at least a portion Oth of their viewports (say 80%)
How do we translate this into geodesic distance?
[Figure: left column, positions of users’ viewport centres (theta vs. phi); right column, the corresponding video frame. Panels: (a, b) Rollercoaster, frame 1480; (c, d) Elephant, frame 1308; (e, f) Diving, frame 1265; (g, h) Timelapse, frame 1228]
Do users at a max distance of G have an overlap of at least Oth? (false positive, true negative, …)
ROC Evaluation
We look at the ROC curve averaged across videos
[Thesis excerpt] In general, we evaluated this curve using all the selected videos and comparing all possible pairs of users. Figure 4.4 shows our result. The best value of geodesic distance is π/10, since it corresponds to a TPR value very close to 1, which in our application means a strong matching between the neighbours detected with the overlap and with the geodesic distance. At the same time, the FPR for π/10 is around 0.1, which means that only 10% of the time two users are wrongly identified as neighbours in terms of geodesic distance. For our application, we want to be sure to detect all pairs of neighbours and we can tolerate an optimistic prediction; therefore π/10 is a suitable value.
Figure 4.4: ROC curve to evaluate the geodesic distance threshold value considering all selected videos.
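To make the ROC evaluation concrete, here is a minimal sketch of how one point of the curve could be computed: the geodesic test "distance ≤ Gth" is treated as a prediction of the ground truth "overlap ≥ Oth". The function name `roc_point` and the sample numbers are hypothetical, chosen only for illustration; they are not data from the talk.

```python
def roc_point(distances, overlaps, g_th, o_th=0.8):
    """Return (TPR, FPR) when 'distance <= g_th' predicts 'overlap >= o_th'.

    distances and overlaps hold one value per pair of users.
    """
    tp = fp = fn = tn = 0
    for distance, overlap in zip(distances, overlaps):
        predicted = distance <= g_th   # neighbours according to geodesic distance
        actual = overlap >= o_th       # neighbours according to viewport overlap
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Hypothetical pairwise measurements (radians, overlap fraction).
sample_distances = [0.1, 0.2, 0.5, 0.9]
sample_overlaps = [0.90, 0.85, 0.82, 0.30]
strict = roc_point(sample_distances, sample_overlaps, g_th=0.3)
loose = roc_point(sample_distances, sample_overlaps, g_th=1.0)
print(strict, loose)
```

Sweeping `g_th` over a range of values and plotting the resulting (FPR, TPR) pairs gives the ROC curve from which a threshold can be picked.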
26
Threshold-based geodesic distance
Only users whose distance is below the threshold Gth can be neighbours in a graph representation
Step 1: ROC to evaluate Gth (done once for all videos)
Fig. 3. ROC curve to evaluate optimal Gth considering all videos in database [13] and Oth = 80%.
Example adjacency matrix for four users:
1 1 0 1
1 1 0 1
0 0 1 0
1 1 0 1
[Paper excerpt] 3. CLIQUE-BASED CLUSTERING ALGORITHM
We now describe the proposed clustering algorithm, aimed at identifying clusters of users having a common viewport overlap. We model the evolution of users’ viewports over a time-window T as a set of graphs $\{G_t\}_{t=1}^{T}$. Each unweighted and undirected graph $G_t = \{V, E_t, W_t\}$ represents the set of users navigating over time, where $V$ and $E_t$ denote the node and edge sets of $G_t$. Each node in $V$ corresponds to a user interacting with the 360-degree content at instant $t$. Each edge in $E_t$ connects neighbouring nodes, where two nodes are neighbours if the geodesic distance between the viewport centres associated with the users represented by the nodes is lower than $G_{th}$, as defined in Section II. The binary matrix $W_t$ is the adjacency matrix of $G_t$, with $w_t(i, j) = 1$ if users $i$ and $j$ are neighbours. More formally:
$$w_t(i, j) = \begin{cases} 1, & \text{if } g(i, j) \le G_{th} \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$
where $g(i, j)$ is the geodesic distance between the viewport centres of users $i$ and $j$, and $G_{th}$ is the thresholding value introduced in Section II.
Looking at the graphs over time $\{G_t\}_{t=1}^{T}$, we are interested in clustering users based on their trajectories within a time window of duration T. In other words, we are interested in identifying users that have similar behaviour over time. With this goal in mind, we derive an affinity matrix A that will be the input to our clustering algorithm, similarly to other clustering of trajectories [23]. Each element of A is defined as follows: …
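The thresholding rule of Eq. (1) can be sketched in a few lines: compute the pairwise geodesic distance between viewport centres at one instant and set an entry to 1 when it is at most Gth. The helper names, the (theta, phi) convention and the four sample centres are assumptions for this example; the threshold π/10 is the value quoted in the ROC discussion.

```python
import math

def geodesic(centre_a, centre_b):
    """Great-circle distance (radians); centres are (longitude, polar angle)."""
    (ta, pa), (tb, pb) = centre_a, centre_b
    dot = (math.sin(pa) * math.sin(pb) * math.cos(ta - tb)
           + math.cos(pa) * math.cos(pb))
    return math.acos(max(-1.0, min(1.0, dot)))

def adjacency_matrix(centres, g_th):
    """Binary matrix with w(i, j) = 1 if g(i, j) <= g_th, else 0."""
    n = len(centres)
    return [[1 if geodesic(centres[i], centres[j]) <= g_th else 0
             for j in range(n)]
            for i in range(n)]

# Hypothetical viewport centres of four users at one time instant.
centres = [
    (0.00, math.pi / 2),
    (0.10, math.pi / 2),
    (math.pi, math.pi / 2),          # looking in the opposite direction
    (0.05, math.pi / 2 + 0.05),
]
w_t = adjacency_matrix(centres, g_th=math.pi / 10)
for row in w_t:
    print(row)
```

With these sample centres, users 0, 1 and 3 are mutual neighbours while user 2 is isolated, which is the same pattern as the 4×4 matrix shown on the slide.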
Clique-Based Clustering
In graph theory, a clique is a set of nodes that are all connected to each other
Looking at the viewport centres as nodes of graphs, we can propose a clique-based clustering
Fig. 4. Graphical example of the proposed clique clustering.
C. Bron and J. Kerbosch, “Algorithm 457: finding all cliques of an undirected graph,” Communications of the ACM, vol. 16, no. 9, 1973.
Clique-Based Clustering
• adjacency matrix constructed based on the threshold-based geodesic distance
• elements in the clusters are all neighbours (as only cliques can be clusters)
Each cluster identifies users with a substantial viewport overlap!
Algorithm 1 Clique-Based Clustering
Input: $\{G_t\}_{t=1}^{T}$, $D$
Output: $K$, $\mathbf{Q} = [Q_1, \dots, Q_K]$
Init: $i = 1$, $A^{(1)} = I_D(\sum_t W_t)$, $\mathbf{Q} = [\{\emptyset\}, \dots, \{\emptyset\}]$
repeat
  $\mathbf{C} = [C_1, \dots, C_L] \leftarrow \mathrm{KB}(A^{(i)})$
  $l^\star = \arg\max_l |C_l|$
  $Q_i = C_{l^\star}$
  $A^{(i+1)} = A^{(i)}(\mathbf{C} \setminus C_{l^\star})$
  $i \leftarrow i + 1$
until $A^{(i)}$ is empty
$K = i - 1$
1. Maximal cliques in the graph are derived with the Bron-Kerbosch algorithm.
2. Among the resulting cliques, only the most populated one (i.e., the one with the largest cardinality) is kept as a cluster.
3. A new affinity matrix is built by eliminating the entries corresponding to the elements of the cluster identified in Step 2.
These three steps are repeated until all nodes are assigned to clusters. It is worth mentioning that this iterative selection does not …
Rossi, S., De Simone, F., Frossard, P., & Toni, L., “Spherical clustering of users navigating 360 content”, IEEE ICASSP 2019.
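The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes an already-built binary affinity matrix as input (how that matrix is derived from the per-frame graphs is not shown), uses a plain Bron-Kerbosch recursion without pivoting, and breaks ties between equally large cliques by the smallest user index. All function names are hypothetical.

```python
def neighbours_from_matrix(matrix):
    """Turn a binary affinity matrix into {node: set of neighbours}, ignoring the diagonal."""
    return {i: {j for j, value in enumerate(row) if value and j != i}
            for i, row in enumerate(matrix)}

def bron_kerbosch(adj, r, p, x, cliques):
    """Append every maximal clique of the graph 'adj' to 'cliques'."""
    if not p and not x:
        cliques.append(set(r))
        return
    for v in list(p):
        bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v], cliques)
        p = p - {v}
        x = x | {v}

def clique_clustering(matrix):
    """Repeatedly keep the largest maximal clique as a cluster and remove its nodes."""
    adj = neighbours_from_matrix(matrix)
    remaining = set(adj)
    clusters = []
    while remaining:
        # Step 3 of the previous round: restrict the graph to unassigned users.
        sub = {u: adj[u] & remaining for u in remaining}
        # Step 1: all maximal cliques of the remaining graph.
        cliques = []
        bron_kerbosch(sub, set(), set(remaining), set(), cliques)
        # Step 2: keep the most populated clique (ties: smallest user index).
        best = max(cliques, key=lambda c: (len(c), -min(c)))
        clusters.append(best)
        remaining -= best
    return clusters

# The 4x4 example matrix from the slides: users 0, 1, 3 are mutual neighbours.
example = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
]
clusters = clique_clustering(example)
print(clusters)
```

Because every cluster is a clique, each pair of users inside a cluster satisfies the distance threshold, which is the property the slides highlight. The price is that enumerating maximal cliques can be expensive on dense graphs.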
Simulations: Settings
• Users’ navigation dataset from IMT Atlantique
• Proposed clustering compared with
  • K-means
  • Community detection algorithm
  • Spectral clustering of trajectories
• Videos: “Rollercoaster”, “Timelapse NY”
Results - Clustering of Trajectories
[Figure: users’ viewport centres (theta vs. phi). Panels: Spectral clustering of trajectories; Proposed clique-based clustering]
Trajectory clustering
[Figure (a): Rollercoaster video, T = 3 s. Overall intersection of viewports (%) over time (sec). Legend: Clique clustering (57.45%); SC - T = 3 s (8.12%); SC - entire video (29.52%); SC - K given (49.85%)]
Analysis based on Clusters
Open Questions
• Can we improve the clustering?
• Can we better analyse users’ similarity?
• Do we know which factors impact users’ behaviour and similarity?
Outline
Users’ navigation pattern analysis:
• a clustering approach
• a device-based study & a use case application
• an information-theory approach
S. Rossi, C. Ozcinar, A. Smolic and L. Toni. “Do users behave similarly in VR? Investigation of the influence on the
system design”, ACM Transactions on Multimedia Computing Communications and Applications (2020).
Key Motivation
• How does our clustering algorithm perform?
➡ Collected new dataset
➡ Developed further our analysis
• Can we benefit from our analysis in an application scenario?
➡ Proposed a user-centric server optimisation problem and compared results with respect to our analysis
Dataset Collection
Material
• 15 videos + 3 test videos (20 sec | 30 fps | 2560x1440 resolution)
• 5 videos per category (Documentary, Action, Movie)
Collecting data
• Implementation test-platform*
[Diagram labels: 360 Video Renderer, Scene, Texture, Scene objects, Camera, Mesh, Sphere Geometry, Sensors, ODV Texture (360 × 180), Viewport trajectories, MySQL]
VR subjective test
• 94 participants took part in our subjective experiment:
➡ 2/3 from UCL and 1/3 from TCD
➡ 65 males and 29 females
➡ aged between 21 and 52 (avg. 31 years)
Dataset Collection
Publicly available dataset with users’ head trajectories while using three devices (laptop, tablet, HMD)
https://v-sense.scss.tcd.ie/research/3dof/vr_user_behaviour_system_design/
https://github.com/V-Sense/VR_user_behaviour
Viewport angular velocity
• Users navigate the content more dynamically with the laptop
• Movies are explored more slowly with all devices
• HMD has the lowest speed across devices and video categories
➡ Users’ behaviour changes based not only on the video content category but also on the selected viewing device
Viewport center distribution
[Figure: (a) viewport angular velocity, (b) viewport center distribution]
A user affinity metric
[Paper excerpt] … We therefore introduce a novel metric (based on … algorithm) to better reflect similarity among users’ navigation trajectories … We define this metric as the User Affinity Index (UAI), given as follows:
$$\mathrm{UAI} = \frac{\sum_{i=1}^{C} x_i \cdot w_i}{\sum_{i=1}^{C} w_i}$$
In other words, the UAI is a weighted average of the cluster populations $x_i$, with weights $w_i$. The clique-based clustering is applied with a geodesic distance threshold equal to π/8.
• $C$: number of clusters detected in a frame by the clique-clustering
• $x_i$: % of users in cluster $i$
• $w_i$: number of users in cluster $i$
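A small worked example may help read the UAI formula. The sketch below takes the cluster sizes found in one frame and evaluates the weighted average; expressing x_i as a fraction in [0, 1] rather than a percentage is an assumption made here so that the result lies on the 0 to 1 scale, and the function name `user_affinity_index` is hypothetical.

```python
def user_affinity_index(cluster_sizes, total_users=None):
    """UAI = sum(x_i * w_i) / sum(w_i) for the clusters detected in one frame.

    w_i is the number of users in cluster i and x_i = w_i / total_users is the
    share of the population in that cluster (a fraction, by assumption).
    """
    if total_users is None:
        total_users = sum(cluster_sizes)
    numerator = sum((w / total_users) * w for w in cluster_sizes)
    denominator = sum(cluster_sizes)
    return numerator / denominator

# Four users split into one cluster of three and one singleton.
mixed = user_affinity_index([3, 1])
# Everybody looks at the same region: a single cluster.
focused = user_affinity_index([4])
# Everybody looks somewhere different: four singleton clusters.
scattered = user_affinity_index([1, 1, 1, 1])
print(mixed, focused, scattered)
```

The index is highest when all users fall into a single cluster and shrinks as the population fragments into many small clusters.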
User Affinity
[Figure: User Affinity over time (sec), Documentary (1 - Baby Pandas). Legend: Clustering only HMD (44.91%); Clustering only Laptop (35.10%); Clustering only Tablet (49.27%); Clustering all devices (35.51%)]
Affinity affected by content
User Affinity
[Figure: User Affinity over time (sec), Movie (12 - Help). Legend: Clustering only HMD (79.20%); Clustering only Laptop (67.59%); Clustering only Tablet (46.45%); Clustering all devices (60.09%)]
Affinity different based on device
A user-centric server optimization
ILP for optimal VR representations to be stored at the main server?
[Diagram labels: Content Provider, Ingest Server, Tile-based Encoder, Optimal set to store, CDN, Network information, Interactive Users, Navigation Based Adaptation Logic, Head Movement]
[Figure: stored bit rate (Mbps) and UAI over time (sec), e.g. (a) Documentary (ID 03): total bitrate; % of stored representations over time]
Is there any correlation between stored bitrate and UAI?
Take-Home Message
• Sequences with no main focus of attention:
‣ users experience a low affinity, not perturbed by the viewing device.
• Sequences with a main focus of attention:
‣ users’ affinity is strongly related to the selected viewing device.
‣ HMD leads to quite similar navigation among users.
• User-centric server optimization:
‣ The users’ behaviour during navigation affects the resource allocation of the optimal set
‣ UAI provides a good representation of the existing correlation between users’ behaviour and optimal set
‣ UAI could be a key metric in the design of next-generation systems.
What else?
• UAI is a clustering-based metric. Is this enough?
• Which metrics are usually considered in human-trajectory studies?
Outline
Users’ navigation pattern analysis:
• a clustering approach
• a device-based study & a use case application
• an information-theory approach
S. Rossi, and L. Toni. “Understanding User Navigation in Immersive Experience: an Information-Theoretic
analysis”, In International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE’20)
User Behaviour Analysis in VR system
A) Experiments
B) Raw Data Collected (users × videos)
C) Pre-Processing: $u_i = \{(x_1, t_1), \dots, (x_n, t_n)\}$
D) User’s Trajectories Analysis (user $u_i$ across videos $v_1, v_2, \dots, v_j$)
Intra-user behaviour analysis: Actual Entropy, Fixation map Entropy
To characterise the navigation of each user over time against different video contents.
Inter-user behaviour analysis: Mutual Information, Transfer Entropy
To study the behaviour of a single user in correlation with others in the same content.
Intra-User behaviour metrics
Entropy
A key quantity in information theory that measures the uncertainty associated with an event.
$$H(X) = -\sum_{x \in X} p(x) \log p(x)$$
Actual Entropy
Introduced as a proxy of predictability of human mobility patterns [1], the actual entropy quantifies the information carried within a given trajectory.
$$H^{act}(X) \approx \left( \frac{1}{n} \sum_{t=1}^{n} \lambda_t \right)^{-1} \log_2(n)$$
[1] C. Song, Z. Qu, N. Blumm, and A. Barabási. 2010. Limits of predictability in human mobility. Science.
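The two intra-user formulas can be tried on a toy trajectory. The sketch below assumes the trajectory has already been quantised into a sequence of discrete symbols (for instance tile indices, which is an assumption of this example), and it takes λ_t to be the length of the shortest sub-sequence starting at position t that has not appeared earlier in the sequence, as in the Lempel-Ziv style estimator used by [1]. Function names are hypothetical, and conventions for the end of the sequence vary between implementations.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """H(X) = -sum p(x) log2 p(x), with p estimated from symbol frequencies."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def _appears_in(history, pattern):
    """True if 'pattern' occurs as a contiguous run inside 'history'."""
    m = len(pattern)
    return any(history[i:i + m] == pattern
               for i in range(len(history) - m + 1))

def actual_entropy(sequence):
    """Estimate H_act ~ (mean of lambda_t)^-1 * log2(n) for a symbol sequence."""
    n = len(sequence)
    if n < 2:
        return 0.0
    total = 0
    for t in range(n):
        length = 1
        # Grow the sub-sequence until it has not been seen before position t.
        while t + length <= n and _appears_in(sequence[:t], sequence[t:t + length]):
            length += 1
        total += length
    return n * math.log2(n) / total

# Hypothetical quantised trajectories: one user stays put, one keeps moving.
static_user = list("aaaaaaaa")
roaming_user = list("abcdefgh")
print(actual_entropy(static_user), actual_entropy(roaming_user))
print(shannon_entropy(static_user), shannon_entropy(roaming_user))
```

On such short sequences the estimate is coarse, but the ordering matches the interpretation on the slides: the trajectory that keeps visiting new regions gets the higher actual entropy.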
Intra-User behaviour analysis
[Figure: panels A and B; example navigation trajectories of Users 30, 48 and 49, with numeric markers from 2 to 60 along each path]
User 30: $H^{act}(X) = 0.12$, $H(M) = 0.21 \cdot 10^{-2}$
User 48: $H^{act}(X) = 0.65$, $H(M) = 0.43 \cdot 10^{-2}$
User 49: $H^{act}(X) = 0.28$, $H(M) = 0.32 \cdot 10^{-2}$
➡ High $H^{act}$ indicates more randomness in the navigation
X. Corbillon, F. De Simone, and G. Simon. 2017. 360-degree video head movement dataset. In Proceedings of the 8th ACM on Multimedia Systems Conference.
Inter-User behaviour metrics
Mutual Information
Another fundamental metric of information theory, which measures the reduction of uncertainty of a random variable X provided by the knowledge of a second variable Y.
$$I(X, Y) = \sum_{x \in X,\, y \in Y} p(x, y) \log\left( \frac{p(x, y)}{p(x)\,p(y)} \right)$$
Transfer Entropy
Considering not only the occurrence of events but also their temporal ordering, this metric measures the reduction of uncertainty about the future value of a variable obtained by knowing the whole past history of itself and of a second variable.
$$TE(X \to Y) = H(Y_f \mid Y_p) - H(Y_f \mid X_p, Y_p)$$
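Both inter-user metrics can be illustrated on short discrete sequences. The sketch below estimates probabilities from symbol counts, uses the identity I(X, Y) = H(X) + H(Y) − H(X, Y) (equivalent to the sum in the formula above), and, as a simplifying assumption, truncates the "whole past history" in the transfer entropy to a single previous sample. Function names and the toy sequences are hypothetical.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy in bits of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def mutual_information(xs, ys):
    """I(X, Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def conditional_entropy(targets, conditions):
    """H(T | C) = H(T, C) - H(C)."""
    return entropy(list(zip(targets, conditions))) - entropy(conditions)

def transfer_entropy(xs, ys):
    """TE(X -> Y) = H(Y_f | Y_p) - H(Y_f | X_p, Y_p) with a one-sample history."""
    y_future, y_past, x_past = ys[1:], ys[:-1], xs[:-1]
    return (conditional_entropy(y_future, y_past)
            - conditional_entropy(y_future, list(zip(x_past, y_past))))

# Toy example: user Y repeats what user X did one step earlier.
leader = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
follower = [0] + leader[:-1]
te_forward = transfer_entropy(leader, follower)
mi_self = mutual_information(leader, leader)
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(te_forward, mi_self, mi_independent)
```

In the toy example the follower's next symbol is fully determined by the leader's current one, so knowing the leader's past removes the remaining uncertainty and the transfer entropy from leader to follower is large.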
Conclusions
We need to study, understand, and predict users’ behaviour when navigating in the spherical domain
• Clusters are meaningful if they identify users looking at the same portion of content
• We proposed a clique-based clustering to guarantee a viewport overlap among users in the same cluster
• Deeper analysis showed us a correlation between content, device and level of interactivity
• UAI can be a good metric for system design
• The above correlation can be formalised via information-theoretic metrics
• The intra-user behavioural analysis has shown:
‣ some users have consistent patterns across different contents
‣ the lack of a dominant focus of attention (FoA) leads to higher randomness in navigation trajectories
Future Directions
• To investigate further the link between content (FoAs), device and users’ navigation
• To be able to expand existing datasets
• To understand if the information-theoretic metrics have an impact on users’ prediction
• To extend the users’ behaviour analysis to 6DoF
Thank You! Questions?
Learning and Signal Processing Lab
UCL
https://laspucl2016.com

More Related Content

PDF
Predicting the “Stars of Tomorrow” on Social Media
PDF
User-centric Networks for Immersive Communication
PDF
Understanding user interactivity for immersive communications and its impact ...
PDF
PREVENTING COPYRIGHTS INFRINGEMENT OF IMAGES BY WATERMARKING IN TRANSFORM DOM...
PDF
A NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUES
PDF
[IJET V2I4P2] Authors:Damanbir Singh, Guneet Kaur
PPTX
A blind dual color images watermarking based on IWT and sub-sampling
PDF
1918 1923
Predicting the “Stars of Tomorrow” on Social Media
User-centric Networks for Immersive Communication
Understanding user interactivity for immersive communications and its impact ...
PREVENTING COPYRIGHTS INFRINGEMENT OF IMAGES BY WATERMARKING IN TRANSFORM DOM...
A NOVEL APPROACH FOR IMAGE WATERMARKING USING DCT AND JND TECHNIQUES
[IJET V2I4P2] Authors:Damanbir Singh, Guneet Kaur
A blind dual color images watermarking based on IWT and sub-sampling
1918 1923

What's hot (10)

PDF
Self Attested Images for Secured Transactions using Superior SOM
PDF
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
PDF
M0262076085
PDF
Digital video watermarking scheme using discrete wavelet transform and standa...
PDF
International Journal of Engineering Research and Development (IJERD)
PDF
Dataset creation for Deep Learning-based Geometric Computer Vision problems
PDF
Paper id 24201422
PDF
DeepVO - Towards Visual Odometry with Deep Learning
PPT
robust image watermarking
PDF
Ja2415771582
Self Attested Images for Secured Transactions using Superior SOM
Jiri ece-01-03 adaptive temporal averaging and frame prediction based surveil...
M0262076085
Digital video watermarking scheme using discrete wavelet transform and standa...
International Journal of Engineering Research and Development (IJERD)
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Paper id 24201422
DeepVO - Towards Visual Odometry with Deep Learning
robust image watermarking
Ja2415771582
Ad

Understanding Users Behaviours in User-Centric Immersive Communications

  • 1. Understanding Users Behaviours in User-Centric Immersive Communications Laura Toni UCL - University College London TEWI Colloquium 26 June 2020
  • 2. A massive thanks to Silvia Rossi (UCL), the PhD student behind this work. Our collaborators: Cagri Ozcinar (TCD), Aljosa Smolic (TCD), Pascal Frossard (EPFL), Francesca De Simone (CWI)
  • 4. Virtual Reality (VR): a virtual - rather than physical - world in which any user can be fully immersed and interactive
  • 5. 360º video streaming: main challenges • New spherical geometry • Large volume of data to store, deliver and display • Ultra-low-delay constraints over bandwidth-limited resources • Uncertainty on the portion of content that will be displayed by the user
  • 6. Toward a personalised streaming. What can we do? VR systems need to operate at scale, in a personalized manner, remaining bandwidth-tolerant whilst meeting quality and latency criteria • Viewport-aware adaptation logic • User-centric coding strategies • … One key challenge to reach this goal is to fully understand and anticipate user behaviours in these mixed reality settings. References: • S. Rossi and L. Toni, “Navigation-aware adaptive streaming strategies for omnidirectional video”, IEEE MMSP 2017. • Serhan Gül et al., “Low-latency Cloud-based Volumetric Video Streaming Using Head Motion Prediction”, ACM NOSSDAV 2020. • V. Swaminathan, M. Hosseini, “Prioritizing tile-based virtual reality video streaming using adaptive rate allocation”, US Patent App. 16/784,100.
  • 7. 360 Immersion Communication But how do users interact in this virtual environment? Can we predict users’ behaviour? 7
  • 8. Focus of Today • Can we identify navigation patterns? • Can we quantify users’ similarity in their navigation? • Can some users be more predictable than others? • Can the navigation pattern be representative of the single user? • How much is navigation affected by external factors (e.g., video content features)? How do users actually navigate in VR environments? 8
  • 9. Focus of Today's Talk: How do users actually navigate in VR environments? Applications: coding-streaming optimisation, VR therapists, live performance. Mu Mu et al., “User attention and behaviour in virtual reality encounter”, 2020. WHIST, AoE 2019.
  • 11. Main Goal: To design metrics and methodologies to analyse users’ behaviour in 360-degree videos, aiming at • identifying dominant behaviours of immersive navigation • quantifying similarities across contents and across users • analysing and quantifying the level of interaction of the user with the content
  • 12. Outline Users’ navigation pattern analysis: • a clustering approach • a device-based study & a use case application • an information-theory approach
  • 13. Current Analysis. Traditional metrics: • Mean exploration angles • Heat map • Angular velocity • Frequency of fixation. References: • X. Corbillon, F. De Simone, and G. Simon, “360-degree video head movement dataset”, ACM MMSys 2017. • A. Nguyen and Z. Yan, “A saliency dataset for 360-degree videos”, ACM MMSys 2019. • V. Sitzmann, A. Serrano, A. Pavel, Agrawala, D. Gutierrez, B. Masia, and G. Wetzstein, “Saliency in VR: How Do People Explore Virtual Environments?”, IEEE Transactions on Visualization and Computer Graphics, 2018. • Xu, M., Li, C., Zhang, S., & P. Le Callet, “State-of-the-art in 360 video/image processing: Perception, assessment and compression”, IEEE Journal of Selected Topics in Signal Processing, 14(1), 5-26, 2020.
  • 14. User Behaviour Analysis in VR system Traditional metrics Scenario A Scenario B But do these metrics capture all the actual trajectory behaviour ? 14 • Mean exploration angles • Heat map • Angular velocity • Frequency of fixation
  • 15. Scenario A Scenario B • Angular velocity • Frequency of fixation • Mean exploration angles • Heat map But do these metrics capture all the actual trajectory behaviour ? User Behaviour Analysis in VR system Traditional metrics FAIL 15
  • 16. Outline Users’ navigation pattern analysis: • a clustering approach • a device-based study & a use case application • an information-theory approach. Rossi, S., De Simone, F., Frossard, P., & Toni, L., “Spherical clustering of users navigating 360 content”, IEEE ICASSP 2019.
  • 17. Overall Goal: Let’s consider each user’s trajectory, looking at the viewport centres over time (t0, t1, t2, …)
  • 18. Overall Goal. Our Goal: To propose a clustering method able to cluster users based on their navigation patterns on the sphere. Given all users’ trajectories, we cluster them.
  • 19. Why Clustering in VR? 19 … to a more reliable heat map from clustering… … to a predicted navigation paths … to a practical dataset analysis … to identify key behaviours for optimal coding/QoE evaluation etc
  • 20. What Are the Main Challenges? • To take into account the spherical geometry of the spherical content: geodesic distance as distance metric. [Figure 3: Map projection from planar to sphere]
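As a minimal sketch of the distance metric named on this slide, the geodesic (great-circle) distance between two viewport centres can be computed as below. The angle convention (theta as longitude in [-π, π], phi as colatitude in [0, π]) and the function name are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import math

def geodesic_distance(theta1, phi1, theta2, phi2):
    """Great-circle distance in radians between two points on the unit sphere.

    Assumed convention: theta is the longitude in [-pi, pi],
    phi is the colatitude in [0, pi] (0 at the north pole).
    """
    def to_unit_vector(theta, phi):
        return (
            math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi),
        )

    a = to_unit_vector(theta1, phi1)
    b = to_unit_vector(theta2, phi2)
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp against rounding errors before taking the arc cosine.
    dot = max(-1.0, min(1.0, dot))
    return math.acos(dot)
```

Unlike a Euclidean distance on the equirectangular plane, this measure handles the wrap-around at theta = ±π and the shrinking of longitude differences near the poles.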
  • 21. What Are the Main Challenges? • To identify clusters that are meaningful in the VR domain! To adopt a metric that reflects the actual viewport overlap. The geodesic distance approximates the actual viewport overlap: […] is π/8 in both cases, Figure 1(a) and (b). However, the green viewport in the second figure is rotated by π/2. Even if it is an extreme situation, this rotation reduces the overlap from 87% to 58% of the total area. It follows that the geodesic distance is an approximation (and not an exact reflection) of the viewport overlap. The closer the viewport centres (i.e., the smaller the distance), the smaller the approximation error made by using the geodesic distance rather than the viewport overlap. At the same time, for large distances, the approximation error can be substantial. Therefore, in this paper we aim at finding a threshold value G_th that minimizes the discrepancy between these two metrics. [Fig. 1. Comparison of viewport overlap between viewports with centre distance π/10 but different rotation angles: (a) green and blue viewports with the same rotation, overlap 87%; (b) green viewport rotated by π/2, overlap 58%.]
  • 22. What Are the Main Challenges? • To identify clusters that are meaningful in the VR domain! To adopt a metric that reflects the actual viewport overlap. To identify users that are actually looking at the same portion of the sphere. Classical clustering methods do not guarantee this joint overlap.
  • 23. • “CLS: A Cross-user Learning based System for Improving QoE in 360-degree Video Adaptive Streaming” ACM Multimedia Conference on Multimedia Conference 2018 Authors: L. Xie, X. Zhang, and Z. Guo • “Trajectory- Based Viewport Prediction for 360-Degree Virtual Reality Videos” IEEE conference on Artificial Intelligence and Virtual Reality 2018 Authors: S. Petrangeli, G. Simon, and V. Swaminathan ✓ Clustering for VR users ✗ Euclidean distance as distance metric State-of-the-art ✓ Clustering of trajectories + prediction ✓ Spherical geometry taken into account ✗ “Classical” clustering method 23
  • 24. Our Proposed Approach. Step 1: To evaluate users’ similarity as a threshold-based geodesic distance. Step 2: To propose a clique-based clustering method based on the metric derived in Step 1.
  • 25. Threshold-based geodesic distance. Step 1: To evaluate users’ similarity as a threshold-based geodesic distance. Users are similar if they share at least a portion O_th of their viewports (say 80%): how do we translate this into geodesic distance? Do users at a max distance of G have an overlap of at least O_th? (false positive, true negative…) ROC evaluation. [Figure: users’ positions (theta, phi) and the corresponding frames for Rollercoaster (frame 1480), Elephant (frame 1308), Diving (frame 1265) and Timelapse (frame 1228)]
  • 26. Threshold-based geodesic distance. Users are similar if they share at least a portion O_th of their viewports (say 80%): how do we translate this into geodesic distance? We look at the ROC curve averaged across videos. In general, we evaluated this curve using all the selected videos and comparing all possible pairs of users. Figure 4.4 shows our result. The best value of geodesic distance is π/10, since it corresponds to a TPR value very close to 1, which in our application means a strong match between the neighbours detected with the overlap and those detected with the geodesic distance. At the same time, the FPR for π/10 is around 0.1, which means that only 10% of the time two users are wrongly identified as neighbours in terms of geodesic distance. For our application, we want to be sure to detect all pairs of neighbours and we can tolerate an optimistic prediction; therefore π/10 is a suitable value. [Figure 4.4: ROC curve to evaluate the geodesic distance threshold value considering all selected videos.]
  • 27. Threshold-based geodesic distance. Step 1: To evaluate users’ similarity as a threshold-based geodesic distance. ROC to evaluate G_th (done once for all videos). Only users below threshold G_th can be neighbours in a graph representation. [Example adjacency matrix, rows: (1 1 0 1), (1 1 0 1), (0 0 1 0), (1 1 0 1)] [Fig. 3. ROC curve to evaluate optimal G_th considering all videos in database [13] and O_th = 80%.]
    3. CLIQUE-BASED CLUSTERING ALGORITHM. We now describe the proposed clustering algorithm, aimed at identifying clusters of users having a common viewport overlap. We model the evolution of users’ viewports over a time-window T as a set of graphs $\{G_t\}_{t=1}^{T}$. Each unweighted and undirected graph $G_t = \{V, E_t, W_t\}$ represents the set of users navigating over time, where $V$ and $E_t$ denote the node and edge sets of $G_t$. Each node in $V$ corresponds to a user interacting with the 360 content at instant $t$. Each edge in $E_t$ connects neighbouring nodes, where two nodes are neighbours if the geodesic distance between the viewport centres associated to the users represented by the nodes is lower than $G_{th}$, as defined in Section II. The binary matrix $W_t$ is the adjacency matrix of $G_t$, with $w_t(i,j) = 1$ if users are neighbours. More formally:
    $$ w_t(i,j) = \begin{cases} 1, & \text{if } g(i,j) \le G_{th} \\ 0, & \text{otherwise} \end{cases} \qquad (1) $$
    where $g(i,j)$ is the geodesic distance between the viewport centres of users $i$ and $j$, and $G_{th}$ is the thresholding value introduced in Section II. Looking at the graphs over time $\{G_t\}_{t=1}^{T}$, we are interested in clustering users based on their trajectories within a time window of duration $T$. In other words, we are interested in identifying users that have similar behaviour over time. With this goal in mind, we derive an affinity matrix $A$ that will be the input to our clustering algorithm, similarly to other clustering of trajectories [23]. Each element of $A$ is defined as follows: […]
    [The excerpt continues with Algorithm 1 (Clique-Based Clustering) and its three-step description, truncated in the slide.]
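Equation (1) above can be sketched in a few lines. This is an illustrative reading of the formula, not the authors' code: the `(theta, phi)` tuple layout, the helper names and the use of plain Python lists are all assumptions.

```python
import math

def _geodesic(a, b):
    """Great-circle distance between two (theta, phi) points (assumed convention:
    theta = longitude, phi = colatitude, both in radians)."""
    (t1, p1), (t2, p2) = a, b
    dot = (math.sin(p1) * math.sin(p2) * math.cos(t1 - t2)
           + math.cos(p1) * math.cos(p2))
    return math.acos(max(-1.0, min(1.0, dot)))

def adjacency_matrix(centres, g_th):
    """Binary matrix W_t with w_t(i, j) = 1 if g(i, j) <= g_th, else 0.

    centres: one (theta, phi) viewport centre per user, for a single frame.
    The diagonal is 1 because g(i, i) = 0 always satisfies the threshold.
    """
    n = len(centres)
    return [
        [1 if _geodesic(centres[i], centres[j]) <= g_th else 0 for j in range(n)]
        for i in range(n)
    ]
```

For example, with three users on the equator at longitudes 0, 0.1 and π and a threshold of π/10, only the first two users end up connected.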
  • 28. Clique-Based Clustering. In graph theory, a clique is a set of nodes that are all connected to each other. Looking at the viewport centres as nodes of graphs, we can propose a clique-based clustering. [Fig. 4. Graphical example of the proposed clique clustering.] [Algorithm 1: Clique-Based Clustering (excerpt)] C. Bron and J. Kerbosch, “Algorithm 457: finding all cliques of an undirected graph,” Communications of the ACM, vol. 16, no. 9, 1973.
  • 29. Clique-Based Clustering • adjacency matrix constructed based on the threshold-based geodesic distance • elements in the clusters are all neighbours (as only cliques can be clusters). Each cluster identifies users with a substantial viewport overlap! [Fig. 4. Graphical example of the proposed clique clustering.]
    Algorithm 1 Clique-Based Clustering
      Input: $\{G_t\}_{t=1}^{T}$, $D$
      Output: $K$, $\mathbf{Q} = [Q_1, \dots, Q_K]$
      Init: $i = 1$, $A^{(1)} = I_D(\sum_t W_t)$, $\mathbf{Q} = [\{\emptyset\}, \dots, \{\emptyset\}]$
      repeat
        $\mathbf{C} = [C_1, \dots, C_L] \leftarrow \mathrm{KB}(A^{(i)})$
        $l^\star = \arg\max_l |C_l|$
        $Q_i = C_{l^\star}$
        $A^{(i+1)} = A^{(i)}(\mathbf{C} \setminus C_{l^\star})$
        $i \leftarrow i + 1$
      until $A^{(i)}$ is empty
      $K = i - 1$
    1. Maximal cliques in the graph are derived from the Bron-Kerbosch algorithm.
    2. Among the resulting cliques, only the most populated one (i.e., the one with largest cardinality) is kept as cluster.
    3. A new affinity matrix is built, by eliminating the entries corresponding to the elements of the cluster identified in Step 2.
    These three steps are repeated until all nodes are assigned to clusters. It is worth mentioning that this iterative selection does not […]
    Rossi, S., De Simone, F., Frossard, P., & Toni, L., “Spherical clustering of users navigating 360 content”, IEEE ICASSP 2019.
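The three steps above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it takes an already-binarised affinity matrix `A` as input (how `A` is derived from the per-frame matrices is not reproduced here), uses the basic Bron-Kerbosch recursion without pivoting, and the tie-breaking rule among equally large cliques is an assumption.

```python
def _bron_kerbosch(adj, r, p, x, out):
    """Basic Bron-Kerbosch recursion: collects every maximal clique in `out`."""
    if not p and not x:
        out.append(r)
        return
    for v in list(p):
        _bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v], out)
        p = p - {v}
        x = x | {v}

def clique_clustering(A):
    """Iteratively keep the largest maximal clique as a cluster, then remove its users.

    A: square binary affinity matrix (list of lists); A[i][j] = 1 if users i and j
       are considered neighbours over the time window. Self-loops are ignored.
    Returns a list of clusters, each a sorted list of user indices.
    """
    n = len(A)
    adj = {i: {j for j in range(n) if j != i and A[i][j]} for i in range(n)}
    remaining = set(range(n))
    clusters = []
    while remaining:
        # Restrict the graph to users that are not yet assigned.
        sub = {v: adj[v] & remaining for v in remaining}
        cliques = []
        _bron_kerbosch(sub, set(), set(remaining), set(), cliques)
        # Step 2: keep the most populated clique (ties: smallest user index first).
        best = max(cliques, key=lambda c: (len(c), -min(c)))
        clusters.append(sorted(best))
        # Step 3: drop the clustered users and repeat on what is left.
        remaining -= best
    return clusters
```

Because every cluster is a clique, all users in a cluster are pairwise neighbours, which is the property that classical clustering methods do not guarantee. Isolated users end up in singleton clusters.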
  • 30. •Users navigation data set from IMT Atlantique •Proposed clustering compared with •K-means •Community detection algorithm •Spectral Clustering of trajectories Simulations: Settings “Rollercoaster” “Timelapse NY” 30
  • 31. Results - Clustering of Trajectories. [Figure: users’ positions on the sphere (theta vs. phi) for spectral clustering of trajectories and for the proposed clique-based clustering]
  • 32. Results - Clustering of Trajectories. Trajectory clustering. [Figure (a): Rollercoaster video, T = 3 s. Overall intersection of viewports (%) over time (sec). Legend: Clique clustering (57.45%), SC - T = 3 s (8.12%), SC - entire video (29.52%), SC - K given (49.85%)]
  • 33. Analysis based on Clusters. [Figure: (a) Rollercoaster video]
  • 34. Open Questions • Can we improve the clustering? • Can we better analyse users’ similarity? • Do we know which factors impact the users’ behaviour and the similarity?
  • 35. Outline Users’ navigation pattern analysis: • a clustering approach • a device-based study & a use case application • an information-theory approach. S. Rossi, C. Ozcinar, A. Smolic and L. Toni. “Do users behave similarly in VR? Investigation of the influence on the system design”, ACM Transactions on Multimedia Computing Communications and Applications (2020).
  • 36. Key Motivation • How does our clustering algorithm perform? ➡ Collected a new dataset ➡ Further developed our analysis • Can we benefit from our analysis in an applicative scenario? ➡ Proposed a user-centric server optimisation problem and compared results with respect to our analysis
  • 37. Dataset Collection: VR subjective test. Material: • 15 videos + 3 test videos (20 sec | 30 fps | 2560x1440 resolution) • 5 videos per category (Documentary, Action, Movie). Collecting data: • 94 participants took part in our subjective experiment: ➡ 2/3 from UCL and 1/3 from TCD ➡ 65 males and 29 females ➡ aged between 21 and 52 (avg. 31 years). Implementation: test platform* [Diagram: 360 video renderer (scene objects, camera, sphere mesh geometry, ODV texture), sensors, viewport trajectories, MySQL]
  • 39. Viewport angular velocity • Users navigate the content more dynamically with a laptop • Movies are explored more slowly with all devices • HMD has the lowest speed across devices and video categories ➡ Users’ behaviour changes based not only on the video content category but also on the selected viewing device
  • 40. Viewport center distribution. [Figure: (b) Viewport center distribution]
  • 41. A user affinity metric. […] We therefore introduce a novel metric (based on […] algorithm) to better reflect similarity among users’ navigation trajectories […]. We define this metric as the User Affinity Index (UAI), given as follows:
    $$ \mathrm{UAI} = \frac{\sum_{i=1}^{C} x_i \cdot w_i}{\sum_{i=1}^{C} w_i} $$
    where • $C$: number of clusters detected in a frame by the clique-clustering • $x_i$: % of users (out of the whole population sampled) in cluster $i$ • $w_i$: number of users in cluster $i$. In other words, the UAI represents the weighted average of cluster popu[…]. The clique-based clustering is applied with a geodesic distance threshold equal to π/8. (ACM Trans. Multimedia Comput. Commun. Appl.)
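A small sketch of the UAI formula for a single frame is given below. It assumes that x_i is expressed as a fraction in [0, 1] (so that the index stays in the 0 to 1 range used in the plots) and that the population size is the total number of clustered users unless stated otherwise; the function and parameter names are hypothetical.

```python
def user_affinity_index(clusters, n_users=None):
    """UAI = sum_i(x_i * w_i) / sum_i(w_i) for one frame.

    clusters: list of clusters, each a collection of user ids.
    n_users:  size of the sampled population; defaults to the number of
              clustered users (assumption: every user belongs to one cluster).
    """
    weights = [len(c) for c in clusters]           # w_i: users in cluster i
    total = n_users if n_users is not None else sum(weights)
    if total == 0 or sum(weights) == 0:
        return 0.0
    fractions = [w / total for w in weights]       # x_i: share of users in cluster i
    return sum(x * w for x, w in zip(fractions, weights)) / sum(weights)
```

With all users in one cluster the index is 1, while with N users each in their own cluster it drops to 1/N, which matches the intuition that a dominant focus of attention yields a high affinity.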
  • 42. User Affinity. Documentary (1 - Baby Pandas): affinity affected by content. [Plot: user affinity over time (0 to 20 sec). Legend: Clustering only HMD (44.91%), Clustering only Laptop (35.10%), Clustering only Tablet (49.27%), Clustering all devices (35.51%)]
  • 43. User Affinity. Movie (12 - Help): affinity different based on device. [Plot: user affinity over time (0 to 20 sec). Legend: Clustering only HMD (79.20%), Clustering only Laptop (67.59%), Clustering only Tablet (46.45%), Clustering all devices (60.09%)]
  • 44. A user-centric server optimization. ILP for optimal VR representations to be stored at the main server? Is there any correlation between stored bitrate and UAI? [Diagram: content provider, tile-based encoder, ingest server (optimal set to store), CDN, navigation-based adaptation logic, interactive users (head movement, network information)] [Plots: stored bit rate (Mbps) and UAI over time (sec), e.g. (a) Documentary (ID 03): total bitrate; % of stored representations]
  • 45. Take-Home Message • Sequences with no main focus of attention: users experience a low affinity, not perturbed by the viewing device. • Sequences with a main focus of attention: ‣ users’ affinity is strongly related to the selected viewing device. ‣ HMD leads to quite similar navigation among users. • User-centric server optimization: ‣ The users’ behaviour during the navigation affects the resource allocation of the optimal set ‣ UAI provides a good representation of the existing correlation between users’ behaviour and optimal set ‣ UAI could be a key metric in the design of next-generation systems.
  • 46. What else? • UAI is a clustering-based metric. Is this enough? • Which metrics are usually considered in human-trajectory studies?
  • 47. Outline Users’ navigation pattern analysis: • a clustering approach • a device-based study & a use case application • an information-theory approach. S. Rossi and L. Toni. “Understanding User Navigation in Immersive Experience: an Information-Theoretic analysis”, In International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE’20)
  • 48. User Behaviour Analysis in VR system. Pipeline: A) Experiments (users watching videos v1, v2, …, vj) B) Raw data collected (users × videos) C) Pre-processing: u_i = (x_1, t_1), …, (x_n, t_n) D) User’s trajectories analysis. Intra-user behaviour analysis (Actual Entropy, Fixation map Entropy): to characterise the navigation of each user over time against different video contents. Inter-user behaviour analysis (Mutual Information, Transfer Entropy): to study the behaviour of a single user in correlation with others in the same content.
  • 50. Intra-User behaviour metrics. Entropy: a key quantity in information theory that measures the uncertainty associated with an event.
    $$ H(X) = -\sum_{x \in X} p(x) \log\big(p(x)\big) $$
    Actual Entropy: introduced as a proxy of predictability of human mobility patterns [1], the actual entropy quantifies the information carried within a given trajectory.
    $$ H^{act}(X) \approx \left( \frac{1}{n} \sum_{t=1}^{n} \lambda_t \right)^{-1} \log_2(n) $$
    [1] C. Song, Z. Qu, N. Blumm, and A. Barabási. 2010. Limits of predictability in human mobility. Science.
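The actual-entropy estimate above can be sketched as follows. The slide does not define λ_t; the sketch assumes the Lempel-Ziv style definition used in the cited mobility work [1], where λ_t is the length of the shortest sub-sequence starting at position t that has not appeared earlier in the sequence. It also assumes the trajectory has already been discretised into symbols (for example tile indices), which is a choice made here for illustration.

```python
import math

def _seen_before(history, pattern):
    """True if `pattern` occurs as a contiguous run inside `history`."""
    m = len(pattern)
    return any(history[k:k + m] == pattern for k in range(len(history) - m + 1))

def actual_entropy(sequence):
    """Estimate H_act ~ (1/n * sum_t lambda_t)^-1 * log2(n) for a symbol sequence.

    lambda_t (assumed definition): length of the shortest sub-sequence starting
    at position t that does not appear in positions 0..t-1.
    """
    seq = list(sequence)
    n = len(seq)
    if n < 2:
        return 0.0
    total = 0
    for t in range(n):
        length = 1
        # Grow the window until it becomes a pattern never seen before t.
        while t + length <= n and _seen_before(seq[:t], seq[t:t + length]):
            length += 1
        total += length
    return n * math.log2(n) / total
```

A repetitive sequence yields long matches (large λ_t) and therefore a low estimate, whereas a sequence that never revisits a symbol yields λ_t = 1 everywhere and the maximum value log2(n). The quadratic search is kept for readability and is only suited to short trajectories.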
  • 51. Intra-User behaviour analysis A B X. Corbillon, F. De Simone, and G. Simon. 2017. 360-degree video head movement dataset. In Proceedings of the 8th ACM on Multimedia Systems Conference. 51
  • 52. Intra-User behaviour analysis. [Figure: navigation trajectories of three users, panels A and B] User 30: H_act(X) = 0.12, H(M) = 0.21·10^-2. User 48: H_act(X) = 0.65, H(M) = 0.43·10^-2. User 49: H_act(X) = 0.28, H(M) = 0.32·10^-2. X. Corbillon, F. De Simone, and G. Simon. 2017. 360-degree video head movement dataset. In Proceedings of the 8th ACM on Multimedia Systems Conference.
  • 53. Intra-User behaviour analysis. [Figure: navigation trajectories of three users, panels A and B] User 30: H_act(X) = 0.12, H(M) = 0.21·10^-2. User 48: H_act(X) = 0.65, H(M) = 0.43·10^-2. User 49: H_act(X) = 0.28, H(M) = 0.32·10^-2. ➡ High H_act indicates more randomness in the navigation. X. Corbillon, F. De Simone, and G. Simon. 2017. 360-degree video head movement dataset. In Proceedings of the 8th ACM on Multimedia Systems Conference.
  • 54. Inter-User behaviour metrics. Pipeline: A) Experiments (users watching videos v1, v2, …, vj) B) Raw data collected (users × videos) C) Pre-processing: u_i = (x_1, t_1), …, (x_n, t_n) D) User’s trajectories analysis. Intra-user behaviour analysis (Actual Entropy, Fixation map Entropy): to characterise the navigation of each user over time against different video contents. Inter-user behaviour analysis (Mutual Information, Transfer Entropy): to study the behaviour of a single user in correlation with others in the same content.
  • 55. Inter-User behaviour metrics. Mutual Information: another fundamental metric of information theory that measures the reduction of uncertainty of a random variable X provided by the knowledge of a second variable Y.
    $$ I(X, Y) = \sum_{x \in X,\, y \in Y} p(x, y) \log\left( \frac{p(x, y)}{p(x)\,p(y)} \right) $$
    Transfer Entropy: considering not only the occurrence of events but also their temporal ordering, this metric measures the reduction of uncertainty about the future value of a variable by knowing the whole past history of itself and of a second variable.
    $$ TE(X \to Y) = H(Y_f \mid Y_p) - H(Y_f \mid X_p, Y_p) $$
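Both inter-user metrics can be estimated from empirical frequencies, as in the sketch below. It rests on simplifying assumptions that are not in the slides: the two trajectories are discretised symbol sequences of equal length, probabilities are plain frequency counts, and the "past" in the transfer entropy is truncated to a single previous sample instead of the whole history.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X, Y) in bits, estimated from the joint frequencies of two sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )

def _conditional_entropy(pairs):
    """H(A | B) in bits from a list of (a, b) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    marginal = Counter(b for _, b in pairs)
    return -sum((c / n) * math.log2(c / marginal[b]) for (_, b), c in joint.items())

def transfer_entropy(xs, ys):
    """TE(X -> Y) = H(Y_f | Y_p) - H(Y_f | X_p, Y_p), with a one-step past (assumption)."""
    y_future, y_past, x_past = ys[1:], ys[:-1], xs[:-1]
    h_own_past = _conditional_entropy(list(zip(y_future, y_past)))
    h_both_pasts = _conditional_entropy(list(zip(y_future, zip(x_past, y_past))))
    return h_own_past - h_both_pasts
```

If user Y simply repeats what user X did one step earlier, knowing X's past removes all uncertainty about Y's next position and the transfer entropy is positive, whereas two independent sequences give a mutual information of zero. Frequency-based estimates are biased for short sequences, so the values should be read as indicative.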
  • 56. Conclusions. We need to study, understand, and predict users’ behaviour when navigating in the spherical domain • Clusters are meaningful if they identify users looking at the same portion of content • We proposed a clique-based clustering to guarantee a viewport overlap among users in the same cluster • Deeper analysis showed a correlation between content-device and level of interactivity • UAI can be a good metric for system design • The above correlation can be formalised via information-theory metrics • The intra-user behavioural analysis has shown: ‣ some users have consistent patterns across different contents ‣ the lack of a dominant FoA leads to higher randomness in navigation trajectories
  • 57. Future Directions • To investigate further the link between content (FoAs), device, and users’ navigation • To be able to expand existing datasets • To understand if the information-theory metrics have an impact on the users’ prediction • To extend the users’ behaviour analysis to 6DoF
  • 58. Thank You! Questions? Learning and Signal Processing Lab UCL https://laspucl2016.com