Transcending our views to sequential data 

Markus Luczak-Roesch | @mluczak
University of Southampton, Web and Internet Science
http://markus-luczak.de
Content streams as automata [1]
[Figure: number of observed documents over time t, with the labels HF and LF]
“The key notion of TTM is burstiness – sudden increases in frequency of text
fragments, and all TTM methods aim to model burstiness.” [2]
[1] Kleinberg, J. (2003). Bursty and hierarchical structure in streams. Data
Mining and Knowledge Discovery, 7(4), 373-397.
[2] Subašić, I., & Berendt, B. (2013). Story graphs: Tracking document
set evolution using dynamic graphs. Intelligent Data Analysis, 17(1),
125-147.
[Figure: activity in System A, System B and System C along a shared time axis t]
Related activity?
Building transcendental information cascades
[...] conditionality.
In [20] we presented the initial definition of a transcendental
information cascade as a 4-tuple $TC = (V, E, R, F)$. This 4-tuple
represents a directed network consisting of a set of nodes $V$ and
edges $E$, derived when applying a set of matching functions $F$ to a
set of resources $R = \{r_1, r_2, ..., r_m\}$, $r_i = (u_i, t_i, c_i)$,
where every $u_i$ is a unique identifier of a resource $r_i$ that was
shared at the time $t_i$ with the content $c_i$. Nodes in the network
are those resources from $R$ that contain a set $I_i$ of one or
multiple cascade identifiers. A cascade identifier is any unique
informational pattern that is recognized by applying a matching
function to the content or any other inherent properties of a resource
(e.g. simple string matching algorithms to identify keywords in
content). Formally a matching function $f_k \in F$, $k \in \mathbb{N}$,
$k \le n$ is defined as:
\[
f_k(c_i) =
\begin{cases}
\{i_1, i_2, ..., i_x\} & \text{if } f_k \text{ matches patterns } \{i_1, i_2, ..., i_x\} \text{ in } c_i,\ x \in \mathbb{N} \\
\emptyset & \text{otherwise}
\end{cases}
\]
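As an illustration of the definition above, a matching function can be sketched as a plain function from content to a set of identifiers. The sketch below is a minimal example under stated assumptions: the name `match_hashtags` and the hashtag pattern are hypothetical and not taken from [20]; any other pattern recogniser (URIs, quotes, keywords) would fit the same shape.

```python
import re

# Assumption: identifiers are hashtags, recognised by a simple pattern.
HASHTAG = re.compile(r"#\w+")


def match_hashtags(content):
    """Hypothetical matching function f_k.

    Returns the set of identifiers {i_1, ..., i_x} found in the content,
    or the empty set if nothing matches.
    """
    return frozenset(HASHTAG.findall(content))


if __name__ == "__main__":
    print(sorted(match_hashtags("Storm warning #A#B")))
    print(match_hashtags("no identifiers here"))
```

Returning an empty set for "otherwise" keeps the later union over all matching functions free of special cases.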
Nodes $V$ and edges $E$ are then given as follows:
\[
V = \{v_1, v_2, ..., v_p\}, \quad v_y = (u_y, t_y, I_y),
\]
\[
E = \{e_1, e_2, ..., e_q\}, \quad e_z = (u_a, u_b, \Lambda_z)
\]
with $I_i = \{i_1, i_2, ..., i_o\} = f_1(c_i) \cup f_2(c_i) \cup ... \cup f_n(c_i)$
being the result of the concatenation of all identifiers found by all
matching functions (see footnote 2). An edge exists between any two
nodes that share a unique subset of all the cascade identifiers that
were found for them. Neither this subset nor any of its subsets is part
of the identifiers found for any node that was created in the time
period between when the two linked nodes were created.
\[
\Lambda_z = \{\, i_r \mid i_r \in I_a \wedge i_r \in I_b,\;
\forall i_r \rightarrow V' = \{ v_c \mid v_c = (u_c, t_c, I_c),\; i_r \in I_c \wedge t_a < t_c < t_b \} = \emptyset,\;
v_c \in V,\; r \in \mathbb{N},\; r \le |I_b| \,\}
\]
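The construction of nodes and labelled edges can be sketched as follows. This is a minimal sketch, not the implementation used in [20]: the function names are hypothetical, timestamps are assumed to be unique, and the "no node in between carries the identifier" condition is implemented by remembering, per identifier, the most recent node that contained it. The input reuses the fictive hashtag sequence from this paper excerpt.

```python
import re
from collections import defaultdict


def hashtags(content):
    """Hypothetical matching function: hashtags found in the content."""
    return set(re.findall(r"#\w+", content))


def build_cascade(resources, matchers):
    """Build nodes (u, t, I) and edges {(u_a, u_b): label set}.

    resources: iterable of (u, t, c) tuples; timestamps assumed unique.
    matchers:  iterable of matching functions returning sets.
    """
    nodes = []
    for u, t, c in sorted(resources, key=lambda r: r[1]):
        identifiers = set().union(*(f(c) for f in matchers))
        if identifiers:  # only resources with identifiers become nodes
            nodes.append((u, t, frozenset(identifiers)))

    edges = defaultdict(set)
    last_seen = {}  # identifier -> most recent node that contained it
    for u, _, identifiers in nodes:
        for i in identifiers:
            if i in last_seen:
                # No node between last_seen[i] and u carries i.
                edges[(last_seen[i], u)].add(i)
            last_seen[i] = u
    return nodes, dict(edges)


contents = ["#A", "#A#B", "#A", "#A", "#A#B#C", "#C", "#A", "#B#D", "#A"]
resources = [(f"r{n}", n, c) for n, c in enumerate(contents, start=1)]
nodes, edges = build_cascade(resources, [hashtags])

if __name__ == "__main__":
    for (a, b), label in sorted(edges.items()):
        print(a, "->", b, sorted(label))
```

Under these assumptions node 2 links to node 3 via #A and to node 5 via #B, because no node between 2 and 5 contains #B.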
A node that contains a cascade identifier that was not
detected for any other node before is called the identifier
root. Besides this, we call a node without any incoming edges
a network root and a node that has no outgoing edges a stub.
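The three node roles can be read directly off a cascade. The following is a small sketch under stated assumptions: the function name `classify` and the toy cascade are hypothetical, nodes are `(u, t, I)` tuples, and edges are given as `(u_a, u_b)` pairs.

```python
def classify(nodes, edges):
    """Return (identifier_roots, network_roots, stubs).

    nodes: list of (u, t, identifiers) tuples
    edges: iterable of (u_a, u_b) pairs
    """
    seen = set()
    identifier_roots = {}  # node id -> identifiers first seen at that node
    for u, _, identifiers in sorted(nodes, key=lambda n: n[1]):
        new = set(identifiers) - seen
        if new:
            identifier_roots[u] = new
        seen |= set(identifiers)

    has_incoming = {b for (_, b) in edges}
    has_outgoing = {a for (a, _) in edges}
    network_roots = [u for u, _, _ in nodes if u not in has_incoming]
    stubs = [u for u, _, _ in nodes if u not in has_outgoing]
    return identifier_roots, network_roots, stubs


# Toy cascade (hypothetical data): r1 -> r2 via #A, r2 -> r3 via #B.
toy_nodes = [("r1", 1, {"#A"}), ("r2", 2, {"#A", "#B"}), ("r3", 3, {"#B"})]
toy_edges = [("r1", "r2"), ("r2", "r3")]

if __name__ == "__main__":
    print(classify(toy_nodes, toy_edges))
```

Note that a node can be an identifier root without being a network root: r2 introduces #B but has an incoming edge via #A.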
Our cascade model clearly yields different outputs depending
on the data to hand (e.g. determined by the extent of the
Web crawl), and the matching algorithms determining which
cascade identifiers will be spotted (e.g. reuse of hashtags,
URIs, quotes, images, or maybe exploiting wider semantics
or sentiment) as depicted in Figure ??.
Footnote 2: Please note that [20] contains an unintentionally malformed
equation for this as the wrong symbol was used to refer to the
concatenation of the matching functions.
Fig. 1. Depending on the applied matching functions, different transcendental
information cascade representations can be generated for the same input data.
A fictive example of a transcendental cascade based on our
model is shown in Figure 2. Consider a system that features
hashtags as an established form of identifying content patterns.
The visualisation uses the following approach to represent
distinct identifiers and time: nodes are chronologically ordered
along the horizontal dimension from left (the oldest node)
to right (the most recent node); additionally, nodes are ordered
along the vertical dimension depending on the set of
identifiers present in a node (each unique set is assigned to
a distinct level). Consequently, the visualisation represents the
content creation sequence (“#A”) - (“#A#B”) - (“#A”) - (“#A”)
- (“#A#B#C”) - (“#C”) - (“#A”) - (“#B#D”) - (“#A”).
Fig. 2. Example of a cascade that emerges along five different identifiers.
#A, #A#B, #A#B#C, #B#D and #C are fictive hashtags (or hashtag combinations
respectively) treated as the identifying content patterns.
In order to understand how edges are labelled we highlight
the sub-graph involving the nodes 2, 3, 4, and 5. Conforming
to our cascade model an edge exists between nodes 2 and 3 [...]
[...]nding of its use but also an abstract global [...]
propose a new model that we call transcendental information
cascades. Informed by Kleinberg's work on
document streams [2] it regards time as
[...]le condition for relationships between any [...],
meaning that we focus on coincidence of
activities rather than socially-determined [...]
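The layout rule of the visualisation (chronological order on the horizontal axis, one vertical level per unique identifier set) can be sketched in a few lines. This is an illustrative sketch only: the function name `layout` is hypothetical, and assigning levels in order of first appearance is an assumption, since the excerpt does not say how levels are ordered.

```python
def layout(sequence):
    """Assign (x, y) positions to a chronological sequence of identifier sets.

    x is the chronological position (1 = oldest node).
    y is the level of the node's identifier set; levels are numbered in
    order of first appearance (an assumption, see the lead-in).
    """
    levels = {}
    positions = []
    for x, identifiers in enumerate(sequence, start=1):
        key = frozenset(identifiers)
        y = levels.setdefault(key, len(levels))
        positions.append((x, y))
    return positions, levels


# The fictive content creation sequence from the example.
sequence = [
    {"#A"}, {"#A", "#B"}, {"#A"}, {"#A"}, {"#A", "#B", "#C"},
    {"#C"}, {"#A"}, {"#B", "#D"}, {"#A"},
]
positions, levels = layout(sequence)

if __name__ == "__main__":
    for (x, y), identifiers in zip(positions, sequence):
        print(x, y, sorted(identifiers))
```

For the fictive sequence this yields five levels, one for each distinct identifier set in the example.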
Markus Luczak-Roesch, Ramine Tinati, and Nigel Shadbolt. 2015. When Resources Collide: Towards a
Theory of Coincidence in Information Spaces. To appear in WWW’15 Companion, May 18–22, 2015,
Florence, Italy. http://dx.doi.org/10.1145/2740908.2743973
Transcendental information cascades
[Figure: example cascade along time t with one level per identifier set: #A, #A#B, #A#B#C, #B#D, #C]
Cascade motifs as an indicator of state?
Markus Luczak-Roesch, Ramine Tinati, Max van Kleek, and Nigel Shadbolt. 2015. From
coincidence to purposeful flow? Properties of transcendental information cascades. In
IEEE/ACM International Conference on Advances in Social Networks Analysis and
Mining (ASONAM), Paris, FR.
Analyzing low-level properties of the multiple
states of a system that exist at the same time
[Figure: cascade motifs compared for Tags, URIs, and KID & APH: single node motifs, long uniform paths, short uniform paths, long non-uniform paths]
Analyzing low-level properties of the multiple
states of a system that exist at the same time
[Figure: identifier entropy for Tags, URIs, and KID & APH cascades]
Fig. 4. Overview of the results of the cascade comparison. Cascade size
distribution and Wiener index are plotted on a log-log scale; identifier
entropy is plotted with a log scale on the y-axis.
[...] contain one or few identifiers equally distributed. Very large identifiers [...]
[...] large identifiers (KID, APH, URIs), cascades which are based on [...]
Varying profiles of increasing randomness with growing cascade size
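The slides do not spell out how identifier entropy is computed. One plausible reading, offered here purely as an assumption, is the Shannon entropy of the identifier frequency distribution within a cascade; the function name `identifier_entropy` is hypothetical.

```python
import math
from collections import Counter


def identifier_entropy(node_identifier_sets):
    """Shannon entropy (in bits) of identifier occurrences across nodes.

    Assumption: every occurrence of an identifier in a node counts once.
    """
    counts = Counter(i for ids in node_identifier_sets for i in ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    print(identifier_entropy([{"#A"}, {"#A"}, {"#A"}]))
    print(identifier_entropy([{"#A"}, {"#B"}, {"#C"}, {"#D"}]))
```

Under this reading a cascade carried by a single identifier has zero entropy, while a cascade whose identifiers occur equally often has maximal entropy for its number of identifiers.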
From information co-occurrence to the discovery
of hidden structure in Wikipedia
Figure 1: Wikipedia edits in a three-dimensional space. The dimensions
are (1) time; (2) information diversity as the chronologi[...]
Tinati, Ramine, Luczak-Roesch, Markus, Hall, Wendy and Shadbolt, Nigel (2016) More than an
edit: using transcendental information cascades to capture hidden structure in Wikipedia. At
25th International World Wide Web Conference, Montreal, Canada, 11 - 15 Apr 2016. ACM (doi:
10.1145/2872518.2889401).
Tinati, R., Luczak-Rösch, M., & Hall, W. Finding Structure in Wikipedia Edit Activity: An
Information Cascade Approach. In WikiWorkshop 2016, co-located with WWW 2016.
Events detected:
•  Edward Snowden's speech at the SXSW conference
•  US Supreme Court case on same-sex marriage
(a) Cascade Article Network (CAN): Nodes represent unique
Wikipedia articles, edges are shared edits based on a shared
identifier matched. A force directed layout has been applied,
with edge path lengths determined by edge weight. The
strongly connected component (A) contains articles associated
with South Korean media, (B) and (C) contain articles related
to the USA.
(b) Cascade-to-Cascade path network graph: Nodes are cascades,
edges are the shared articles between cascades. The central
strongly connected component is established by the identifiers
shown in Table 3. A force directed layout has been applied,
with edge path lengths determined by edge weight.
Discrete vs. continuous data
Image source: https://en.wikipedia.org/wiki/Electroencephalography#/media/File:Spike-waves.png, CC BY-SA 2.0
EEG brain wave recordings
Linking based on similarity of spectral density
(Euclidean distance)
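One way to read this slide is that windows of a continuous signal are linked when their spectral densities are close in Euclidean distance. The sketch below illustrates that reading under stated assumptions: the function names, the naive periodogram estimate, the fixed window length and the distance threshold are all hypothetical and not taken from the talk.

```python
import cmath
import math


def power_spectrum(signal):
    """Naive periodogram: |DFT|^2 / n for the non-negative frequencies."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def euclidean(a, b):
    """Euclidean distance between two equally long spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def link_windows(windows, threshold):
    """Link every pair of windows whose spectra differ by at most threshold."""
    spectra = [power_spectrum(w) for w in windows]
    links = []
    for i in range(len(spectra)):
        for j in range(i + 1, len(spectra)):
            distance = euclidean(spectra[i], spectra[j])
            if distance <= threshold:
                links.append((i, j, distance))
    return links


# Synthetic windows (hypothetical data): two share a frequency, one differs.
N = 32
windows = [
    [math.sin(2 * math.pi * 2 * t / N) for t in range(N)],
    [math.cos(2 * math.pi * 2 * t / N) for t in range(N)],
    [math.sin(2 * math.pi * 5 * t / N) for t in range(N)],
]
links = link_windows(windows, threshold=1.0)

if __name__ == "__main__":
    print([(i, j) for i, j, _ in links])
```

Because the power spectrum discards phase, the sine and cosine windows at the same frequency are linked, while the window at a different frequency is not.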
[Figure: time axis t with F1 … Fn and cascades C11, C21, C22, C23]
Formalising the multiple possible representations of
a system at any time and their relationships.

Not all representing purposeful action but
reflecting useful informational properties.
•  Applying Transcendental Information Cascades to 
– data from the complex engineering industries (e.g. shipping)
– urban traffic data
– disaster response data
Reducing risk and enhancing security by understanding
coincidence in information spaces (RECOIN)
PI: Markus Luczak-Roesch
Transcendental Information Cascades
Generic time-ordered networks of information co-occurrence
[Figure: time axis t with F1 … Fn and cascades C11, C21, C22, C23; edge labels of the form t6 - t0, t2 - t1, t8 - t2, ...]

More Related Content

PDF
Dancing Links: an educational pearl
PDF
Sparse autoencoder
PDF
SNMP Project: SNMP-based Network Anomaly Detection Using Clustering
PDF
Linked list
PPTX
Deep Learning: R with Keras and TensorFlow
PPTX
Deep Learning, Scala, and Spark
PPTX
Java and Deep Learning (Introduction)
PPTX
Content addressable network(can)
Dancing Links: an educational pearl
Sparse autoencoder
SNMP Project: SNMP-based Network Anomaly Detection Using Clustering
Linked list
Deep Learning: R with Keras and TensorFlow
Deep Learning, Scala, and Spark
Java and Deep Learning (Introduction)
Content addressable network(can)

Viewers also liked (20)

PDF
Zooniverse - Through the Observatory
PDF
The Web Science MacroScope: Mixed-methods Approach for Understanding Web Acti...
PDF
From coincidence to purposeful flow? Properties of transcendental information...
PDF
Observation and Analysis of Social Machines
PDF
When resources collide: Towards a theory of coincidence in information spaces...
PDF
Web of Data Usage Mining
PPT
UKSG Conference 2016 Breakout Session - With Or Without You: subscription age...
PDF
measureup-datasheet
PDF
Context-free data analysis with Transcendental Information Cascades.
PDF
Earth Observation and Citizen Science with 1.4 Million Zooniverse Volunteers
PDF
Rebecca Stephens: Digital Design Artist
PPTX
Youngwriterscamp
PPT
The devil in the mirror tp ingles
PPTX
Statistical Analysis of Web of Data Usage
PDF
Chris Lintott
 
PDF
CHELLARAM DIABETES HOSPITAL IMG's
DOCX
Bộ ảnh những bà mẹ ngực trần cho con bú giữa thiên nhiên đầy cảm hứng
PDF
City's response to Kaawa
PDF
Shots in screenplay
Zooniverse - Through the Observatory
The Web Science MacroScope: Mixed-methods Approach for Understanding Web Acti...
From coincidence to purposeful flow? Properties of transcendental information...
Observation and Analysis of Social Machines
When resources collide: Towards a theory of coincidence in information spaces...
Web of Data Usage Mining
UKSG Conference 2016 Breakout Session - With Or Without You: subscription age...
measureup-datasheet
Context-free data analysis with Transcendental Information Cascades.
Earth Observation and Citizen Science with 1.4 Million Zooniverse Volunteers
Rebecca Stephens: Digital Design Artist
Youngwriterscamp
The devil in the mirror tp ingles
Statistical Analysis of Web of Data Usage
Chris Lintott
 
CHELLARAM DIABETES HOSPITAL IMG's
Bộ ảnh những bà mẹ ngực trần cho con bú giữa thiên nhiên đầy cảm hứng
City's response to Kaawa
Shots in screenplay
Ad

Similar to Transcending our views to sequential data (20)

PPTX
Temporal graph
PPTX
Temporal Network
PPTX
Knowledge Graphs and Milestone
PDF
Synthesis and performance analysis of network topology using graph theory
PDF
Synthesis and performance analysis of network topology using graph theory
PDF
Network Topology.PDF
PPT
Graphs in c language
PDF
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
PDF
Predictive Datacenter Analytics with Strymon
PDF
Network topology
PDF
Networks: Some Notes
PPT
Measurement and modeling of the web and related data sets
PPT
Graph and Trees.pptGraph and Trees.ppt i detailed topic about graph and trees
PPTX
Collaborative eventsourcing
PDF
Temporal networks - Alain Barrat
PPTX
Data Structure of computer science and technology
PPTX
Keynote at AImWD
PPTX
ppt 1.pptx
PDF
Building Identity Graphs over Heterogeneous Data
Temporal graph
Temporal Network
Knowledge Graphs and Milestone
Synthesis and performance analysis of network topology using graph theory
Synthesis and performance analysis of network topology using graph theory
Network Topology.PDF
Graphs in c language
Scalable and Efficient Algorithms for Analysis of Massive, Streaming Graphs
Predictive Datacenter Analytics with Strymon
Network topology
Networks: Some Notes
Measurement and modeling of the web and related data sets
Graph and Trees.pptGraph and Trees.ppt i detailed topic about graph and trees
Collaborative eventsourcing
Temporal networks - Alain Barrat
Data Structure of computer science and technology
Keynote at AImWD
ppt 1.pptx
Building Identity Graphs over Heterogeneous Data
Ad

Recently uploaded (20)

PDF
Wound infection.pdfWound infection.pdf123
PDF
Communicating Health Policies to Diverse Populations (www.kiu.ac.ug)
PPTX
Probability.pptx pearl lecture first year
PPTX
TORCH INFECTIONS in pregnancy with toxoplasma
PDF
CHAPTER 2 The Chemical Basis of Life Lecture Outline.pdf
PPTX
Substance Disorders- part different drugs change body
PPTX
Microbes in human welfare class 12 .pptx
PPT
Biochemestry- PPT ON Protein,Nitrogenous constituents of Urine, Blood, their ...
PPTX
BODY FLUIDS AND CIRCULATION class 11 .pptx
PPT
Presentation of a Romanian Institutee 2.
PPTX
perinatal infections 2-171220190027.pptx
PPTX
SCIENCE 4 Q2W5 PPT.pptx Lesson About Plnts and animals and their habitat
PPT
Enhancing Laboratory Quality Through ISO 15189 Compliance
PDF
CHAPTER 3 Cell Structures and Their Functions Lecture Outline.pdf
PDF
Cosmic Outliers: Low-spin Halos Explain the Abundance, Compactness, and Redsh...
PPT
veterinary parasitology ````````````.ppt
PPT
Heredity-grade-9 Heredity-grade-9. Heredity-grade-9.
PPTX
PMR- PPT.pptx for students and doctors tt
PDF
Looking into the jet cone of the neutrino-associated very high-energy blazar ...
PPT
THE CELL THEORY AND ITS FUNDAMENTALS AND USE
Wound infection.pdfWound infection.pdf123
Communicating Health Policies to Diverse Populations (www.kiu.ac.ug)
Probability.pptx pearl lecture first year
TORCH INFECTIONS in pregnancy with toxoplasma
CHAPTER 2 The Chemical Basis of Life Lecture Outline.pdf
Substance Disorders- part different drugs change body
Microbes in human welfare class 12 .pptx
Biochemestry- PPT ON Protein,Nitrogenous constituents of Urine, Blood, their ...
BODY FLUIDS AND CIRCULATION class 11 .pptx
Presentation of a Romanian Institutee 2.
perinatal infections 2-171220190027.pptx
SCIENCE 4 Q2W5 PPT.pptx Lesson About Plnts and animals and their habitat
Enhancing Laboratory Quality Through ISO 15189 Compliance
CHAPTER 3 Cell Structures and Their Functions Lecture Outline.pdf
Cosmic Outliers: Low-spin Halos Explain the Abundance, Compactness, and Redsh...
veterinary parasitology ````````````.ppt
Heredity-grade-9 Heredity-grade-9. Heredity-grade-9.
PMR- PPT.pptx for students and doctors tt
Looking into the jet cone of the neutrino-associated very high-energy blazar ...
THE CELL THEORY AND ITS FUNDAMENTALS AND USE

Transcending our views to sequential data

  • 1. Transcending our views to sequential data Markus Luczak-Roesch | @mluczak University of Southampton, Web and Internet Science http://markus-luczak.de
  • 2. HF LF [1] Kleinberg, Jon. "Bursty and hierarchical structure in streams." Data Mining and Knowledge Discovery 7.4 (2003): 373-397. [2] Subašić, I., & Berendt, B. (2013). Story graphs: Tracking document set evolution using dynamic graphs. Intelligent Data Analysis, 17(1), 125-147. Time Numberofobserveddocuments Content streams as automata [1] “The key notion of TTM is burstiness – sudden increases in frequency of text fragments, and all TTM methods aim to model burstiness.” [2] t
  • 3. System A System B System C Related activity? t
  • 4. Building transcendental information cascades conditionality. In [20] we presented the initial definition of a transcenden- tal information cascade as a 4-tupel TC = (V, E, R, F). This 4-tupel represents a directed network consisting of a set of nodes V and edges E, derived when applying a set of matching functions F to a set of resources R = {r1, r2, ..., rm}, ri = (ui, ti, ci), where every ui is a unique identifier of a resource ri that was shared at the time ti with the content ci. Nodes in the network are those resources from R that contain a set Ii of one or multiple cascade identifiers. A cascade identifier is any unique informational pattern that is recognized by applying a matching function to the content or any other inherent properties of a resource (e.g. simple string matching algorithms to identify keywords in content). Formally a matching function fk 2 F, k 2 N, k  n is defined as: fk(ci) = 8 >>>>>< >>>>>: {i1, i2, ..., ix} if fk matches patterns {i1, i2, ..., ix} in ci x 2 N ; otherwise Nodes V and edges E are then given as follows V ={v1, v2, ..., vp} vy = (uy, ty, Iy), E ={e1, e2, ..., eq} ez =(ua, ub, ⇤z) with Ii = {i1, i2, ..., io} = f1(ci) [ f2(ci) [ ... [ fn(ci) being the result of the concatenation of all identifiers found by all matching functions2 . An edge exists between any two nodes that share a unique subset of all the cascade identifiers that were found for them. This subset and none of its subsets is part of the identifiers found for any node that was created in the time period between when the two linked nodes were created. ⇤z ={ir| ir 2 Ia ^ ir 2 Ib, 8ir ! V 0 = {vc|vc = (uc,tc, Ic), ir 2 Ic ^ ta  tc  tb} = ;, vc 2 V, r 2 N, r  |Ib|} A node that contains a cascade identifier that was not detected for any other nodes before is called the identifier root. Beside this we call a node without any incoming edges a network root and node that has no outgoing edges a stub. 
network are those resources from R that contain a set Ii of e or multiple cascade identifiers. A cascade identifier is any que informational pattern that is recognized by applying matching function to the content or any other inherent perties of a resource (e.g. simple string matching algorithms dentify keywords in content). Formally a matching function 2 F, k 2 N, k  n is defined as: fk(ci) = 8 >>>>>< >>>>>: {i1, i2, ..., ix} if fk matches patterns {i1, i2, ..., ix} in ci x 2 N ; otherwise des V and edges E are then given as follows V ={v1, v2, ..., vp} vy = (uy, ty, Iy), E ={e1, e2, ..., eq} ez =(ua, ub, ⇤z) h Ii = {i1, i2, ..., io} = f1(ci) [ f2(ci) [ ... [ fn(ci) being result of the concatenation of all identifiers found by all tching functions2 . An edge exists between any two nodes t share a unique subset of all the cascade identifiers that re found for them. This subset and none of its subsets is t of the identifiers found for any node that was created in the e period between when the two linked nodes were created. ⇤z ={ir| ir 2 Ia ^ ir 2 Ib, 8ir ! V 0 = {vc|vc = (uc,tc, Ic), ir 2 Ic ^ ta  tc  tb} = ;, vc 2 V, r 2 N, r  |Ib|} A node that contains a cascade identifier that was not ected for any other nodes before is called the identifier t. Beside this we call a node without any incoming edges etwork root and node that has no outgoing edges a stub. r cascade model clearly yields different outputs depending the data to hand (e.g. determined by the extent of the Please note that [20] contains an unintentionally malformed equation for as the wrong symbol was used to refer to the concatenation of the matching ctions. Fig. 1. Depending on the applied matching functions, different transcendental information cascade representations can be generated for the same input data. A fictive example of a transcendental cascade based on our model is shown in Figure 2. Consider a system that features hashtags as an established form of identifying content patterns. 
The visualisation uses the following approach to represent distinct identifiers and time: Nodes are chronologically ordered alongside the horizontal dimension from left (the oldest node) to right (the most recent node); additionally nodes are ordered alongside the vertical dimension depending on the set of identifiers present in a node (each unique set is assigned to a distinct level). Consequently, the visualisation represents the content creation sequence (“#A”) - (“#A#B”) - (“#A”) - (“#A”) - (“#A#B#C”) - (“#C”) - (“#A”) - (“#B#D”) - (“#A”). Fig. 2. Example of a cascade that emerges along five different identifiers. #A, #B, #A#B#C, #B#D and #C are fictive hashtags (or hashtag combinations resepectively) treated as the indentifying content patterns In order to understand how edges are labelled we highlight the sub-graph involving the nodes 2, 3, 4, and 5. Conforming to our cascade model an edge exist between nodes 2 and 3 nding of its use but also an abstract global ropose a new model that we call transcen- ascades. Informed by Kleinbergs work on document streams [2] it regards time as le condition for relationships between any meaning that we focus on coincidence of activities rather than socially-determined nted the initial definition of a transcenden- ade as a 4-tupel TC = (V, E, R, F). This a directed network consisting of a set of E, derived when applying a set of matching et of resources R = {r1, r2, ..., rm}, ri = very ui is a unique identifier of a resource t the time ti with the content ci. Nodes in se resources from R that contain a set Ii of cade identifiers. A cascade identifier is any al pattern that is recognized by applying n to the content or any other inherent rce (e.g. simple string matching algorithms s in content). 
Each resource $r_i = (u_i, t_i, c_i)$ in $R$ has a unique identifier $u_i$ and was shared at the time $t_i$ with the content $c_i$. Nodes in the network are those resources from $R$ that contain a set $I_i$ of one or multiple cascade identifiers. A cascade identifier is any unique informational pattern that is recognized by applying a matching function to the content or any other inherent properties of a resource (e.g. simple string matching algorithms to identify keywords in content). Formally, a matching function $f_k \in F$, $k \in \mathbb{N}$, $k \le n$ is defined as:

\[
f_k(c_i) =
\begin{cases}
\{i_1, i_2, \ldots, i_x\} & \text{if } f_k \text{ matches patterns } \{i_1, i_2, \ldots, i_x\} \text{ in } c_i,\ x \in \mathbb{N} \\
\emptyset & \text{otherwise}
\end{cases}
\]

Nodes $V$ and edges $E$ are then given as follows:

\[
V = \{v_1, v_2, \ldots, v_p\}, \qquad v_y = (u_y, t_y, I_y),
\]
\[
E = \{e_1, e_2, \ldots, e_q\}, \qquad e_z = (u_a, u_b, \Lambda_z)
\]

with $I_i = \{i_1, i_2, \ldots, i_o\} = f_1(c_i) \cup f_2(c_i) \cup \ldots \cup f_n(c_i)$ being the result of the concatenation of all identifiers found by all matching functions (see footnote 2). An edge exists between any two nodes that share a unique subset of all the cascade identifiers that were found for them. Neither this subset nor any of its subsets is part of the identifiers found for any node that was created in the time period between the creation of the two linked nodes.

\[
\Lambda_z = \{\, i_r \mid i_r \in I_a \wedge i_r \in I_b,\ \forall i_r \rightarrow V' = \{v_c \mid v_c = (u_c, t_c, I_c),\ i_r \in I_c \wedge t_a \le t_c \le t_b\} = \emptyset,\ v_c \in V,\ r \in \mathbb{N},\ r \le |I_b| \,\}
\]

A node that contains a cascade identifier that was not detected for any other node before is called the identifier root. Besides this, we call a node without any incoming edges a network root and a node that has no outgoing edges a stub.

Our cascade model clearly yields different outputs depending on the data to hand (e.g. determined by the extent of the Web crawl) and on the matching algorithms determining which cascade identifiers will be spotted (e.g. reuse of hashtags, URIs, quotes, images, or maybe exploiting wider semantics or sentiment), as depicted in Figure 1.

Footnote 2: Please note that [20] contains an unintentionally malformed equation for this, as the wrong symbol was used to refer to the concatenation of the matching functions.

Fig. 1. Depending on the applied matching functions, different transcendental information cascade representations can be generated for the same input data.

A fictive example of a transcendental cascade based on our model is shown in Figure 2. Consider a system that features hashtags as an established form of identifying content patterns. The visualisation uses the following approach to represent distinct identifiers and time: nodes are chronologically ordered along the horizontal dimension from left (the oldest node) to right (the most recent node); additionally, nodes are ordered along the vertical dimension depending on the set of identifiers present in a node (each unique set is assigned to a distinct level). Consequently, the visualisation represents the content creation sequence (“#A”) - (“#A#B”) - (“#A”) - (“#A”) - (“#A#B#C”) - (“#C”) - (“#A”) - (“#B#D”) - (“#A”).

Fig. 2. Example of a cascade that emerges along five different identifiers. #A, #B, #A#B#C, #B#D and #C are fictive hashtags (or hashtag combinations, respectively) treated as the identifying content patterns.

In order to understand how edges are labelled, we highlight the sub-graph involving the nodes 2, 3, 4, and 5. Conforming to our cascade model, an edge exists between nodes 2 and 3 [...]

Markus Luczak-Roesch, Ramine Tinati, and Nigel Shadbolt. 2015. When Resources Collide: Towards a Theory of Coincidence in Information Spaces. To appear in WWW’15 Companion, May 18–22, 2015, Florence, Italy. http://dx.doi.org/10.1145/2740908.2743973
  • 6. Cascade motifs as an indicator of state? Markus Luczak-Roesch, Ramine Tinati, Max van Kleek, and Nigel Shadbolt. 2015. From coincidence to purposeful flow? Properties of transcendental information cascades. In IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, FR.
  • 7. Analyzing low-level properties of the multiple states of a system that exist at the same time. [Figure: cascade motifs for the Tags, URIs, and KID & APH cascades. Motif types: single node motifs, long uniform paths, short uniform paths, long non-uniform paths. Values shown: 4, 1, 15, 10.]
  • 8. Analyzing low-level properties of the multiple states of a system that exist at the same time. [Figure: identifier entropy for the Tags, URIs, and KID & APH cascades.] Fig. 4. Overview of the results of the cascade comparison. Cascade size distribution and Wiener index are plotted on a log-log scale; identifier entropy is plotted with a log scale on the y-axis. [Cropped text fragments: "... one or few identifiers equally distributed."; "... large identifiers (KID, APH, URIs), cascades which are based on varying profiles of increasing randomness with growing cascade size".]
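The two measures named on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact definitions: identifier entropy is taken to be the Shannon entropy (in bits) of how often each identifier occurs across the nodes of a cascade, and the Wiener index is the sum of shortest-path lengths over all unordered node pairs, with edge direction ignored. The function names `identifier_entropy` and `wiener_index` are hypothetical.

```python
import math
from collections import Counter, deque


def identifier_entropy(identifier_sets):
    """Shannon entropy (bits) of identifier occurrences across the nodes of a cascade."""
    counts = Counter(i for ids in identifier_sets for i in ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def wiener_index(edges):
    """Sum of shortest-path lengths over all unordered pairs of mutually reachable nodes.

    Assumption: edges are treated as undirected and unweighted.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    total = 0
    for source in adjacency:
        # Breadth-first search gives hop distances from the source node.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each endpoint


entropy_example = identifier_entropy([{"#A"}, {"#A", "#B"}, {"#B"}])
path_example = wiener_index([(1, 2), (2, 3)])
print(entropy_example, path_example)
```

In the usage lines, two identifiers occurring equally often give an entropy of one bit, and a three-node path has pairwise distances 1, 1, and 2.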
  • 9. From information co-occurrence to the discovery of hidden structure in Wikipedia
    Figure 1: Wikipedia edits in a three dimensional space. The dimensions are (1) time; (2) information diversity as the chronologi-
    Events detected:
    •  Edward Snowden speech at SXSW conference
    •  US Supreme Court case on same-sex marriage
    (a) Cascade Article Network (CAN): Nodes represent unique Wikipedia articles, edges are shared edits based on a shared identifier matched. A force directed layout has been applied, with edge path lengths determined by edge weight. The strongly connected component (A) contains articles associated with South Korean media, (B) and (C) contain articles related to the USA.
    (b) Cascade-to-Cascade path network graph: Nodes are cascades, edges are the shared articles between cascades. The central strongly connected component is established by the identifiers shown in Table 3. A force directed layout has been applied, with edge path lengths determined by edge weight.
    Tinati, Ramine, Luczak-Roesch, Markus, Hall, Wendy and Shadbolt, Nigel (2016) More than an edit: using transcendental information cascades to capture hidden structure in Wikipedia. At 25th International World Wide Web Conference, Montreal, Canada, 11 - 15 Apr 2016. ACM (doi: 10.1145/2872518.2889401).
    Tinati, R., Luczak-Rösch, M., & Hall, W. Finding Structure in Wikipedia Edit Activity: An Information Cascade Approach. In WikiWorkshop 2016, co-located with WWW 2016.
  • 10. Discrete vs. continuous data Image source: https://en.wikipedia.org/wiki/Electroencephalography#/media/File:Spike-waves.png, CC BY-SA 2.0
  • 11. EEG brain wave recordings Image source: https://en.wikipedia.org/wiki/Electroencephalography#/media/File:Spike-waves.png, CC BY-SA 2.0
  • 14. Linking based on similarity of spectral density (Euclidean distance)
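One way to read this slide is sketched below. It is an assumption-laden illustration rather than the method used in the talk: the signal is cut into fixed-size, non-overlapping windows, the spectral density of each window is estimated with a naive periodogram, and any two windows are linked when the Euclidean distance between their spectra does not exceed a threshold. The names `power_spectrum` and `link_similar_windows`, the window size, and the threshold are all hypothetical choices.

```python
import cmath
import math


def power_spectrum(window):
    """Naive periodogram: squared DFT magnitudes for frequency bins 0..N/2."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * idx / n)
                    for idx, x in enumerate(window))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def link_similar_windows(signal, window_size, threshold):
    """Link window a to a later window b when their spectra are within the threshold."""
    windows = [signal[s:s + window_size]
               for s in range(0, len(signal) - window_size + 1, window_size)]
    spectra = [power_spectrum(w) for w in windows]
    links = []
    for a in range(len(spectra)):
        for b in range(a + 1, len(spectra)):
            distance = euclidean(spectra[a], spectra[b])
            if distance <= threshold:
                links.append((a, b, distance))
    return links


# Synthetic stand-in for a continuous recording: slow, fast, then slow oscillation again.
size = 16
slow = [math.sin(2 * math.pi * 1 * i / size) for i in range(size)]
fast = [math.sin(2 * math.pi * 3 * i / size) for i in range(size)]
links = link_similar_windows(slow + fast + slow, window_size=size, threshold=1.0)
print([(a, b) for a, b, _ in links])
```

In this synthetic signal only the first and third windows share a spectral profile, so they are the only pair that gets linked. With real recordings the threshold would need to be chosen against the noise level of the data.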
  • 15. Formalising the multiple possible representations of a system at any time and their relationships. Not all of these represent purposeful action, but they reflect useful informational properties. [Diagram: time axis t; matching functions F1 … Fn; cascades C11, C21, C22, C23.]
  • 16. Applying Transcendental Information Cascades to
    – data from the complex engineering industries (e.g. shipping)
    – urban traffic data
    – disaster response data
    Reducing risk and enhancing security by understanding coincidence in information spaces (RECOIN). PI: Markus Luczak-Roesch
  • 17. Transcendental Information Cascades: generic time-ordered networks of information co-occurrence. [Diagram: time axis t; matching functions F1 … Fn; cascades C11, C21, C22, C23; edges labelled with time differences between linked nodes, e.g. t1 - t0, t2 - t1, t6 - t0.]