Discovering Common Motifs in Cursor
Movement Data
Dmitry Lagun, 2014
Emory University
1
Thank you!
2
Mikhail Ageev Qi Guo Eugene Agichtein
3
The Importance of Online User Attention
• “Attention is focused
mental engagement on a
particular item of
information.”
(Davenport & Beck 2001, p. 20)
Abundance of information
Scarcity of attention
4
The Importance of Online User Attention
• “Eye-mind Hypothesis”
[Just and Carpenter, 1980]
• “When a subject looks at a
word or object, he or she
also thinks about (process
cognitively), and for
exactly as long as the
recorded fixation.”
5
The Importance of Online User Attention
• Attention is critical for
science of cognition
(vision, language, memory)
• Many industry applications:
– Web search
intent, quality, presentation, satisfaction
– UI usability testing
– Display advertising, customer
engagement, branding
Measurement of Attention
• Eye Tracking
– Based on corneal reflection of infra-red light
Infra-red cameras
Users spend most of
the time on top
search results
6
Applications
Examination Strategies
[Buscher et al.]
Web Page Re-Design
[Leiva et al.]
Behavior Biased
Summaries
[Ageev et al.]
Query-Expansion &
Relevance Feedback
[Buscher et al.]
Parkinson, ADHD, FASD
[Tseng et al.]
Prediction of Cognitive
Impairment
[Zola et al.]
Search Relevance
[Guo & Agichtein]
Search Abandonment
[Huang et al.]
7
Applications
Examination Strategies
[Buscher et al.]
Web Page Re-Design
[Leiva et al.]
Behavior Biased
Summaries
[Ageev et al.]
Query-Expansion &
Relevance Feedback
[Buscher et al.]
Parkinson, ADHD, FASD
[Tseng et al.]
Prediction of Cognitive
Impairment
[Zola et al.]
Search Relevance
[Guo & Agichtein]
Search Abandonment
[Huang et al.]
8
Our focus
emory math and cs
9
Search
[Diagram labels: Search Logs, Web Pages, Search Engine, Ranking]
10
[Diagram labels: Search Logs, Web Pages, Search Engine, Ranking, click]
11
[Diagram labels: Search Logs, Web Pages, Search Engine, Ranking, "Relevant or Not?"]
12
Prior Work:
Cursor Movement on Landing Pages
• Post Click Behavior Model [Guo and Agichtein, WWW 2012]
• Two basic patterns: “Reading” and “Scanning”
Reading Scanning
“Reading”: consuming or verifying when
(seemingly) relevant information is found
“Scanning”: not yet found the
relevant information, still in the
process of visually searching
13
Post-Click Behavior (PCB) Data Improves Ranking
• PCB and PCB_User consistently outperform
DTR (baseline)
14
[Guo & Agichtein, WWW 2012]
[Guo, Lagun & Agichtein, CIKM 2012]
DTR = Dwell time + Rank
NDCG
Post-Click Behavior (PCB) Model Features
• Average cursor position, cursor
speed, direction
• Travelled distance, horizontal and vertical
ranges
• Max/Min cursor positions on the screen
• Scroll speed, frequency and scroll distance
• Cursor position in a region-of-interest
Can we automatically discover meaningful
features of cursor trajectory?
15
Our Approach: Cursor Motif Mining
Instead of engineering complex features, discover
common subsequences (motifs)
A motif is a frequently occurring sequence of cursor
movements.
Similar
16
Mouse Cursor Data: Challenges
• Different users examine web pages
at different speeds, and hence move
the mouse slower/faster.
• Similar types of movements can appear in
different parts of a web page (top
vs. bottom).
17
Mouse Cursor Data: Challenges
• Different users examine web pages
at different speeds, and hence move
the mouse slower/faster.
[Flexible Distance Metric, DTW]
• Similar types of movements can
appear in different parts of a web
page (top vs. bottom).
[Location Invariance: normalize
subsequence position]
18
Motif Discovery Pipeline
Generate Motif
Candidates
Discover
Frequent
Candidates
De-duplicate /
Output Motifs
Distance
Measure
19
Candidate Generation
window size
sliding window
Motif candidates
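The sliding-window step can be sketched in Python as below. This is only an illustration under stated assumptions, not the authors' implementation: the name `generate_candidates`, the `step` parameter, and the choice to normalize position by subtracting the first point of each window are hypothetical (the slides say only that the subsequence position is normalized).

```python
from typing import List, Tuple

Point = Tuple[float, float]


def generate_candidates(trajectory: List[Point], window_size: int,
                        step: int = 1) -> List[List[Point]]:
    """Cut a cursor trajectory into overlapping fixed-size windows.

    Each window is shifted so that it starts at (0, 0). This is one
    possible way to get location invariance (an assumption here).
    """
    candidates = []
    for start in range(0, len(trajectory) - window_size + 1, step):
        window = trajectory[start:start + window_size]
        x0, y0 = window[0]
        candidates.append([(x - x0, y - y0) for x, y in window])
    return candidates


# Toy trajectory of (x, y) cursor positions.
trajectory = [(10.0, 100.0), (12.0, 100.0), (14.0, 101.0),
              (16.0, 103.0), (18.0, 106.0)]
candidates = generate_candidates(trajectory, window_size=3)
```

With this normalization, the same movement made at the top or at the bottom of a page yields the same candidate.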
20
Distance Measure
• Which time series are similar?
• Popular Choices:
– Euclidean Distance (ED)
– Dynamic Time Warping (DTW)
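To make the difference between the two choices concrete, here is a small Python sketch of both distances on 2-D cursor subsequences. It is a textbook formulation under assumptions: the function names are hypothetical, and using the Euclidean distance between individual points as the local DTW cost is a choice made for the example.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance; both sequences must have equal length."""
    if len(a) != len(b):
        raise ValueError("ED needs sequences of equal length")
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(a, b)))


def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping with no window constraint."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0],
                           a[i - 1][1] - b[j - 1][1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# The same left-to-right movement, paused at different moments.
slow_start = [(0, 0), (1, 0), (1, 0), (2, 0)]
slow_end = [(0, 0), (1, 0), (2, 0), (2, 0)]
ed = euclidean_distance(slow_start, slow_end)
dtw = dtw_distance(slow_start, slow_end)
```

In this toy case ED reports a difference while DTW aligns the pauses and reports none, which is the behavior wanted when users move the mouse at different speeds.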
21
Frequent Motif Mining
• Similarity Search
– How many subsequences in the dataset are
similar to the given candidate subsequence?
[Figure: matrix with motif candidates on both axes]
dist(i, j) – how similar the i-th motif candidate
is to the j-th motif candidate.
Algorithm Parameters:
max_dist – distance when two subsequences are
considered “similar”
min_count – minimal frequency of motif candidate
22
Brute force search is computationally expensive
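A brute-force version of the similarity search might look like the following Python sketch. It assumes Euclidean distance as a stand-in for the distance measure and assumes that a candidate does not count as its own neighbor; the slides name only the parameters `max_dist` and `min_count`, so everything else here is hypothetical.

```python
import math


def euclidean(a, b):
    """Distance between two equal-length sequences of (x, y) points."""
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(a, b)))


def frequent_candidates(candidates, max_dist, min_count, dist=euclidean):
    """Return (index, neighbor_count) for every frequent candidate.

    Every pair is compared, so the cost is quadratic in the number
    of candidates, which is why the search needs optimizations.
    """
    frequent = []
    for i, cand in enumerate(candidates):
        count = sum(1 for j, other in enumerate(candidates)
                    if j != i and dist(cand, other) <= max_dist)
        if count >= min_count:
            frequent.append((i, count))
    return frequent


toy_candidates = [
    [(0, 0), (1, 0.0), (2, 0.0)],   # horizontal stroke
    [(0, 0), (1, 0.1), (2, 0.0)],   # almost the same stroke
    [(0, 0), (1, 0.0), (2, 0.1)],   # almost the same stroke
    [(0, 0), (0, 5.0), (0, 10.0)],  # vertical stroke, seen only once
]
frequent = frequent_candidates(toy_candidates, max_dist=0.5, min_count=2)
```

Here the three near-horizontal strokes support each other and pass the `min_count` threshold, while the single vertical stroke does not.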
De-Duplication
(only keep cluster centroids)
• Similarity search can generate a lot of
frequent candidates that are similar to
each other
(due to redundancy in motif candidate generation)
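One simple way to de-duplicate is a greedy pass that keeps a frequent candidate only if it is not within `max_dist` of one already kept. This is a swapped-in technique for illustration: the slide speaks of keeping cluster centroids and does not say which clustering procedure was used, and the names below are hypothetical.

```python
import math


def euclidean(a, b):
    """Distance between two equal-length sequences of (x, y) points."""
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(a, b)))


def deduplicate(candidates, counts, max_dist, dist=euclidean):
    """Greedy de-duplication: visit candidates from most to least
    frequent and keep one only if it is farther than max_dist from
    every candidate kept so far. Returns the kept indices."""
    order = sorted(range(len(candidates)), key=lambda i: -counts[i])
    kept = []
    for i in order:
        if all(dist(candidates[i], candidates[k]) > max_dist for k in kept):
            kept.append(i)
    return kept


frequent = [
    [(0, 0), (1, 0.0), (2, 0)],  # horizontal stroke
    [(0, 0), (1, 0.1), (2, 0)],  # near-duplicate of the first
    [(0, 0), (0, 1.0), (0, 2)],  # vertical stroke
    [(0, 0), (0.1, 1), (0, 2)],  # near-duplicate of the third
]
counts = [5, 4, 3, 3]  # made-up neighbor counts from the search step
kept = deduplicate(frequent, counts, max_dist=0.5)
```

The result keeps one representative per group of near-identical candidates.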
23
Motif Discovery Pipeline
Generate Motif
Candidates
Discover
Frequent
Candidates
De-duplicate /
Output Motifs
Distance Metric
24
Optimizations in Similarity Search
• Early stopping
– in DTW computation (takes O(n^2) time)
– in lower bound computation (takes O(n) time)
[Keogh et al.]
• Parallel Computation
– No dependency in distance computation →
use multiple cores
• Distance Metric Learning
• Spatial Indexing
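The early-stopping idea can be illustrated with a lower bound in the style of LB_Keogh: a cheap O(n) bound is computed first, and the O(n^2) DTW is run only if the bound cannot rule the candidate out. The sketch below makes several assumptions that are not in the slides: 1-D series of equal length, a warping band of half-width `r`, squared point costs, and all function names.

```python
import math


def dtw_band(q, c, r):
    """DTW between equal-length 1-D series, warping limited to |i-j| <= r."""
    n = len(q)
    prev = [math.inf] * (n + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (n + 1)
        for j in range(max(1, i - r), min(n, i + r) + 1):
            d = (q[i - 1] - c[j - 1]) ** 2
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return math.sqrt(prev[n])


def lb_keogh(q, c, r, best_so_far=math.inf):
    """Lower bound on dtw_band(q, c, r) using the min/max envelope of q.

    Returns math.inf as soon as the partial sum already exceeds
    best_so_far (early stopping inside the bound itself)."""
    total, limit = 0.0, best_so_far ** 2
    for i, ci in enumerate(c):
        window = q[max(0, i - r):i + r + 1]
        upper, lower = max(window), min(window)
        if ci > upper:
            total += (ci - upper) ** 2
        elif ci < lower:
            total += (ci - lower) ** 2
        if total > limit:
            return math.inf
    return math.sqrt(total)


def nearest(query, series_list, r):
    """Nearest neighbor under banded DTW, skipping pruned candidates."""
    best, best_idx, dtw_calls = math.inf, -1, 0
    for idx, s in enumerate(series_list):
        if lb_keogh(query, s, r, best) >= best:
            continue  # cannot beat the best match found so far
        dtw_calls += 1
        d = dtw_band(query, s, r)
        if d < best:
            best, best_idx = d, idx
    return best_idx, best, dtw_calls


query = [0, 1, 2, 3, 2, 1, 0]
series_list = [[0, 0, 1, 2, 3, 2, 1],
               [10, 10, 10, 10, 10, 10, 10],
               [0, 1, 2, 3, 2, 1, 0.5]]
best_idx, best_dist, dtw_calls = nearest(query, series_list, r=1)
```

In this toy run the constant series is discarded by the bound alone, so DTW is computed for only two of the three candidates. Because each comparison is independent, the loop in `nearest` could also be split across cores.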
25
Distance Measure Learning
• Goal: fast pruning of unpromising
candidates in similarity search
Features (x_max, y_max, …, feature_k)
Features (x_max, y_max, …, feature_k)
26
Tune the weights with a
gradient-based method
(e.g., SGD)
Spatial Indexing
• Goal: fast pruning of unpromising
candidates in similarity search
• Indexes motif candidates
in weighted feature space
• Improves asymptotic time for similarity search
27
Timing Experiments
28
Example of Discovered Motif
discovered motif
29
eye gaze
mouse cursor
matching
subsequence
Motifs Discovery: Examples
On Search Engine Result Pages (SERPs)
On “Landing” Pages (non-SERPs)
30
Discovered motifs have many uses
• Summarize typical mouse cursor usages
– E.g. create dictionary of typical cursor usages
• Compact (task-free) representation
– Characterize entire cursor trajectory based on which
motifs appear in it
• For classification/regression:
– Compute whether particular motifs appear in a given
mouse cursor trajectory
31
Using motifs as features for
Classification/Regression
• We can measure how similar a mouse
movement trajectory is to each of the
discovered motifs
window size
sliding window
32
motif
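Turning motifs into features can be sketched as follows: slide a window over the trajectory and record, for each motif, the smallest distance to any window. This is an assumed reading of the slide, with hypothetical names, Euclidean distance standing in for the distance measure, and position normalized by subtracting the first point of each window.

```python
import math


def euclidean(a, b):
    """Distance between two equal-length sequences of (x, y) points."""
    return math.sqrt(sum((ax - bx) ** 2 + (ay - by) ** 2
                         for (ax, ay), (bx, by) in zip(a, b)))


def motif_features(trajectory, motifs, dist=euclidean):
    """One feature per motif: the minimum distance between the motif
    and any position-normalized window of the trajectory."""
    features = []
    for motif in motifs:
        width = len(motif)
        best = math.inf
        for start in range(len(trajectory) - width + 1):
            window = trajectory[start:start + width]
            x0, y0 = window[0]
            shifted = [(x - x0, y - y0) for x, y in window]
            best = min(best, dist(shifted, motif))
        features.append(best)
    return features


motifs = [
    [(0, 0), (1, 0), (2, 0)],  # horizontal stroke
    [(0, 0), (0, 1), (0, 2)],  # vertical stroke
]
trajectory = [(5, 5), (6, 5), (7, 5), (8, 5)]  # purely horizontal movement
features = motif_features(trajectory, motifs)
```

Thresholding each feature (for example at `max_dist`) would give a binary "motif present or not" vector that can be fed to a classifier or regressor.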
Motifs for Relevance Prediction
• Baselines
– Cursor Hover (on the search result page)
[Huang et al., CHI 2011]
– Post Click Behavior Model
[Guo & Agichtein, WWW, 2012]
• Dwell time
• Statistics of cursor movements: max, min, range, etc.
• Statistics of scrolling activity: max, min, range, etc.
33
Reading Scanning
Dataset
• User study (21 users)
– mostly informational search tasks
– 566 search queries
– 1340 page views
– 854 relevance judgments
34
Motifs are Better
than Previous Models (PCB, Hover)
35
Feature Group                   Pearson Correlation
Cursor Hover                    0.120
Post Click Behavior             0.392
Motifs                          0.394 (+0.5%)
Post Click Behavior + Motifs    0.468 (+19.4%)
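The percentages in the table are consistent with relative gains over the Post Click Behavior correlation of 0.392. The sketch below shows how a Pearson correlation and such a relative gain can be computed; the function names are hypothetical, and treating 0.392 as the reference point is an inference from the arithmetic rather than something the slide states.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def relative_gain(value, baseline):
    """Improvement over a baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


pcb = 0.392  # Post Click Behavior, from the table
gain_motifs = relative_gain(0.394, pcb)    # Motifs alone
gain_combined = relative_gain(0.468, pcb)  # Post Click Behavior + Motifs
```

Rounded to one decimal place, the two gains come out at 0.5 and 19.4 percent, matching the table.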
Motifs are Helpful for
Web Search Result Ranking
36
Conclusions
• It is possible to automatically discover
meaningful motifs from mouse cursor data
• Motifs are helpful for relevance prediction &
ranking
• Cursor motifs provide a compact (task-free)
representation for the entire cursor trajectory
37
Applications of Gaze/Mouse Cursor
Tracking in Medical Domain
38
Background: Mild Cognitive Impairment
(MCI) and Alzheimer’s Disease
• Alzheimer’s disease (AD) affects more than 5M
Americans, expected to grow in the coming decade
• Memory impairment (aMCI)
indicates onset of AD (affects
hippocampus first)
• Visual Paired Comparison
(VPC) task: promising for
early diagnosis of both MCI
and AD before it is detectable
by other means
39
VPC Task: Eye Tracking Equipment
40
Impaired Subjects Spent 50% of Looking Time on Novel
Image after Long Delay
41
VPC Task: Eye Tracking
42
Exploiting Eye Gaze Movement Data
Novelty Preference
fixation duration
distribution
+
43
Shapelets are Helpful for
Prediction of Cognitive Decline
• Shapelets – “class specific” motifs
44
Shapelets are Helpful for
Prediction of Cognitive Decline
• Shapelets – “class specific” motifs
Baseline AUC = 0.892 ± 0.003
Shapelets AUC = 0.916 ± 0.006
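For readers unfamiliar with the metric, AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. A minimal Python sketch of that pairwise definition follows; the scores and labels are invented for illustration and have nothing to do with the study data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half.
    """
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented classifier scores; label 1 marks the positive class.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
value = auc(scores, labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.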
45
User Attention on Web Pages
46
Cross-Domain User Study
• Research Question
– Does web page content affect user attention?
• Domains
– Search (Google), Wikipedia, Shopping (Amazon), Social
(Twitter), News (CNN)
• 20 users (4 + 20 tasks per user)
• 400 tasks, 1700 page views
• 500K gaze/cursor measurements (sampled every 50 ms)
47
?
search domain X
Web Search Pages
48
News Search Pages
49
Shopping Search Pages
50
Twitter Search Pages
51
Conclusions
• It is possible to automatically discover
meaningful motifs from mouse cursor data
• Motifs are helpful for relevance prediction,
ranking and prediction of cognitive
impairment
• Attention patterns vary significantly across
search interfaces
52
Thank You!
• This work was supported by
53
Emory IR Lab: Research Areas
Modeling collaborative content
creation for information
organization, indexing, and search
54
• Mining search behavior data to
improve information finding.
Medical applications of
Search, NLP, behavior modeling.
UFindIt: Remote Search Behavior Studies
55
Misha Ageev (MGU & Yandex), Dmitry Lagun (Emory), Denis Savenkov (Emory)
SIGIR 2011 (best paper award), SIGIR 2013, EMNLP 2013
Search behavior models for Touch Screens
Ongoing project, looking for students
56
Guo et al., SIGIR 2013
Dynamics in User Generated Content
Wikipedia
Major events (e.g., natural disasters, sports) affect the content change in Wikipedia articles.
Use content change for ranking:
• Words used in early revisions of the documents are more essential and important to
the documents.
• Words used during a major event may reflect relevance change between words and
documents
Twitter
Topic transitions in Tweet streams:
• What you’ve tweeted before may affect what you will tweet in the near future.
Sentiment change in Twitter during major events:
• People respond differently to the same event since they could hold different prior
opinions. (e.g., conservatives vs. liberals)
Yu Wang (Ph.D. expected 2014)
[CIKM 2010, KDD 2012, CIKM 2013]
Community Question Answering (CQA)
1. What are the factors influencing answer
contributions in CQA Systems?
– Analyzing answerer behavior [ECIR 2011]
2. What kind of searches benefit most from CQA
services and archives?
– Understanding how searchers become askers [SIGIR 2011]
3. How to improve search quality with CQA data?
– Predicting searcher satisfaction with CQA data [SIGIR 2012]
Qiaoling Liu,
Ph.D. expected: 2014
• Emory IR Lab is looking for a few good Ph.D. students to start
September 2015
• Information retrieval and web search: search behavior, ranking, user
interfaces, content analysis, Question Answering
• Social media and social network mining applications:
political science, public health, advertising
• Psychology, Neuroscience, Medicine applications:
computational attention, memory, cognition, language
Contact: Eugene Agichtein
Associate Professor
eugene@mathcs.emory.edu
www.mathcs.emory.edu/~eugene/
59
Computer Science Ph.D. Program information and application process:
http://www.mathcs.emory.edu/programs-grad/
60
Atlanta, GA



Editor's Notes

  • #4: Attention, Interest, Desire and Action
  • #5: Just MA, Carpenter PA (1980) A theory of reading: from eye fixation to comprehension. Psychol Rev 87:329–354
  • #13: Mouse movement shows how users read the web pages. Why is dwell time not enough?
  • #17: Motif is defined as … Show example of several similar motifs in a plot
  • #21: We represent mouse movement trajectory with two 2-dimensional time series
  • #22: Replace the explanation for DTW
  • #24: Add definition of motif
  • #26: Fix. Show Keogh year
  • #27: May be shorten to one slide (very distracting)
  • #31: We can study differences in cursor usages on various web pages
  • #33: Similar to binary version of bag of words
  • #36: Emphasize similar performance for automatic vs. manual
  • #57: Even when the interactions are available, the device form factor requires re-interpretation. Users are increasingly expecting touch interactions