#SMX #32A @dawnieando
SOME CURRENT CHALLENGES WITH VOICE & CONVERSATIONAL SEARCH
…And how you can overcome some of them

Who is Dawn Anderson?
• From rainy Manchester, UK
• A bit of a ‘pracademic’ (hybrid of academic and practitioner)
• International SEO consultant
• Move It Marketing
• I lecture on search and digital marketing strategy
• But I mostly ‘do’ SEO
• 11 years in SEO now
• Googlebot hunter ;P
• Consulting with brands, in-house teams and start-ups
• My Pomeranian Bert is often featured in tweets and posts ;P

Interest over time on Alexa and Google Home

Seasonal social media demonstrates mass engagement

Eyes-free device sales are sky-rocketing

Search Engines are Getting Better At Voice Recognition & Question Answering

2017 was the year of “questions”

Google raters’ guidelines for voice search published

What does a good result look like?
SPOILER
• Meets informational needs
• In short answers (as applicable)
• Or the answer is at the beginning of the paragraph or result
• Grammatically correct (syntactically well-formed)
• No spelling mistakes
• With accurate pronunciation

What does a bad result look like?

Bad Result - Confusion between ‘actions’ & ‘queries’
• [Skip]
• Query: [play mumford and sons reminder]
• Action Response: Set a Reminder. Time: Please specify a time
• Rating: Fails to Meet
• Explanation: The user wanted to play a specific song, and the device instead set a reminder. No users would be satisfied with this response.

Who knows how many times Google Home cannot help?
• Only Google knows
• But they aren’t sharing
• Search engine embarrassment?

RECOGNITION IS NOT NATURAL LANGUAGE UNDERSTANDING

Information Retrieval Lectures
ESSIR2017: European Summer School on Information Retrieval

Enrique Alfonseca – Google Research Team

Better ranking needed because the user tends to focus on a single answer

Better Ranking Is Needed As The User Focuses On A Single Result
• One shot at the answer
• Berrypicking ‘evolving search’ may not apply so easily
• Does not benefit from query refinement and user feedback as desktop SERPs do
  – May be why there are still many unanswered queries

Accurate spelling and grammar matter a lot in voice search

Query diversity ‘clusters’ in keyboard ‘evolving’ user search

Query refinement (via user feedback) is not possible with voice search

#SMXInsights
• No query expansion or relaxation
  – Precision more important than recall
  – Because there can be only one (or 2)

Precision > Recall in voice search
Accuracy > Diversity

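To make the precision versus recall point concrete, here is a minimal Python sketch. The document IDs and relevance judgments are invented for illustration; it only contrasts a ten-result page with the single result a voice device reads out.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a list of returned document IDs."""
    returned_set = set(returned)
    hits = len(returned_set & relevant)
    precision = hits / len(returned_set) if returned_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgments for one query.
relevant_docs = {"d1", "d4", "d7"}

# A desktop SERP can afford some misses: the user scans the page and picks.
desktop_results = ["d2", "d1", "d4", "d9", "d7", "d3", "d5", "d6", "d8", "d10"]
# A voice device reads out one result, so only the top position counts.
voice_result = desktop_results[:1]

desktop_p, desktop_r = precision_recall(desktop_results, relevant_docs)
voice_p, voice_r = precision_recall(voice_result, relevant_docs)
print(f"desktop: precision={desktop_p:.2f} recall={desktop_r:.2f}")
print(f"voice:   precision={voice_p:.2f} recall={voice_r:.2f}")
```

In this made-up ranking the page as a whole contains every relevant document, yet the one spoken result is a miss, which is why the top position carries so much weight in voice.
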
A rambling response that only gets to the answer at the end is the worst possible result

“There is no re-ordering in voice search – no paraphrasing – just extraction and compression.” (Alfonseca, 2017, ESSIR2017)

Example of a query interpretation system from classic IR teaching

#SMXInsights
• No paraphrasing with conversational search
  – Paraphrasing likely needs full understanding of query & intent to reformulate

Knowledge base first, web text second
• The knowledge base is checked first
• Then the web is checked to ‘fill in gaps’
• Taking from the messy unstructured data of web pages

The different types of data & the problem with unstructured data
• Structured data (tables and data stored in databases)
• Semi-structured data (XML, JSON, meta headings [h1-h6])
• Semantically-enriched data (marked up schema, entities)
• Unstructured data (normal web text copy)
• The web is messy and noisy
• Unstructured data is difficult to make sense of (no topical strength)

Structured data has never been more important for disambiguation

Structured Data is very, very useful here
• Adds meaning
• Disambiguates
• Adds structure
• Helps with context
• The web is noisy
• Unstructured data is voluminous

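As a rough illustration of markup that ‘adds meaning’, the Python sketch below builds a small schema.org JSON-LD description for an ambiguous name. The entity, description and `sameAs` link are illustrative assumptions, not taken from the talk, and nothing here claims how any search engine uses the markup.

```python
import json

# Hypothetical entity description; all values are illustrative.
entity = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Jaguar",
    "description": "British manufacturer of luxury cars.",
    # sameAs points at a page that identifies which 'Jaguar' is meant.
    "sameAs": ["https://en.wikipedia.org/wiki/Jaguar_Cars"],
}

markup = json.dumps(entity, indent=2)
print('<script type="application/ld+json">')
print(markup)
print("</script>")
```

The type and the `sameAs` link say that this ‘Jaguar’ is the car maker rather than the cat, which plain web copy leaves the reader to infer.
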
#SMXInsights
• Simply adding topical H1 – H6 headings turns unstructured web data into semi-structured data

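One way to see the ‘semi-structured’ point is to pull the headings out of a page and look at the outline they form. This is a minimal sketch using Python's standard `html.parser`; the sample page is invented, and a real crawler would do far more.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current_level = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current_level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._current_level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._current_level is not None:
            # Collapse whitespace inside the heading text.
            text = " ".join("".join(self._buffer).split())
            self.outline.append((self._current_level, text))
            self._current_level = None


def extract_outline(html):
    parser = HeadingOutline()
    parser.feed(html)
    return parser.outline


# Hypothetical page used only for this example.
sample_page = """
<h1>Voice search</h1>
<p>Intro copy...</p>
<h2>What is voice search?</h2>
<p>Voice search is ...</p>
<h2>How do assistants pick an answer?</h2>
"""

for level, text in extract_outline(sample_page):
    print("  " * (level - 1) + text)
```

Without the headings the same copy is one undifferentiated block of text; with them, a machine gets a topic and two question-shaped subtopics for free.
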
#SMXInsights
• Tables are problematic for voice search
  – Support tabular data with well-formed paragraphs and sentences

Tables are currently problematic
• What may be good for featured snippets (tabular data) may not be good for voice search
• You may need an additional strategy for voice search & tabular data in featured snippets
• Pete Myers from Moz found only 30% of voice search results on Google Home came from tables in featured snippets (Image credit: Pete Myers, Moz)

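A sketch of ‘supporting tabular data with sentences’: the Python below turns each table row into one plain sentence that could sit alongside the table. The pricing data, column names and sentence template are all hypothetical.

```python
def rows_to_sentences(subject_label, rows):
    """Turn each table row (a dict) into one plain sentence."""
    sentences = []
    for row in rows:
        subject = row[subject_label]
        facts = [
            f"{column.lower()} of {value}"
            for column, value in row.items()
            if column != subject_label
        ]
        if not facts:
            continue  # nothing to say about this row
        if len(facts) > 1:
            fact_text = ", ".join(facts[:-1]) + " and " + facts[-1]
        else:
            fact_text = facts[0]
        sentences.append(f"The {subject} has a {fact_text}.")
    return sentences


# Hypothetical pricing table.
plans = [
    {"Plan": "Basic plan", "Price": "$5 per month", "Storage": "10 GB"},
    {"Plan": "Pro plan", "Price": "$12 per month", "Storage": "100 GB"},
]

for sentence in rows_to_sentences("Plan", plans):
    print(sentence)
```

Generated sentences like these are only a starting point; they would still need a human edit to read naturally when spoken.
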
Multi-turn conversations are still challenging
CONFIRMED BY:
• Google’s Enrique Alfonseca (2017)
• Microsoft’s Harry Shum (2018)
• Conversational contextual search is difficult

Anaphoric Resolution
• “Anaphoric” means referring upward (back) to previously mentioned words
• Resolution means trying to understand what was referred to in those previously mentioned words

Cataphoric Resolution
• “Cataphoric” means referring downward (forward) to subsequent words
• Resolution means trying to understand what is referred to in those subsequent words

Pronouns still seem problematic
Likely relates to anaphoric (likely) & cataphoric (far less likely) resolution

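To show why pronouns are hard, here is a deliberately naive Python sketch of anaphoric resolution across two turns: it swaps any third-person pronoun for the entity from the previous answer. The pronoun list and the example turns are assumptions for illustration; this is not how any assistant is known to work.

```python
import re

# Small, incomplete pronoun list used only for this toy example.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}


def resolve_pronouns(previous_entity, question):
    """Replace third-person pronouns with the entity from the previous turn."""

    def swap(match):
        word = match.group(0)
        return previous_entity if word.lower() in PRONOUNS else word

    return re.sub(r"[A-Za-z]+", swap, question)


# Turn 1: "Who wrote Hamlet?" -> answer entity "William Shakespeare"
# Turn 2 refers back to that entity with a pronoun.
follow_up = resolve_pronouns("William Shakespeare", "When was he born?")
print(follow_up)
```

The heuristic breaks as soon as two entities are in play, or when ‘her’ is possessive (‘her book’), which hints at why coreference remains a research problem.
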
Our ‘Previous’ Work

Pygmalion are carrying out Part of Speech (POS) & Named Entity Tagging (NE tags) manually
AKA – Word category disambiguation
• Function words – POS (Syntax)
• Content words – POS (relevant)
• Verbs – POS
• Nouns – POS
• Pronouns – POS
• Plural-pronouns – POS

WORD DISAMBIGUATION

Ambiguous queries need context – ‘House’

Linguistics are complex
Homophora Endophora Exophora
Hyponyms Hypernyms Homonyms

COREFERENCE RESOLUTION IS A CHALLENGING PROBLEM FOR DISAMBIGUATION

THE IMPORTANCE OF CO-OCCURRENCE

“You shall know a word by the company it keeps” (Firth)

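Firth's idea can be sketched in a few lines of Python: count which words turn up near a target word. The three-sentence corpus and the window size are made up for illustration; real systems work on vastly larger text and weight the counts.

```python
from collections import Counter


def cooccurrence_counts(sentences, target, window=2):
    """Count words appearing within `window` tokens of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


# Hypothetical mini corpus with an ambiguous word.
corpus = [
    "the jaguar is a big cat",
    "the jaguar car has a fast engine",
    "a jaguar is a wild cat",
]

neighbours = cooccurrence_counts(corpus, "jaguar", window=4)
print(neighbours.most_common())
```

Here ‘cat’ keeps company with ‘jaguar’ more often than ‘car’ does, so in this tiny corpus the animal sense would look more likely.
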
Other ‘Previous’ Work – Similarity & Relatedness

WordSimilarity353 Test Collection
money cash 9.08
money currency 9.04
football soccer 9.03
magician wizard 9.02
gem jewel 8.96
car automobile 8.94
boy lad 8.83
furnace stove 8.79
Maradona football 8.62
king queen 8.58
money bank 8.5
Jerusalem Israel 8.46
vodka gin 8.46
planet star 8.45
money dollar 8.42
vodka brandy 8.13
bank money 8.12
physics proton 8.12
planet galaxy 8.11
stock market 8.08
psychology psychiatry 8.08
planet moon 8.08
planet constellation 8.06
planet sun 8.02
tiger feline 8
planet astronomer 7.94
movie theater 7.92
planet space 7.92
baby mother 7.85
wood forest 7.73
money deposit 7.73
psychology mind 7.69
Jerusalem Palestinian 7.65
Arafat terror 7.65
computer keyboard 7.62
computer internet 7.58
money property 7.57
tennis racket 7.56
psychology cognition 7.48
book paper 7.46
book library 7.46
media radio 7.42
psychology depression 7.42
jaguar cat 7.42
movie star 7.38
bird crane 7.38
tiger cat 7.35
physics chemistry 7.35
money possession 7.29
jaguar car 7.27
cup drink 7.25
psychology health 7.23
bird cock 7.1
company stock 7.08
tiger carnivore 7.08
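Collections like this are typically used to check how well a model's similarity scores agree with human judgments, often via rank correlation. The Python sketch below computes Spearman's rho for five pairs from the list above; the human scores are from the slide, while the ‘model’ scores are invented for illustration.

```python
def ranks(values):
    """Rank values from highest (1) to lowest; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for tie-free data."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


pairs = ["money/cash", "football/soccer", "king/queen", "tiger/cat", "cup/drink"]
human_scores = [9.08, 9.03, 8.58, 7.35, 7.25]  # from the list above
model_scores = [0.85, 0.90, 0.70, 0.60, 0.40]  # hypothetical model output

rho = spearman(human_scores, model_scores)
print(f"Spearman rho = {rho:.2f}")
```

The hypothetical model swaps the order of the top two pairs but otherwise agrees with the human ranking, so the correlation is high without being perfect.
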
#SMXInsights
• Secondary or 3-way strategy may be needed
  – Add a TL;DR
  – Or an executive summary
  – Or Q & A based table of contents
  – Or a ‘Short Answer’ then ‘Longer Answer’

#SMXInsights
• Mine forums, customer service, chat & emails
  – Build word clouds to find the topics which matter to your audience, then provide answers to them

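A word cloud is just term frequencies drawn large, so the mining step can be sketched as a simple count. The support messages and the stop-word list below are hypothetical; swap in your own exports from forums, chat or email.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real one would be longer.
STOP_WORDS = {
    "a", "an", "the", "is", "it", "my", "do", "i", "to", "how", "can",
    "what", "does", "for", "on", "in", "of", "and", "with",
}


def top_terms(messages, limit=5):
    """Return the most frequent non-stop-word terms across all messages."""
    counts = Counter()
    for message in messages:
        for word in re.findall(r"[a-z']+", message.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(limit)


# Hypothetical customer messages.
support_messages = [
    "How do I reset my password?",
    "Password reset email not arriving",
    "Can I change my delivery address?",
    "What is the delivery time for my order?",
]

for term, count in top_terms(support_messages, limit=3):
    print(term, count)
```

The terms that surface most often (password resets and delivery, in this invented sample) are candidates for short, direct answers on the site.
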
Accurate spelling and grammar matter a lot in voice search
Soundex, Metaphone or similar ‘misspelling’ algorithms may not apply to voice search

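For reference, this is roughly what a phonetic ‘misspelling’ algorithm does with typed input. The Python below is a sketch of classic American Soundex, which maps similar-sounding spellings to the same four-character code; the example names are illustrative.

```python
# Consonant groups used by American Soundex.
SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def soundex(name):
    """Return the four-character American Soundex code for `name`."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != previous:
            code += digit
        if ch not in "hw":
            # Vowels reset the previous code; h and w do not.
            previous = digit
    return (code + "000")[:4]


for typed in ["Smith", "Smyth", "Robert", "Rupert"]:
    print(typed, soundex(typed))
```

This helps when a user types ‘Smyth’ for ‘Smith’, but a spoken query never had a typed spelling to begin with, which may be why the slide doubts these algorithms carry over to voice.
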
THANK YOU!

Sources & References
• WordSimilarity353 Test Collection - http://www.cs.technion.ac.il/~gabr/resources/data/wordsim353/
• Miller, G.A. and Charles, W.G., 1991. Contextual correlates of semantic similarity. Language and cognitive processes, 6(1), pp.1-28.
• LinkedIn. Harry Shum. 2018. From Search to Research. [ONLINE] Available at: https://www.linkedin.com/pulse/from-search-research-harry-shum/. [Accessed 22 February 2018].
• Coreference Resolution - The Stanford Natural Language Processing Group. 2018. The Stanford Natural Language Processing Group. [ONLINE] Available at: https://nlp.stanford.edu/projects/coref.shtml. [Accessed 19 February 2018].

APPENDIX

Terms can have many ‘surface forms’
EXAMPLES
• Look at Wikipedia Redirects
• Alternative names redirect to the most appropriate article title (for example, Edson Arantes do Nascimento redirects to Pelé) (Wikipedia)
• SPARQL queries against DBpedia identify many variations
• (Beethoven example)
• https://dbpedia.org/sparql
• https://en.wikipedia.org/wiki/Wikipedia:Redirect

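The Beethoven example might look something like the Python sketch below, which builds (but does not send) a SPARQL request for pages that redirect to a DBpedia resource. The `dbo:wikiPageRedirects` property, the resource name and the `format` parameter are assumptions about the public DBpedia endpoint, so check them against the live service before relying on this.

```python
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def redirect_query(resource_name, limit=20):
    """Build a SPARQL query for pages that redirect to a DBpedia resource."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX dbr: <http://dbpedia.org/resource/>\n"
        "SELECT ?alias WHERE {\n"
        # Assumed property linking a redirect page to its target.
        f"  ?alias dbo:wikiPageRedirects dbr:{resource_name} .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def query_url(resource_name):
    """Return a GET URL for the query; nothing is requested here."""
    params = {
        "query": redirect_query(resource_name),
        "format": "application/sparql-results+json",  # assumed output format
    }
    return f"{DBPEDIA_ENDPOINT}?{urlencode(params)}"


print(redirect_query("Ludwig_van_Beethoven"))
print(query_url("Ludwig_van_Beethoven"))
```

Each `?alias` returned would be one surface form (a misspelling, abbreviation or alternative name) that points at the same entity.
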
“It is concluded… the more often two words can be substituted into the same contexts the more similar in meaning they are judged to be.” (Miller & Charles, 1991)

Difficult to deal with ‘query ambiguity’
Result ‘diversity’ assists with query ambiguity in desktop or non-voice results

Page Length ‘Normalization’ may not apply as with traditional results? (Me musing)

Long numbers should be rounded
• 60,999,888.999999999
  – It reads terribly
  – Needs to be rounded

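A minimal sketch of rounding for speech: the Python below keeps two significant figures and names the magnitude. The two-figure cut-off and the ‘about’ wording are arbitrary choices for illustration.

```python
def speakable_number(value):
    """Round to two significant figures and name the magnitude."""
    # Round first so that values such as 999,999 roll over to "1 million".
    rounded = float(f"{value:.2g}")
    units = [
        (1_000_000_000, "billion"),
        (1_000_000, "million"),
        (1_000, "thousand"),
    ]
    for size, name in units:
        if abs(rounded) >= size:
            return f"about {rounded / size:g} {name}"
    return f"about {rounded:g}"


print(speakable_number(60_999_888.999999999))
print(speakable_number(1234))
```

Whether rounding is acceptable depends on the content: a population figure can be approximate, but a price or a dosage usually cannot.
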
Conversational Context & Microsoft
• First checks whether the next ‘turn’ of question relates to the previous question
• Using LSTMs (Long Short-Term Memory networks)
• Bi-directional context embedding
• Query and its context are both used as input

Katja Filippova – Google Research Team

Query expansion and query relaxation
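For readers new to the two terms, here is a toy Python sketch of both techniques. The synonym table, the query and the choice of which term is optional are all hypothetical; real engines derive these from data.

```python
# Hypothetical synonym table.
SYNONYMS = {
    "cheap": ["affordable", "budget"],
    "hotel": ["accommodation"],
}


def expand_query(terms, synonyms=SYNONYMS):
    """Expansion: add related terms so that more documents can match."""
    expanded = list(terms)
    for term in terms:
        for synonym in synonyms.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


def relax_query(terms, optional_terms):
    """Relaxation: drop less important terms so the match is less strict."""
    return [term for term in terms if term not in optional_terms]


query = ["cheap", "hotel", "manchester", "parking"]
print(expand_query(query))
print(relax_query(query, optional_terms={"parking"}))
```

Both moves trade precision for recall, which suits a page of ten results better than a device that reads out a single answer.
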
Example of cataphoric and anaphoric resolution
https://www.ntid.rit.edu/sea/processes/referencewords/practice/phoric
