Dr. Carson Kai-Sang Leung

        Inderjeet Singh
(Database and Data Mining Lab)
   Introduction

   Problem

   Solution Methodology

   Evaluation




                           Comp 7220   1/11/2012   2
   Mining user behaviour/preferences
   Predict document relevance
   Re-rank the search results
   Compare different ranking functions (train/test)
   Optimize ad performance
   Query suggestions

   How big are these logs?
    ◦ 10+ terabytes of entries each day
    ◦ Composed of billions of distinct (query, URL) pairs

• Documents/results presented in order of the relevance to the query
• Many ranking factors considered when ranking these results
• Ranking factors depend on query, document and query-document pair
• Improving ranking based on user preferences (likes/dislikes)
• Personalized search + Social search
• Recency (temporal) ranking



[David Green; blog]



[Figure: # of clicks received; source: CIKM'09 Tutorial]
Trust factor: users prefer certain URLs over others,
e.g., wikipedia.com, stackoverflow.com, Yahoo Answers,
about.com

What is missing (in previous models)?
 Modelling trust factor

 Clicks on sponsored results

 Related queries/searches (sidebars)

 Realistic and flexible assumptions on user behaviour




1. Informational query – “DDR3 memory”, “SATA 3 hard drives”,
   “American history”
2. Navigational query – “gmail”, “digg”, “CIBC”, “CIBC credit cards”




[Flowchart of the user session, drawn for two consecutive results.
Each result passes through three Yes/No decisions in order:
Snippet Examine? → Snippet Attractive? → Enough Utility?
A Yes at Enough Utility? leads to End.]
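
The decision flow above can be sketched as a small simulation. This is only a minimal sketch under stated assumptions: the slide does not spell out where the "No" branches lead, so the code assumes that an unexamined snippet ends the session, an unattractive snippet sends the user to the next result, and a clicked result without enough utility does the same. The function name `simulate_session` and all probabilities are hypothetical, illustrative values, not figures from this deck.

```python
import random

def simulate_session(results, rng):
    """Return the 1-based ranks clicked in one simulated session.

    Each result is a dict with three hypothetical probabilities:
    'examine', 'attractive' and 'utility'.
    """
    clicks = []
    for rank, r in enumerate(results, start=1):
        # Snippet examined? Assumption: if not, the user abandons the list.
        if rng.random() >= r["examine"]:
            break
        # Snippet attractive? Assumption: if not, move on to the next result.
        if rng.random() >= r["attractive"]:
            continue
        clicks.append(rank)  # an attractive snippet is clicked
        # Enough utility? If yes, the session ends (the "End" node).
        if rng.random() < r["utility"]:
            break
    return clicks

# Illustrative values only.
results = [
    {"examine": 1.0, "attractive": 0.6, "utility": 0.5},
    {"examine": 0.8, "attractive": 0.4, "utility": 0.7},
    {"examine": 0.6, "attractive": 0.3, "utility": 0.9},
]
rng = random.Random(0)
sessions = [simulate_session(results, rng) for _ in range(1000)]
# Number of sessions with a click at each rank.
clicks_per_rank = [sum(rank in s for s in sessions) for rank in (1, 2, 3)]
print(clicks_per_rank)
```

Fitting such probabilities to real logs, rather than fixing them by hand, is what a click model would do; the simulation only shows how the three decisions interact.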

Realistic and flexible assumptions on user
behaviour (session modelling)

    Consider trust bias (trust factor)

        Order results for a particular query by the
        relevance scores predicted by the model

            Compare this order to the editorial
            ranking

                 Is it a good model? Yes, if the orderings agree to a
                 considerable extent
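
One way to quantify how far two orderings agree is the fraction of result pairs that both rank the same way. The sketch below is an assumption about how the comparison could be done, not the measure used in this work; `pairwise_agreement`, the URLs and all scores are hypothetical.

```python
from itertools import combinations

def pairwise_agreement(model_scores, editorial_grades):
    """Fraction of URL pairs that the model orders the same way as the editors.

    Both arguments map url -> number (higher means more relevant).
    Pairs that the editors grade equally are skipped.
    """
    agree = total = 0
    for a, b in combinations(sorted(editorial_grades), 2):
        e_diff = editorial_grades[a] - editorial_grades[b]
        if e_diff == 0:
            continue  # editors express no preference for this pair
        m_diff = model_scores[a] - model_scores[b]
        total += 1
        if m_diff * e_diff > 0:
            agree += 1  # same sign: model and editors prefer the same URL
    return agree / total if total else 0.0

# Hypothetical scores for one query.
model_scores = {"u1": 0.9, "u2": 0.4, "u3": 0.7, "u4": 0.1}
editorial_grades = {"u1": 3, "u2": 2, "u3": 1, "u4": 0}
print(pairwise_agreement(model_scores, editorial_grades))
```

A value near 1 means the model reproduces most editorial preferences; a model tie on a pair that the editors do distinguish counts as a disagreement here.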

Deploy this model as a feature/factor for predicting relevance in a
learning-to-rank algorithm

Derive the retrieval/ranking function

Does the metric gain over the baseline ranking function? If so, the model's
insights can be used as a feature in the ranking function

Test the ranking function on different classes of queries for metric gains

Metrics
• Discounted Cumulative Gain (DCG)
• Normalized DCG (NDCG)
• Precision
• Recall
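
The four metrics can be sketched in a few lines. This is a minimal sketch, assuming the common exponential-gain form of DCG, sum over ranks i of (2^grade - 1) / log2(i + 1); other formulations exist and the slides do not say which one is intended. The function names and example values are hypothetical.

```python
import math

def dcg(grades, k=None):
    """Discounted Cumulative Gain of a ranked list of relevance grades."""
    grades = grades[:k] if k is not None else grades
    # enumerate() is 0-based, so rank i+1 is discounted by log2(i + 2).
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades))

def ndcg(grades, k=None):
    """DCG divided by the DCG of the same grades in ideal (descending) order."""
    ideal = dcg(sorted(grades, reverse=True), k)
    return dcg(grades, k) / ideal if ideal > 0 else 0.0

def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical editorial grades of the results, in ranked order.
grades = [3, 2, 3, 0, 1]
print(dcg(grades), ndcg(grades))
print(precision_recall(["u1", "u2", "u3", "u4"], ["u1", "u3", "u5"]))
```

Note that `ndcg` normalises against the grades it is given; if a query has judged documents that were not retrieved, they would need to be included when computing the ideal ordering.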

Two types of data
1. Search click logs (from real or meta search engines)
2. Benchmarking dataset LEarning TO Rank (LETOR) for
   information retrieval




[Guo et al., 2009]

                     [Chapelle and Zhang, 2009]



   David Green Blog. http://davidgreen.com/comparative-value-of-google-search-rankings
    (accessed 20 April 2011)

   Fan Guo and Chao Liu. Statistical Models for Web Search Click Log Analysis.
    CIKM Tutorial, 2009

   Fan Guo, Chao Liu, and Yi Min Wang. Efficient multiple-click models in web
    search. In Proceedings of the Second Web Search and Data Mining (WSDM)
    Conference, Barcelona, Spain, pages 124-131. ACM, 9-11 February 2009

   Olivier Chapelle and Ye Zhang. A dynamic Bayesian network click model for
    web search and ranking. In Proceedings of the 18th International Conference
    on World Wide Web (WWW), Madrid, Spain, pages 1-10. ACM, 20-24 April 2009
[Tmcnet.com Blog]


Determining Relevance Rankings from Search Click Logs


Editor's Notes

  • #17: User Browsing Model (UBM) [Dupret and Piwowarski, 2008]; Dynamic Bayesian Network (DBN) model [Chapelle and Zhang, 2009]; Session Utility Model (SUM) [Dupret and Liao, 2010]; Independent Click Model (ICM) [Guo et al., 2009]; Dependent Click Model (DCM) [Guo et al., 2009]