GENERATING QUERY FACETS USING KNOWLEDGE BASES
ABSTRACT
A query facet is a significant list of information nuggets that explains an underlying aspect of a
query. Existing algorithms mine facets of a query by extracting frequent lists contained in the top
search results. The coverage of facets and facet items mined by such methods may be
limited, because only a small number of search results are used. To address this problem,
we propose mining query facets using knowledge bases, which contain high-quality structured
data. Specifically, we first generate facets based on the properties of the entities which are
contained in Freebase and correspond to the query. Second, we mine initial query facets from
search results and then expand them by finding similar entities in Freebase. Experimental
results show that our proposed method can significantly improve the coverage of facet items over
the state-of-the-art algorithms.
EXISTING SYSTEM
Existing query facet mining algorithms mainly rely on the top search results from search engines.
Dou et al. first introduced the concept of query dimensions [4], which is the same concept as
the query facets discussed in this paper. They proposed QDMiner, a system that automatically
mines query facets by aggregating frequent lists contained in the results. The lists are extracted
using HTML tags (such as <select> and <table>), text patterns, and repeated content blocks
contained in web pages. Kong et al. proposed two supervised methods, QF-I and QF-J, to mine
query facets from the results. In all these existing solutions, facet items are extracted from the
top search results of a search engine (e.g., the top 100 search results from Bing.com). More
specifically, facet items are extracted from the lists contained in the results. The problem is that
the coverage of facets mined by such methods may be limited: useful words or phrases that do
not appear in a list within the search results used have no opportunity to be mined.
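The coverage limitation can be illustrated with a minimal Python sketch. It is not QDMiner itself, whose internals are not described here; it is a simplified stand-in that counts in how many documents each list item appears and keeps only the frequent ones. The function name, the `min_docs` threshold, and the sample lists are all hypothetical.

```python
from collections import Counter

def mine_frequent_items(result_lists, min_docs=2):
    """Return items that appear in lists from at least `min_docs` documents.

    `result_lists` maps a document id to the lists extracted from that document.
    This is an illustrative simplification, not the QDMiner algorithm.
    """
    doc_freq = Counter()
    for lists in result_lists.values():
        seen = set()  # count each item at most once per document
        for lst in lists:
            seen.update(item.strip().lower() for item in lst)
        doc_freq.update(seen)
    return {item: n for item, n in doc_freq.items() if n >= min_docs}

# Hypothetical lists extracted from three search results.
results = {
    "doc1": [["Brand A", "Brand B", "Brand C"]],
    "doc2": [["brand b", "brand c", "Brand D"]],
    "doc3": [["Brand C", "Brand E"]],
}
frequent = mine_frequent_items(results)
```

In this toy input, "brand a", "brand d", and "brand e" each occur in only one document, so they are dropped even though they may be valid facet items. This is the kind of gap a knowledge base is meant to fill.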
DISADVANTAGES
• Previous studies show that many users are not satisfied with conventional
search result pages.
• Users often have to click and view many documents to summarize the information they
are seeking, especially when they want to learn about a topic that covers different aspects.
• This usually takes a lot of time and is burdensome for users.
• Mining query facets (or query dimensions) is an emerging approach to solving this problem.
• Existing query facet mining algorithms rely mainly on the top search results from search
engines.
PROPOSED SYSTEM
To address the limited coverage of methods based on search results alone, we propose leveraging
a knowledge base as a complementary data source to improve the quality of query facets.
Knowledge bases contain high-quality structured information such as entities and their
properties, and are especially useful when the query is related to an entity. We propose using
both knowledge bases and search results to mine query facets. We do not abandon search results
because they reflect user intent and provide abundant context for facet generation and
expansion. Our goal is to improve the recall of facets and facet items by using the entities and
properties contained in knowledge bases while ensuring that the accuracy of facet items is not
harmed too much. Our approach consists of two methods: facet generation and facet
expansion.
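Facet generation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the knowledge base is a small in-memory dictionary rather than Freebase, the query is matched to an entity by exact lowercase name, and the `min_items` threshold is invented for the example.

```python
# Hypothetical toy knowledge base: entity name -> property -> list of values.
TOY_KB = {
    "example show": {
        "cast": ["Actor A", "Actor B", "Actor C"],
        "seasons": ["Season 1", "Season 2"],
        "genre": ["Drama"],
    },
}

def generate_facets(query, kb, min_items=2):
    """Turn each property of the matched entity into a candidate facet.

    Properties with fewer than `min_items` values are skipped, since a
    single value does not form a useful list (an assumption of this sketch).
    """
    entity = kb.get(query.strip().lower())
    if entity is None:
        return {}  # the query does not correspond to a known entity
    return {
        prop: list(values)
        for prop, values in entity.items()
        if len(values) >= min_items
    }

facets = generate_facets("Example Show", TOY_KB)
```

Here each property name acts as a facet label and its values as facet items. A query with no matching entity yields no generated facets, which is one reason search results remain useful.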
Advantages:
By leveraging both knowledge bases and search results, the proposed QDMKB method overcomes
the limitation of using only search results to generate query facets, and can thus improve the
quality of facets, especially recall.
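The trade-off between recall and accuracy can be made concrete with a short Python sketch. It uses the standard set-based definitions of precision and recall, which are assumed here rather than taken from the source, and the item sets are hypothetical, not experimental results.

```python
def precision_recall(mined_items, reference_items):
    """Set-based precision and recall of mined facet items against a reference set."""
    mined, reference = set(mined_items), set(reference_items)
    hits = len(mined & reference)
    precision = hits / len(mined) if mined else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical reference facet and two hypothetical system outputs.
reference = {"a", "b", "c", "d"}
search_only = {"a", "b"}            # few items, all correct
with_kb = {"a", "b", "c", "x"}      # more correct items, one wrong item

p_search, r_search = precision_recall(search_only, reference)
p_kb, r_kb = precision_recall(with_kb, reference)
```

In this toy case recall rises from 0.5 to 0.75 while precision falls from 1.0 to 0.75, which illustrates the aim of gaining coverage without harming accuracy too much.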
Objectives:
• Existing query facet mining algorithms, including QDMiner, QF-I, and QF-J, rely mainly
on the top search results from search engines.
• The coverage of facets mined by such methods may be limited, because
usually only a small number of results are used.
• We propose leveraging knowledge bases as complementary data sources.
• We use two methods, facet generation and facet expansion, to mine query facets.
Facet generation directly uses properties in Freebase as candidates, while facet expansion
expands the initial facets mined by QDMiner in property-based and type-based
manners.
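The two expansion modes are not defined in detail here, so the following Python sketch shows one plausible reading, with all data and thresholds hypothetical. Property-based expansion is taken to mean: find the knowledge-base property whose values overlap most with the initial facet and add its remaining values. Type-based expansion is taken to mean: find the type shared by most initial items and add other entities of that type.

```python
from collections import Counter

# Hypothetical toy data standing in for a knowledge base such as Freebase.
PROPERTY_VALUES = {  # (entity, property) -> values
    ("example show", "cast"): ["actor a", "actor b", "actor c"],
    ("example show", "seasons"): ["season 1", "season 2"],
}
ENTITY_TYPES = {  # entity -> set of types
    "actor a": {"actor"}, "actor b": {"actor"},
    "actor c": {"actor"}, "actor d": {"actor"},
    "season 1": {"tv_season"},
}

def expand_by_property(items, property_values, min_overlap=2):
    """Append the unseen values of the property that best overlaps the facet."""
    item_set = set(items)
    best = max(property_values.values(),
               key=lambda vals: len(item_set & set(vals)), default=[])
    if len(item_set & set(best)) < min_overlap:
        return list(items)  # no property matches well enough
    return list(items) + [v for v in best if v not in item_set]

def expand_by_type(items, entity_types, min_share=2):
    """Append other entities that have the type shared by most facet items."""
    item_set = set(items)
    counts = Counter(t for i in items for t in entity_types.get(i, ()))
    if not counts:
        return list(items)
    top_type, n = counts.most_common(1)[0]
    if n < min_share:
        return list(items)
    extra = sorted(e for e, ts in entity_types.items()
                   if top_type in ts and e not in item_set)
    return list(items) + extra

initial = ["actor a", "actor b"]  # e.g. a facet mined from search results
by_property = expand_by_property(initial, PROPERTY_VALUES)
by_type = expand_by_type(initial, ENTITY_TYPES)
```

In this toy data the type-based mode adds "actor d", an entity unrelated to the show, which hints at why expansion may raise recall at some cost in accuracy.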
SYSTEM CONFIGURATION:
HARDWARE REQUIREMENTS:
Processor - Pentium
Speed - 1.1 GHz
RAM - 1 GB
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Keyboard - Standard Windows Keyboard
Mouse - Two- or Three-Button Mouse
Monitor - SVGA
SOFTWARE REQUIREMENTS:
Operating System : Windows
Technology : Java and J2EE
Web Technologies : HTML, JavaScript, CSS
IDE : MyEclipse
Web Server : Tomcat
Toolkit : Android Phone
Database : MySQL
Java Version : J2SDK 1.5
