#13/ 19, 1st Floor, Municipal Colony, Kangayanellore Road, Gandhi Nagar, Vellore – 6.
Off: 0416-2247353 / 6066663 Mo: +91 9500218218
Website: www.shakastech.com, Email - id: shakastech@gmail.com, info@shakastech.com
Top-Down XML Keyword Query Processing
ABSTRACT
Efficiently answering XML keyword queries has attracted much research effort in the last
decade. The key factors resulting in the inefficiency of existing methods are the
common-ancestor-repetition (CAR) and visiting-useless-nodes (VUN) problems. To address the CAR
problem, we propose a generic top-down processing strategy to answer a given keyword query
w.r.t. LCA/SLCA/ELCA semantics. By “top-down”, we mean that we visit all common ancestor
(CA) nodes in a depth-first, left-to-right order; by “generic”, we mean that our method is
independent of the query semantics. To address the VUN problem, we propose to use child
nodes, rather than descendant nodes, to test the satisfiability of a node v w.r.t. the given
semantics. We propose two algorithms that are based on either traditional inverted lists or our
newly proposed LLists to improve the overall performance. We further propose several
algorithms that are based on hash search to simplify the operation of finding CA nodes from all
involved LLists. The experimental results verify the benefits of our methods according to various
evaluation metrics.
EXISTING SYSTEM
Typically, an XML document can be modeled as a node-labeled tree T. For a given
keyword query Q, several semantics have been proposed to define meaningful results, of which
the most basic is Lowest Common Ancestor (LCA). Based on LCA, the most widely adopted query
semantics are Exclusive LCA (ELCA) and Smallest LCA (SLCA). SLCA defines a subset of
LCA nodes in which no LCA is the ancestor of any other LCA. By comparison, ELCA tries to
capture more meaningful results; it may accept some LCAs that are not SLCAs as meaningful
results. A system that supports more query semantics helps users find
interesting results, since no single query semantics works well in all situations. However, each
existing algorithm focuses on certain query semantics. Simply implementing all of them to support
all query semantics results in a large index and does not scale to new query semantics;
more importantly, these algorithms remain inefficient due to redundant computation.
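To make the LCA and SLCA definitions above concrete, here is a minimal Java sketch. It assumes nodes are identified by Dewey labels (an int array listing the child positions on the path from the root), which the text above does not specify; the class and method names are hypothetical. Under that assumption, the LCA of two nodes is the longest common prefix of their labels, and the SLCA set is obtained by dropping every LCA node that is an ancestor of another LCA node. The quadratic filter is chosen for readability only and is not one of the algorithms discussed in this document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeweySemantics {
    /** LCA of two Dewey labels: their longest common prefix. */
    static int[] lca(int[] a, int[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) i++;
        return Arrays.copyOf(a, i);
    }

    /** True if anc is a proper ancestor of node, i.e. a strict prefix of it. */
    static boolean isAncestor(int[] anc, int[] node) {
        if (anc.length >= node.length) return false;
        for (int i = 0; i < anc.length; i++) {
            if (anc[i] != node[i]) return false;
        }
        return true;
    }

    /** Keeps only the LCA nodes that are not ancestors of another LCA node. */
    static List<int[]> slca(List<int[]> lcas) {
        List<int[]> out = new ArrayList<>();
        for (int[] candidate : lcas) {
            boolean hasDescendantLca = false;
            for (int[] other : lcas) {
                if (isAncestor(candidate, other)) { hasDescendantLca = true; break; }
            }
            if (!hasDescendantLca) out.add(candidate);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lca(new int[]{1, 2, 3}, new int[]{1, 2, 5})));
        List<int[]> lcas = Arrays.asList(new int[]{1}, new int[]{1, 2}, new int[]{1, 3, 1});
        for (int[] s : slca(lcas)) System.out.println(Arrays.toString(s));
    }
}
```

In the example in `main`, node 1 is dropped from the SLCA result because it is an ancestor of both 1.2 and 1.3.1.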
DISADVANTAGES OF EXISTING SYSTEM:
1. The existing algorithms still suffer from redundant computation because they visit many
useless components.
2. Common-ancestor-repetition (CAR) problem
3. Visiting-useless-nodes (VUN) problem
PROPOSED SYSTEM
Considering the above problems, we propose to support different query semantics with a
generic processing strategy, which is more efficient because it avoids both the CAR and VUN
problems and thereby further reduces the number of visited components. To address the CAR
problem, we propose a generic top-down XML keyword query processing strategy. To address
the VUN problem, we propose to use child nodes, rather than descendant nodes, to test the
satisfiability of node v with respect to xLCA semantics. We propose a labeling-scheme-
independent inverted index, namely LList, which maintains every node in each level of a
traditional inverted list only once and keeps all necessary information for answering a given
keyword query without any loss.
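The idea of keeping every node only once per level can be illustrated with a small Java sketch. This is not the LList layout itself, which this document does not describe; it is an assumed, simplified structure built from a keyword's inverted list of Dewey labels. For each tree level it stores every distinct ancestor-or-self node once, together with the number of keyword occurrences in that node's subtree. The class name, the string keys such as "1.1", and the use of Dewey labels are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LevelIndexSketch {
    /**
     * Builds one map per tree level. levels.get(d) maps each distinct label prefix
     * of length d + 1 (written as "1.1", "1.2", ...) to the number of postings
     * that fall inside that node's subtree.
     */
    static List<Map<String, Integer>> build(List<int[]> postings) {
        List<Map<String, Integer>> levels = new ArrayList<>();
        for (int[] label : postings) {
            StringBuilder key = new StringBuilder();
            for (int d = 0; d < label.length; d++) {
                if (d > 0) key.append('.');
                key.append(label[d]);
                while (levels.size() <= d) levels.add(new LinkedHashMap<>());
                // Each node is stored once per level; repeated prefixes only bump the count.
                levels.get(d).merge(key.toString(), 1, Integer::sum);
            }
        }
        return levels;
    }

    public static void main(String[] args) {
        List<int[]> postings = Arrays.asList(
            new int[]{1, 1, 2}, new int[]{1, 1, 3}, new int[]{1, 2});
        // prints [{1=3}, {1.1=2, 1.2=1}, {1.1.2=1, 1.1.3=1}]
        System.out.println(build(postings));
    }
}
```

In the flat inverted list, node 1 appears as a prefix of three labels and node 1.1 of two; in the level-wise form each appears once, with its count kept alongside.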
ADVANTAGES OF PROPOSED SYSTEM:
1. Reduces the time complexity
2. Avoids the CAR and VUN problems
3. Based on LLists, our second top-down algorithm, namely TDxLCA-L, further reduces
the time complexity.
MODULES
1. LList Index Module
2. Compute CA Nodes Module
MODULE DESCRIPTION:
LList Index:
The labeling-scheme-independent inverted index (LList) reduces both
the cost of each binary search operation and the number of times it is called.
Compute CA Nodes:
Top-Down Exclusive LCA (TDELCA) recursively gets all CA nodes in a top-down way.
For each CA node, it finds the number of occurrences of each query keyword in its subtree,
i.e., the length of each of its child lists, and then gets the node’s child CA nodes by intersecting
the node’s child lists using binary search.
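The recursion described above can be sketched in Java as follows. This is an illustrative sketch under stated assumptions, not the TDELCA algorithm itself: it only enumerates CA nodes and does not apply the ELCA test. It assumes each keyword has an inverted list of Dewey labels sorted in document order, that the root has the empty label, and that label components are non-negative. All names are hypothetical. For a CA node, `hi[i] - lo[i]` is the number of occurrences of keyword i in its subtree, and child CA nodes are found by probing every list with binary search for the same child component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TopDownCaSketch {
    /** Component of a label at position depth, or -1 if the label ends above it. */
    static int comp(int[] label, int depth) {
        return label.length > depth ? label[depth] : -1;
    }

    /** First index in [lo, hi) whose component at depth is >= c (binary search). */
    static int lowerBound(int[][] list, int lo, int hi, int depth, int c) {
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (comp(list[mid], depth) < c) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Visits CA node `prefix`; [lo[i], hi[i]) is the non-empty part of lists[i] in its subtree. */
    static void visit(int[][][] lists, int[] lo, int[] hi, int[] prefix, List<String> out) {
        out.add(Arrays.toString(prefix));
        int depth = prefix.length;
        int k = lists.length;
        // Skip postings that are the node itself, then walk its children left to right.
        int p = lowerBound(lists[0], lo[0], hi[0], depth, 0);
        while (p < hi[0]) {
            int c = lists[0][p][depth];
            int[] clo = new int[k];
            int[] chi = new int[k];
            boolean common = true;
            for (int i = 0; i < k; i++) {
                clo[i] = lowerBound(lists[i], lo[i], hi[i], depth, c);
                chi[i] = lowerBound(lists[i], clo[i], hi[i], depth, c + 1);
                if (clo[i] == chi[i]) common = false; // keyword i is absent under child c
            }
            if (common) {
                int[] child = Arrays.copyOf(prefix, depth + 1);
                child[depth] = c;
                visit(lists, clo, chi, child, out);
            }
            p = chi[0]; // jump past this child's run in list 0
        }
    }

    /** Returns all CA nodes in depth-first, left-to-right order. */
    static List<String> commonAncestors(int[][][] lists) {
        List<String> out = new ArrayList<>();
        int k = lists.length;
        if (k == 0) return out;
        int[] lo = new int[k];
        int[] hi = new int[k];
        for (int i = 0; i < k; i++) {
            if (lists[i].length == 0) return out; // some keyword never occurs
            hi[i] = lists[i].length;
        }
        visit(lists, lo, hi, new int[0], out);
        return out;
    }

    public static void main(String[] args) {
        int[][] a = { {0, 0}, {0, 1, 0}, {1} };
        int[][] b = { {0, 1, 1}, {1, 0} };
        // prints [[], [0], [0, 1], [1]]
        System.out.println(commonAncestors(new int[][][] { a, b }));
    }
}
```

Because children are enumerated from only one list and each run is skipped in a single step, every CA node is visited once, and subtrees that lack some keyword are never entered.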
SYSTEM REQUIREMENTS
HARDWARE REQUIREMENTS:
• Processor - Pentium IV
• Speed - 1.1 GHz
• RAM - 256 MB
• Hard Disk - 20 GB
• Keyboard - Standard Windows Keyboard
• Mouse - Two or Three Button Mouse
• Monitor - SVGA
SOFTWARE REQUIREMENTS:
• Operating System - Windows XP
• Coding Language - Java
