Providing Resilient XPaths for External Adaptation Engines
Iñaki Paz. LKS, S. Coop.; ONEKIN Research Group, UPV/EHU. Donostia - San Sebastián, Spain. June 14th, 2010.

Index: Introduction; XPath expressions to select contents; Web pages get changed!!! (in space, in time); Evaluation; Conclusions.

Introduction
Adaptation-Aware Web Applications. Architecture: the browser sends an HTTP request (URL + parameters) to the server, and the server returns content shaped by its adaptation rules. Depending on the user profile and context, the Web application reacts by executing adaptation rules that provide personalized content. The rules, in a sense, CONFIGURE the adaptation: they state what is adapted and how, based on the user profile and context.
Adaptation-Aware Applications. Adaptation cases and rules are foreseen when the application is developed, but unforeseen adaptation needs may appear over time. Possible new adaptation needs: a new interaction protocol (FTP) to handle application documents; a new communication language (RSS) to present data; a RESTful interface to application concepts; new data filters on searches for a given user; external mashups related to certain content.
Adaptation as an Application Layer. The adaptation layer can live inside the application: it may access the application's business logic and APIs, which allows complex adaptations. The adaptation layer can also be EXTERNAL to the application: it works like any other browser (HTTP + HTML), which is more flexible and keeps adaptation FULLY independent of the application. Architecture: Application Layer -> (content, protocol) -> Adaptation Layer with Adaptation Rules -> (adapted content, HTTP / HTML?) -> Browser.
External Adaptation. Architecture: Application Layer -> (HTTP / HTML, content as HTML pages) -> Adaptation Layer -> (adapted communication protocol, adapted content; HTTP / HTML?) -> Browser. Examples: Dapper (http://www.dapper.net/open/) turns a Web page into RSS or a Google Gadget; GreaseMonkey scripts are JavaScript scripts that run in the browser to personalize an application.
External Adaptation. Adaptation rules need to specify WHICH elements of the page the adaptation affects. Several technologies are available to select elements on pages: text patterns, regular expressions, and complex expression languages. This work focuses on XPath: most browsers support the DOM Level 3 XPath specification, and HTML is easy to transform into XHTML (e.g. with JTidy).
XPath to Select Contents
External Adaptation. XPath is a language to select nodes in XML documents, and it is based on the TREE structure of documents. For example:
/html/body[1]/table[2]/tr[1]/td[3]/table[1]/tr[1]/td[2]/table[3]/tr[4]
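As an illustration of how such an absolute, position-indexed path follows from the document tree, here is a minimal Python sketch. It is not the tool described in this talk: the function name `absolute_xpath` and the sample page are assumptions made for the example, and it relies only on the standard `xml.etree.ElementTree` module, so the page must already be well-formed XHTML.

```python
import xml.etree.ElementTree as ET

def absolute_xpath(root, target):
    """Return the absolute, position-indexed XPath of `target` inside `root`."""
    # ElementTree keeps no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        # XPath positions are 1-based and count only siblings with the same tag.
        same_tag = [child for child in parent if child.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))

# Hypothetical sample page, not taken from the talk.
page = ET.fromstring(
    "<html><body>"
    "<table><tr><td>menu</td></tr></table>"
    "<table><tr><td>a</td></tr><tr><td>banner</td></tr></table>"
    "</body></html>"
)
banner_row = page.find("body/table[2]/tr[2]")
print(absolute_xpath(page, banner_row))  # /html/body[1]/table[2]/tr[2]
```

Every step pins one element by tag and position, which is exactly why such paths break when rows or tables are added.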
Web App Pages Change!!! In one version of the page the wanted element is at
/html/body[1]/table[2]/tr[1]/td[3]/table[1]/tr[1]/td[2]/table[3]/tr[4]
and in another version it is at
/html/body[1]/table[2]/tr[1]/td[3]/table[1]/tr[1]/td[2]/table[3]/tr[6]
If the page changes, the wanted element may no longer be selected correctly. OUR OBJECTIVE IS TO OBTAIN CHANGE-RESILIENT XPATH EXPRESSIONS.
Web App Pages Change!!! Given the XPaths
/html/body[1]/table[2]/tr[1]/td[3]/table[1]/tr[1]/td[2]/table[3]/tr[4]
/html/body[1]/table[2]/tr[1]/td[3]/table[1]/tr[1]/td[2]/table[3]/tr[6]
the XPath
//table[@cellpadding='2']/tr[count(*)=1]
would select the same elements. Notice that this XPath characterizes the banner as those ROWS with only ONE column in a table whose cellpadding is '2'. Obtaining these XPath expressions by hand is cumbersome and error-prone. A tool has been developed that obtains a node's absolute XPath expression and then generates an optimized XPath. Firefox plugins such as XPather or XPath Checker (among others) make it possible to obtain a node's absolute XPath.
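The difference between the two kinds of expression can be reproduced on a toy page. The sketch below is only an illustration under stated assumptions: both page versions are invented, and because the standard `xml.etree.ElementTree` module does not implement `count()`, the "rows with exactly one cell" test is emulated in Python rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same page; the second has one extra result row.
BEFORE = ("<html><body><table cellpadding='2'>"
          "<tr><td>a</td><td>b</td></tr>"
          "<tr><td>BANNER</td></tr>"
          "</table></body></html>")
AFTER = ("<html><body><table cellpadding='2'>"
         "<tr><td>a</td><td>b</td></tr>"
         "<tr><td>c</td><td>d</td></tr>"
         "<tr><td>BANNER</td></tr>"
         "</table></body></html>")

def select_absolute(root):
    # Counterpart of /html/body[1]/table[1]/tr[2], written relative to <html>.
    return root.findall("body[1]/table[1]/tr[2]")

def select_resilient(root):
    # Emulates //table[@cellpadding='2']/tr[count(*)=1].
    rows = root.findall(".//table[@cellpadding='2']/tr")
    return [row for row in rows if len(row) == 1]

def text_of(rows):
    return ["".join(row.itertext()) for row in rows]

for name, html in (("before", BEFORE), ("after", AFTER)):
    root = ET.fromstring(html)
    print(name, text_of(select_absolute(root)), text_of(select_resilient(root)))
```

On the first version both selections return the banner row; on the second, the positional path returns the newly inserted row while the attribute-based selection still returns the banner.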
Web App Pages Are Different!!! Distinct pages have distinct structure and distinct contents, and therefore distinct XPaths. XPaths are patterns to be applied over a page class. Page class = the SET of pages that describe the same type of information and have a similar page structure.
Web Pages Get Changed!!!
Variability in Space. Variability in space denotes the distinct running versions of a given page that are accessible at a given time. Web application pages change their contents!!! Causes include: different searches providing different results; information expiry; the introduction of adverts; and user and context adaptations the application is aware of. An XPath that works on one page of a given class may not work on another page of the same class. An XPath robust to those changes needs to be induced from a page class set containing most of the page variants.
XPath Induction. Sample paths:
/html/body[1]/table[2]/tr[1]/td[3]/table[1]/tr[1]/td[2]/table[3]/tr[3]
/html/body[1]/table[2]/tr[1]/td[3]/table[1]/tr[1]/td[2]/table[3]/tr[6]
Each STEP in an absolute XPath selects one and only one ELEMENT.
Induction: Differences on Paths. Three main difference types may be found.
Position: /a[n]/b[m]/c[o] and /a[n]/b[m]/c[p] generalize to /a[n]/b[m]/c[conds]
Node (e.g. div vs. span): /a[n]/b/c[m] and /a[n]/d/c[m] generalize to /a[n]/*[conds]/c[m]
Depth: /a[n]/b[m]/c[o] and /a[n]/d/b[m]/c[o] generalize to /a[n]//b[conds]/c[o]
Induction: Differences on Paths. These types may appear combined. Position and node combination:
/a[n]/b[m]/c[o]
/a[n]/d[m]/c[p]
generalize to
/a[n]/*[conds]/c[conds]
Sample on http://www.carsearch.com, with two position differences:
/html/body[1]/table[2]/tr[1]/…/table[3]/tr[7]
/html/body[1]/table[2]/tr[1]/…/table[2]/tr[10]
generalize to
/html/body[1]/table[2]/tr[1]/…/table[@width='100%'][@border='0'][@cellpadding='2'][@cellspacing='0'][tr]/tr[count(*)=1][count(td)=1]
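The step-by-step comparison behind the position and node cases can be sketched as follows. This is a simplified illustration, not the induction algorithm of the talk: the helper names `parse` and `generalize` are assumptions, `[conds]` is left as a literal placeholder instead of being filled with attribute conditions, and depth differences (paths of different length) are deliberately rejected.

```python
import re

# A step is a tag (or *) with an optional numeric position, e.g. table[3].
STEP = re.compile(r"^([\w*]+)(?:\[(\d+)\])?$")

def parse(xpath):
    """Split an absolute XPath into (tag, position) pairs; position may be None."""
    steps = []
    for raw in xpath.strip("/").split("/"):
        match = STEP.match(raw)
        if match is None:
            raise ValueError(f"unsupported step: {raw!r}")
        steps.append(match.groups())
    return steps

def generalize(xpath_a, xpath_b):
    """Merge two sample paths, marking the steps where they disagree."""
    steps_a, steps_b = parse(xpath_a), parse(xpath_b)
    if len(steps_a) != len(steps_b):
        raise ValueError("depth differences are not handled in this sketch")
    merged = []
    for (tag_a, pos_a), (tag_b, pos_b) in zip(steps_a, steps_b):
        if tag_a != tag_b:
            merged.append("*[conds]")         # node difference
        elif pos_a != pos_b:
            merged.append(f"{tag_a}[conds]")  # position difference
        elif pos_a is None:
            merged.append(tag_a)              # identical step without position
        else:
            merged.append(f"{tag_a}[{pos_a}]")  # identical step, kept as-is
    return "/" + "/".join(merged)

print(generalize("/a[1]/b[2]/c[3]", "/a[1]/d[2]/c[4]"))
# /a[1]/*[conds]/c[conds]
print(generalize("/html/body[1]/table[3]/tr[7]", "/html/body[1]/table[2]/tr[10]"))
# /html/body[1]/table[conds]/tr[conds]
```

A real implementation would replace each `[conds]` with conditions that hold for the element in every sample, such as shared attribute values or child counts.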
Induction Algorithm. LOOP over the XPaths, resolving the differences not yet considered. Problems: cases such as /…/table[@class] vs. /…/table[@style]. Induction provides an XPath that works on all the samples, but it does not optimize it. It ends in expressions like:
/html/body[1]/table[2]/tr[1]/…/table[@width='100%'][@border='0'][@cellpadding='2'][@cellspacing='0'][tr]/tr[count(*)=1][count(td)=1]
Web Pages Evolve in Time!!! What is the problem? XPath is based on structure, and small changes may affect structure. Solution: remove as much structural information as possible while keeping equivalence with the original XPath.
Web Pages Evolve in Time!!! Definition: two XPaths are equivalent if they retrieve the same nodes. [Miklau 2004] showed that deciding containment, and with it equivalence, is coNP-complete for a fragment of XPath. Definition: an XPath is resilient to a change C if the set of retrieved nodes is the same whether or not change C is made.
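Because deciding equivalence in general is intractable, a practical check can only compare the two expressions on the sample pages at hand. The following sketch shows that weaker, sample-based test; the function name `equivalent_on` and both pages are assumptions made for the example, and the paths are written in the XPath subset accepted by `xml.etree.ElementTree` (relative to the `<html>` root).

```python
import xml.etree.ElementTree as ET

def equivalent_on(pages, path_a, path_b):
    """True if both paths select exactly the same nodes on every sample page."""
    for root in pages:
        nodes_a = {id(node) for node in root.findall(path_a)}
        nodes_b = {id(node) for node in root.findall(path_b)}
        if nodes_a != nodes_b:
            return False
    return True

# Hypothetical pages: the second one adds a <span> outside the table.
page_1 = ET.fromstring(
    "<html><body><table><tr><td><span>x</span></td></tr></table></body></html>")
page_2 = ET.fromstring(
    "<html><body><div><span>ad</span></div>"
    "<table><tr><td><span>x</span></td></tr></table></body></html>")

long_path = "body/table/tr/td/span"   # counterpart of /html/body/table/tr/td/span
short_path = ".//span"                # counterpart of /html//span

print(equivalent_on([page_1], long_path, short_path))          # True
print(equivalent_on([page_1, page_2], long_path, short_path))  # False
```

The second call shows the limit of the approach: two expressions that agree on the available samples may still diverge on a page variant that was not sampled.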
Web Pages Evolve in Time!!! An example: which XPath seems more robust?
/html/body/table/tr/td/span
/html//span
The optimum for one change may not be the optimum for another change, but the probability of being affected by a change IS different.
Simulated Annealing. A generic probabilistic heuristic approach for global optimization problems. It iterates starting from a solution: get a new valid neighbor solution (RANDOM); test whether the new solution improves on the old one according to an energy calculation function; otherwise, decide probabilistically whether the new solution is accepted anyway (RANDOM); iterate until the solution is good enough or the computation budget has been exhausted. Simulated annealing has been used with this function: F(XPath) = a * (number of steps) + b * (number of wildcards) + c * (number of conditions).
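The iteration described above can be sketched generically in Python. The slide does not state the acceptance rule or the cooling schedule, so the Metropolis criterion and the geometric cooling used here are assumptions, as are the function name `anneal`, its parameters, and the toy problem at the end.

```python
import math
import random

def anneal(initial, neighbor, energy, steps=1000, t_start=1.0, t_end=0.01, rng=None):
    """Minimize `energy` starting from `initial`; returns (best, best_energy)."""
    rng = rng or random.Random()
    current, current_e = initial, energy(initial)
    best, best_e = current, current_e
    for i in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        candidate = neighbor(current, rng)   # random neighbor solution
        candidate_e = energy(candidate)
        delta = candidate_e - current_e
        # Accept improvements always, and worse solutions with a probability
        # that shrinks as the temperature drops (Metropolis criterion).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e

# Toy problem standing in for XPath optimization: find the integer closest to 7.
def step(x, rng):
    return x + rng.choice((-1, 1))

def cost(x):
    return (x - 7) ** 2

best, best_cost = anneal(0, step, cost, rng=random.Random(42))
print(best, best_cost)
```

For XPath optimization, `neighbor` would modify one step of the expression and `energy` would be the function F given above.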
Simulated Annealing. Selecting a neighbor solution: neighbor solutions are obtained by modifying one step of the XPath. The resulting solution must be equivalent to the original (select the same nodes); this is checked during SA execution.
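A possible shape for this neighbor selection is sketched below. The two modifications (dropping a predicate, or removing a step and reaching the next one through the descendant axis) are assumptions chosen for the example, not the talk's actual move set; the names `render`, `selected`, `random_modification`, and `equivalent_neighbor` are hypothetical, and equivalence is only checked on the given sample pages with `xml.etree.ElementTree`.

```python
import random
import xml.etree.ElementTree as ET

def render(steps):
    """Turn [(axis, test), ...] into an ElementTree path relative to the root."""
    return "." + "".join(axis + test for axis, test in steps)

def selected(root, steps):
    return {id(node) for node in root.findall(render(steps))}

def random_modification(steps, rng):
    """Modify one randomly chosen step; may return the path unchanged."""
    steps = list(steps)
    index = rng.randrange(len(steps))
    axis, test = steps[index]
    if "[" in test and rng.random() < 0.5:
        # Drop the predicate: table[2] becomes table.
        steps[index] = (axis, test.split("[", 1)[0])
    elif index < len(steps) - 1:
        # Remove the step and reach the next one through the descendant axis.
        steps[index + 1] = ("//", steps[index + 1][1])
        del steps[index]
    return steps

def equivalent_neighbor(steps, pages, rng, attempts=20):
    """Return a modified path selecting the same nodes on all pages, if found."""
    wanted = [selected(root, steps) for root in pages]
    for _ in range(attempts):
        candidate = random_modification(steps, rng)
        if candidate != list(steps) and \
                [selected(root, candidate) for root in pages] == wanted:
            return candidate
    return list(steps)  # no equivalent neighbor found within the budget

# Hypothetical sample page and starting path.
page = ET.fromstring(
    "<html><body>"
    "<table><tr><td>menu</td></tr></table>"
    "<table><tr><td>a</td></tr><tr><td>banner</td></tr></table>"
    "</body></html>")
start = [("/", "body[1]"), ("/", "table[2]"), ("/", "tr[2]")]
neighbor = equivalent_neighbor(start, [page], random.Random(0))
print(render(start), "->", render(neighbor))
```

Rejecting non-equivalent candidates inside the neighbor function keeps every solution visited by the annealing loop valid on the samples.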
Simulated Annealing. How to characterize an XPath? Parts of an XPath: steps (/table) fix a structural element on the path; wildcards (/*) fix an undetermined structural element on the path; conditions fix a condition over an element's attribute. Conditions can be about style (@width) or about description (@class, @id, @alt), which trades change likelihood against how singular the condition is. Energy function characterization: F(xpath) = a*steps + b*wildcards + c*styleConds + d*descrConds.
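The energy function above can be computed directly from the text of an expression. In the sketch below the weights are arbitrary placeholders, not values from the talk; only `class`, `id`, and `alt` are treated as descriptive attributes (as listed on the slide) and every other attribute is assumed to be a style condition. Wildcard steps are counted as steps too, positional predicates are not counted as conditions, and predicates containing a `/` are not supported.

```python
import re

# Attributes named on the slide as descriptive; all others are assumed to be style.
DESCRIPTIVE_ATTRS = {"class", "id", "alt"}

def energy(xpath, a=1.0, b=2.0, c=1.5, d=0.5):
    """F(xpath) = a*steps + b*wildcards + c*styleConds + d*descrConds."""
    steps = [step for step in re.split(r"/+", xpath) if step]
    wildcards = sum(1 for step in steps if step.split("[", 1)[0] == "*")
    attrs = re.findall(r"@([\w-]+)", xpath)
    descr_conds = sum(1 for name in attrs if name in DESCRIPTIVE_ATTRS)
    style_conds = len(attrs) - descr_conds
    return a * len(steps) + b * wildcards + c * style_conds + d * descr_conds

print(energy("/html/body/table/tr/td/span"))  # 6.0
print(energy("/html//span"))                  # 2.0
print(energy("//table[@cellpadding='2']/tr[count(*)=1]"))
```

With these placeholder weights the shorter expression gets the lower energy, so the annealing loop would prefer it whenever it remains equivalent on the samples.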
Simulated Annealing. Sample on CarSearch. Area to be adapted: BANNERS. Note that the optimized XPaths to some extent reveal WHAT characterizes the selection in the document.
Evaluation
Evaluation. How can the evolution of a Web application's pages be obtained? Either select applications and watch whether and how they change, or consult archived Web site home pages on archive.org. Sites used: www.yahoo.com and www.elmundo.es. Tests: one page was sampled every 10 days, and all pages were analyzed for changes. Changes define milestones. Two or three different pages between milestones were used to generate the XPath, which was then tested with pages AFTER the milestone.
Evaluation. Changes were classified as minor (small changes in aesthetics and basic structure, e.g. adding rows to a table) or major (application redesign, new layout, etc.). Results: 90% of XPaths were resilient to minor changes; 10% of XPaths were resilient to major changes. Conclusion: the approach works for evolutionary changes, not revolutionary ones.
Conclusions
Conclusions. External adaptation tools have appeared, and they require selection patterns such as XPath. Pattern resilience to Web application changes is important. This work applies induction and simulated annealing (SA) techniques. Further language-specific knowledge (e.g. a table always contains rows and columns) should be taken into account in the energy function.
Contact: Iñaki Paz, [email_address], http://www.lks.es, http://www.onekin.org

HT2010 Paper Presentation
