Shauvik Roy Choudhary, Husayn Versee, and Alessandro Orso
Georgia Institute of Technology
Partially supported by the NSF awards CCF-0916605 and CCF-0725202 to Georgia Tech
[Figure: an HTTP Request is sent to the Server Side (Web Application Server), which returns an HTTP Response]
<html>
 <head>
   <script src="script.js"></script>
   <link href="style.css" rel="stylesheet" />
 </head>
 <body>
   <h1>Ajax Search:</h1>
   <input type="text" id="query" />
   <input type="button" onclick="search()" value="Search" />
   <h2>Results:</h2>
   <div id="stats"></div>
   <ul id="results"></ul>
 </body>
</html>
DOM tree of the page:
document
  head
    script
    link
  body
    h1
    input
    input
    h2
    div
    ul
[Figure: the same page and DOM tree with three regions annotated: No shadow, Result count, Displaced border]
[Figure: the page in Mozilla Firefox and in Internet Explorer]
• Manual inspection is expensive
• DOM differs between browsers
• Mimic end user’s perception
• Ignore variable elements on webpage
• Locate broken element in code
• Even work with browser security controls
   Goal:
    Compare behavior of web pages in different browsers

   High level view of the approach:
    Data collection → Ignore variable elements → Structural analysis → Visual analysis → Report
   From each browser under consideration, the
    technique collects:
    • Structural Information (DOM)
      ( tagname, id, xpath, coord,
      clickable, visible, zindex, hash )
    • Visual Information (Screenshot)
    [Figure: DOM tree rooted at body]
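The structural information listed above can be pictured as one record per DOM node. The following is a minimal sketch, not the authors' data model: the field names come from the slide, while the field types, the `(left, top, right, bottom)` layout of `coord`, and the `content_hash` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from hashlib import sha1
from typing import Optional, Tuple


@dataclass(frozen=True)
class NodeRecord:
    """One DOM node as captured from a single browser (field names follow the slide)."""
    tagname: str
    id: Optional[str]                 # None when the element has no id attribute
    xpath: str
    coord: Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom) in pixels
    clickable: bool
    visible: bool
    zindex: int
    hash: str                         # assumed: digest of the node's content


def content_hash(text: str) -> str:
    """Hypothetical helper: digest a node's textual content for cheap comparison."""
    return sha1(text.encode("utf-8")).hexdigest()


# Example record for a hypothetical footer element.
footer = NodeRecord(
    tagname="div", id="footer", xpath="/html/body/div[3]",
    coord=(0, 900, 1024, 960), clickable=False, visible=True,
    zindex=0, hash=content_hash("Contact us"),
)
```

Keeping the record immutable (`frozen=True`) is a design choice of this sketch: a snapshot taken from a browser should not change while it is being compared.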
   Load page twice in reference browser:
    [Figure: the two DOM trees obtained from the two loads]

   Page in reference browser over two subsequent requests:
    [Figure: the page as returned for each of the two requests]
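One way to read the step above: anything that changes between two loads in the same browser is variable content and should not be reported as a cross-browser issue. The sketch below assumes each load has been reduced to a mapping from xpath to content hash; that representation and the function name are hypothetical.

```python
from typing import Dict, Set


def find_variable_xpaths(first_load: Dict[str, str], second_load: Dict[str, str]) -> Set[str]:
    """Return xpaths whose content hash changed, or which exist in only one load."""
    variable = set()
    for xpath in first_load.keys() | second_load.keys():
        # .get() yields None for a node missing from one load, which also counts as a change.
        if first_load.get(xpath) != second_load.get(xpath):
            variable.add(xpath)
    return variable


# Two loads of the same page in the reference browser; only the first div changes.
load_1 = {"/html/body/h1": "h-title", "/html/body/div[1]": "h-ad-A", "/html/body/ul": "h-menu"}
load_2 = {"/html/body/h1": "h-title", "/html/body/div[1]": "h-ad-B", "/html/body/ul": "h-menu"}

variable = find_variable_xpaths(load_1, load_2)
```

Nodes in `variable` would then be skipped during the later comparison with other browsers.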
   Match the nodes in the DOM tree of each browser to those in
    reference browser:
    [Figure: the DOM tree from the reference browser next to the DOM tree from the other browser, with a pair of nodes annotated at each step]
    • id = “footer” on a node in each tree
    • id = null on a node in each tree
    • tagname = “div” on a node in each tree
    • xPath1 = /html/body/div[1]/div[1]/div[1]    xPath2 = /html/body/div[1]/div[1]/div/div[1]
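The annotations above suggest a matching order: compare ids first, fall back to the tag name when the id is null, then compare xpaths. The sketch below follows that order under stated assumptions. The slides do not say how two xpaths are scored, so `difflib.SequenceMatcher` over the path steps is a stand-in chosen here, and the `0.7` threshold and the dictionary layout are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def xpath_similarity(xpath_a: str, xpath_b: str) -> float:
    """Similarity in [0, 1] between two xpaths, compared step by step."""
    steps_a = xpath_a.strip("/").split("/")
    steps_b = xpath_b.strip("/").split("/")
    return SequenceMatcher(None, steps_a, steps_b).ratio()


def match_node(node: Dict, candidates: List[Dict], threshold: float = 0.7) -> Optional[Dict]:
    """Find the candidate that best corresponds to `node`, or None."""
    # 1. A non-null id that is equal on both sides is taken as a match.
    if node["id"] is not None:
        for cand in candidates:
            if cand["id"] == node["id"]:
                return cand
    # 2. Otherwise consider only candidates with the same tagname and
    #    keep the one whose xpath is most similar (at or above the threshold).
    best, best_score = None, threshold
    for cand in candidates:
        if cand["tagname"] != node["tagname"]:
            continue
        score = xpath_similarity(node["xpath"], cand["xpath"])
        if score >= best_score:
            best, best_score = cand, score
    return best


# The two xpaths from the slide: same element, one extra wrapper div in the other browser.
reference_node = {"id": None, "tagname": "div", "xpath": "/html/body/div[1]/div[1]/div[1]"}
target_nodes = [
    {"id": None, "tagname": "span", "xpath": "/html/body/div[1]/div[1]/div/div[1]"},
    {"id": None, "tagname": "div", "xpath": "/html/body/div[2]"},
    {"id": None, "tagname": "div", "xpath": "/html/body/div[1]/div[1]/div/div[1]"},
]
matched = match_node(reference_node, target_nodes)
```

Comparing whole path steps rather than characters means an inserted wrapper element costs one step, however long its name is.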
Types of issues found:
• Positional shifts
• Size differences
• Visibility differences
• General appearance issues
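As an illustration of the first three issue types, the sketch below flags them from the `coord` and `visible` attributes of a matched pair of nodes. This is an assumption about how such differences could be detected, not a description of how the technique classifies issues. The `(left, top, right, bottom)` layout and the 2-pixel tolerance are hypothetical, and general appearance issues are left out because they would need the screenshots.

```python
from typing import Dict, List


def compare_matched_pair(ref: Dict, target: Dict, tolerance: int = 2) -> List[str]:
    """Return the issue types observed between a matched pair of nodes."""
    issues = []
    if ref["visible"] != target["visible"]:
        # Position and size are not compared when one side is hidden.
        return ["visibility difference"]
    r_left, r_top, r_right, r_bottom = ref["coord"]
    t_left, t_top, t_right, t_bottom = target["coord"]
    # Positional shift: the top-left corner moved by more than the tolerance.
    if abs(r_left - t_left) > tolerance or abs(r_top - t_top) > tolerance:
        issues.append("positional shift")
    # Size difference: width or height changed by more than the tolerance.
    ref_size = (r_right - r_left, r_bottom - r_top)
    target_size = (t_right - t_left, t_bottom - t_top)
    if any(abs(a - b) > tolerance for a, b in zip(ref_size, target_size)):
        issues.append("size difference")
    return issues


reference = {"visible": True, "coord": (10, 10, 110, 60)}
shifted = {"visible": True, "coord": (10, 30, 110, 80)}   # moved down 20px, same size
widened = {"visible": True, "coord": (10, 10, 150, 60)}   # same corner, 40px wider
hidden = {"visible": False, "coord": (10, 10, 110, 60)}
```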




[Figure: Reference Browser screenshot next to Target Browser screenshot]
   RQ1: Can our technique identify cross-browser
    issues in web applications?

   RQ2: Can our technique identify such issues
    without generating too many false positives?
Test Subjects
       Subject Name   URL                                 Type
       GATECH         http://www.gatech.edu               University
       BECKER         http://www.beckerelectric.com       Company
       CHESNUT        http://www.chestnutridgecabin.com   Lodge
       CRSTI          http://www.crsti.org                Hospital
       DUICTRL        http://www.duicentral.com           Lawyer
       JTWED          http://www.jtweddings.com           Photography
       ORTHO          http://www.otorohanga.co.nz         Informational
       PROTOOLS       http://www.protoolsexpress.com      Company
       SPEED          http://www.speedsound.com           E-Commerce


For each page P and browser B considered
    1. Load P in B and in the reference browser
    2. Compare the page in the two browsers using our technique
    3. Store the produced reports
    4. Manually check the reports for false positives and false negatives
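Steps 1 to 3 of the procedure above amount to a nested loop over pages and browsers. The sketch below shows that loop only. The `load` and `compare` callables stand in for the data collection and comparison steps and are hypothetical, as are the `Snapshot` and `Report` types. Step 4 is manual and has no counterpart in code.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Snapshot = Dict[str, str]  # hypothetical: whatever is captured from one browser
Report = List[str]         # hypothetical: one entry per reported issue


def run_study(
    pages: Iterable[str],
    browsers: Iterable[str],
    reference_browser: str,
    load: Callable[[str, str], Snapshot],
    compare: Callable[[Snapshot, Snapshot], Report],
) -> Dict[Tuple[str, str], Report]:
    """Produce one report per (page, browser) pair, keyed by that pair."""
    browser_list = list(browsers)  # materialize so it can be iterated once per page
    reports: Dict[Tuple[str, str], Report] = {}
    for page in pages:
        reference = load(page, reference_browser)       # step 1, reference side
        for browser in browser_list:
            if browser == reference_browser:
                continue
            target = load(page, browser)                # step 1, browser B
            # Steps 2 and 3: compare, then store the produced report.
            reports[(page, browser)] = compare(reference, target)
    return reports
```

Passing `load` and `compare` in as parameters keeps the loop independent of any particular browser automation tool.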
                  # Issues Reported                     False       False
Subject                                     Total       Positives   Negatives
GATECH      2    3         0        1       6           0 (0%)          0
BECKER      2    12        0        3       17          1 (6.25%)       0
CHESNUT     8    4         0        4       16          2 (14.3%)       0
CRSTI       4    4         0        2       9           0 (0%)          0
DUICTRL     9    8         0        6       23          4 (21%)         0
JTWED       3    9         0        1       14          0 (0%)          0
ORTHO       0    0         0        4       4           2 (100%)        0
PROTOOLS    4    5         0       11       20          9 (81%)         0
SPEED       23   5         0        5       33          3 (10%)         0
TOTAL       55   50        0       37       142         21 (17%)        0
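The percentages in the False Positives column are not fractions of the total number of reported issues. Checking the arithmetic, they appear consistent with false positives divided by the remaining reported issues (total minus false positives), for example 1 / (17 - 1) = 6.25% for BECKER. This reading is inferred from the numbers, not stated in the slides. A short sketch of that computation, with a hypothetical function name:

```python
def false_positive_ratio(total_reported: int, false_positives: int) -> float:
    """False positives as a percentage of the remaining (non-false-positive) reports."""
    remaining = total_reported - false_positives
    if remaining <= 0:
        raise ValueError("ratio is undefined when no reported issue remains")
    return 100.0 * false_positives / remaining


# (total reported, false positives) per subject, copied from the table.
rows = {
    "GATECH": (6, 0), "BECKER": (17, 1), "CHESNUT": (16, 2),
    "CRSTI": (9, 0), "DUICTRL": (23, 4), "JTWED": (14, 0),
    "ORTHO": (4, 2), "PROTOOLS": (20, 9), "SPEED": (33, 3),
    "TOTAL": (142, 21),
}

for subject, (total, fp) in rows.items():
    print(f"{subject:9s} {fp:2d} / {total - fp:3d} = {false_positive_ratio(total, fp):.1f}%")
```

Under this reading the PROTOOLS figure comes out near 81.8%, which the slide shows as 81%.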
   Industrial Tools
     Adobe BrowserLab & MS Expression Web
      ▪ Require manual inspection
     Browsera (launched Summer 2010)
      ▪ Simple DOM matching (from experience using the tool)

   Research Tools
     Eaton & Memon [IJWET07]
      ▪ Requires manual classification; limited to HTML tags only
     Tamm [GTAC09]
      ▪ Expensive and focused on layout of text elements
Summary





Automated Identification of Cross-browser Issues in Web Applications
