Web Analytics
Driving Enterprise growth through web analytics




Abstract
The success of an enterprise largely depends on its ability to effectively monitor
and control its key performance indicators (KPIs) and to take appropriate actions
to improve them. The web has revolutionized the way business is conducted
and has become a paramount channel for driving growth. Organizations
across the globe are capitalizing on the power of the web through various means
such as advertising, order management and delivery of products and services. To
measure the performance of web-based initiatives, organizations
accumulate huge volumes of data. Carefully processed, this data can yield
useful insights that benefit the organization.
Web analytics refers to data analysis for determining the effectiveness of each
web-based initiative undertaken by the organization. The purpose of this
exercise is to make these initiatives more effective.
This article illustrates the importance of web analytics, the various tools and
techniques used for the purpose, and the key players in the web analytics
space.




                                                                      Oct 2010
Executive Summary
Enterprises implement several web-based techniques for increasing revenue, driving growth and staying ahead of the
competition. Common techniques include launching websites and blogs, multi-channel advertising, targeted
campaigns, referrals through partners and existing customers, paid search management, and sales through third-party sites.
Web analytics plays a vital role in monitoring the performance of these web-based efforts. Such analysis can enable an
organization to recognize growth opportunities and turn them into competitive advantage.
Web analytics is the technique of collecting, transforming and analyzing user activity on an organization’s website. It
also includes reporting on that activity to help the organization understand how the website is used and plan for optimization.
The enterprise needs to implement a sound web analytics system that provides precise metrics, which can be used both to
adjust current business strategies and to support predictive analysis for planning future strategies.

Business Benefits of Implementing Web Analytics
Web analytics provides a single, consolidated, accurate view of end-to-end usage data that can be used in decision making.
The key benefits of web analytics are listed below.
   1.	 Enables cost-benefit analysis of each web-based initiative and helps the organization plan its future course of action.
       The organization also gets a view of the Return on Investment (ROI) for each initiative. Thus, it can focus on the
       initiatives that are positioned to give the maximum return.
   2.	 Identifies areas of improvement for each initiative and highlights the ones that should be considered for refinement.
   3.	 Helps in analyzing the navigation patterns of users. Thus, it can form the basis of search engine optimization
       (SEO).
   4.	 Indicates the geographies that attract the most visitors. This can help in planning strategies for sales and delivery as well as
       for activities like server deployment across geographic locations.
   5.	 Enables enterprises to target new regions, markets, and product categories based on the analysis.
   6.	 Indicates the times of day and the seasons when users are most active. This helps in planning operations and estimating
       the support required for smooth business.
   7.	 Provides sophisticated reporting for improved decision making.
   8.	 Gathers, profiles, and integrates data from disparate web applications, thereby reducing data quality issues and
       exceptions. This, in turn, increases trust in the data.
   9.	 Enables on-demand extraction of complex operational reports.

Evolution of Web Analytics Strategy
From being a tool used only for collecting exception data, web analytics has evolved to include features like analysis of
user behavior, advanced reporting, real-time analysis and real-time decision making. The following illustration depicts the history
of and trends in web analytics.




2 | Infosys – View Point
[Figure: stages plotted against time (1990, 1995, 2000, > 2005), with analysis maturity / complexity / cost involved rising from one stage to the next]
   •	 Web Analysis (1990): What happened? Track exceptions and performance of websites.
   •	 Web Tracking (1995): COTS products (Omniture, Web Trends, Accrue); analyzing user behavior and site traffic; installed or onsite analytics.
   •	 Web Analytics (2000): Web server logs and JavaScript logs; e-commerce and commercial analysis; hosted or offsite analytics.
   •	 Real-Time Web Analytics (> 2005): What’s happening now? Campaign and content analysis.

Figure 1: Evolution of Web Analytics

Visitor data is collected from online systems primarily by scanning the log files of the web servers and through page
tagging.
Log file analysis: In a web-based application, each user interaction that involves an exchange of information between the application and
the server can be recorded in a log file stored on the server. For systems with a large number of
users, this logging generates an enormous amount of data. The level of logging, i.e. the amount of detail, can be controlled by the
application. The log file can also contain client-side information like page tags, web beacons, exception logs, and event logs. Periodically, these
log files are collected, cleansed and transformed, and the resulting data is loaded into the analytical database for further processing.
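As an illustration of the collection step, the following minimal sketch parses a single web server log line into named fields. It assumes the server writes the NCSA Common Log Format, which the text above does not specify; the sample line, the function name `parse_log_line` and the field names are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout (NCSA Common Log Format):
# host ident user [timestamp] "METHOD path PROTOCOL" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of typed fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # malformed lines are skipped rather than guessed at
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A "-" size means the server sent no body.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


# Hypothetical sample line, for illustration only.
sample = (
    '192.0.2.10 - alice [10/Oct/2010:13:55:36 +0000] '
    '"GET /products/index.html HTTP/1.1" 200 2326'
)
record = parse_log_line(sample)
print(record["host"], record["path"], record["status"])
```

Returning `None` for lines that do not match keeps the cleansing decision (drop, repair or quarantine) with the caller, which is where the cleanse and transform steps described above would apply.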
Page Tagging: Log files may contain some inaccurate data due to caching and proxy setups. To overcome this issue, page tags are
embedded in each web page. When a visitor accesses a tagged web page, the visitor’s activity is communicated to a data collection
server. This technique is called page tagging.
Processing for page tagging is simpler and faster than for server-side logging, but this technique cannot provide
network-related status and statistics.
Unique visitor identification is generally done using cookies or IP addresses. This technique, though widely used, is error
prone: the same user may browse the site from different browsers, locations or devices (e.g. from office and home) and be
counted more than once, while several different users who share one computer may be counted as a single visitor.
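The two failure modes can be made concrete with a small sketch. The records, cookie values and IP addresses below are hypothetical and serve only to show how a count of distinct cookies or IP addresses can diverge from the real number of people.

```python
def count_unique(views, field):
    """Count distinct values of one identifier across a list of page views."""
    return len({view[field] for view in views})


# Scenario 1: ONE person browsing from office and home.
one_person_two_devices = [
    {"cookie": "cookie-A", "ip": "203.0.113.5"},   # office browser
    {"cookie": "cookie-B", "ip": "198.51.100.7"},  # home browser
]

# Scenario 2: TWO people sharing the same computer and browser.
two_people_one_computer = [
    {"cookie": "cookie-C", "ip": "192.0.2.44"},
    {"cookie": "cookie-C", "ip": "192.0.2.44"},
]

overcount_by_cookie = count_unique(one_person_two_devices, "cookie")
overcount_by_ip = count_unique(one_person_two_devices, "ip")
undercount_by_cookie = count_unique(two_people_one_computer, "cookie")
undercount_by_ip = count_unique(two_people_one_computer, "ip")

print("one person counted as:", overcount_by_cookie, "(cookie),", overcount_by_ip, "(IP)")
print("two people counted as:", undercount_by_cookie, "(cookie),", undercount_by_ip, "(IP)")
```

In the first scenario both identifiers report two visitors for one person; in the second both report one visitor for two people, so neither identifier corrects the other.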
A brief comparison between these two techniques is illustrated below:




To capture network-related data, web traffic analyzers introduced the concept of network data collection. Packet
sniffers are placed on the web server or on hardware such as a hub, proxy server or switch to collect the data required for analytical
processing. These sniffers can provide information such as server response time and network-related issues, which
affect visitor satisfaction. The limitations of this technique are data loss and added server load; hence, extra effort is
required from the IT team to resolve these issues.
To overcome the data inaccuracies of log file analysis and the limitations of page tagging with respect to search engines,
enterprises generally combine techniques. The following combinations are predominantly used:
   •	 Combination of Logging and Page Tagging is the technique most commonly used by web analyzers. This
      hybrid technique helps in analyzing user behavior and in obtaining data such as status, usage patterns and pages
      cached. It also provides granular information like the IP address and domain name of the requestor, browser
      type and version, sign-in name, request date and time, request status, requested content, parameters of dynamic
      requests, response time, and cookie parameters with associated details.
   •	 Combination of Network Data Collection and Page Tagging provides rich data for analysis from both the user and the network
      perspective. A packet sniffer must reside on either a web server or network hardware, which requires considerable
      maintenance effort. Page tagging is being adopted widely because data collection can be outsourced to a third party, who
      collects, cleanses and transforms the data as required by the website owner. This combination is the best solution for
      website traffic analysis.
Combining two or more data collection techniques is desirable but complex. It is recommended that the enterprise pay particular attention
to automation and ensure that the chosen technique reaches the right target.

Approach to Website Analysis
Website analysis generally starts at the time of website launch as a one-time activity. Organizations usually outsource this
analysis to service providers that specialize in this area and deliver it as software as a service (SaaS). These service providers analyze the data and provide
the requisite reports to the organization. This is known as the Hosted approach.
In some cases, the organization itself deploys and manages analysis software for a specific website; this is known as the Installed
approach. Most organizations use the Hosted approach.


Under either approach, the purpose of the analysis can be classified by the type of site:
   •	 E-Commerce Management to monitor
       •	 Conversion rates of orders
       •	 Revenue rate of visitor
       •	 Click path analysis
       •	 Popular page analysis
       •	 Entry page analysis
       •	 Exit page analysis
       •	 Keyword analysis (Paid / Non-Paid Keywords)
       •	 Unique Visitor / Return visitor frequency analysis
       •	 Analysis of visit elapsed duration per session
   •	 Campaign Management Sites
       •	 Campaign tracking & analysis
       •	 Conversion tracking & analysis
       •	 Visitor path tracking & analysis
       •	 Referring channel analysis
       •	 Visitor demographics analysis
       •	 Visitor segmentation analysis
       •	 Click stream conversion rate analysis
       •	 Funnel analysis to track the visitor behavior
   •	 Content Management Sites
       •	 Referring Keyword / Pages
       •	 Search success analysis
       •	 Popular content analysis
       •	 Content download volume / traffic analysis
       •	 Analysis on Content download stats
   •	 Support & Service Management Sites
       •	 Stats average
       •	 Summary stats
       •	 Analysis on Visitor’s geographical locations
       •	 Browser Stats
       •	 Operating System Stats
       •	 User Access Management
       •	 Bounce rate analysis
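Several of the analyses listed above reduce to simple operations over per-session page sequences. The sketch below derives entry pages, exit pages and bounce rate from hypothetical session data; it assumes a bounce is a session with exactly one page view, a common definition that this paper does not itself state.

```python
from collections import Counter

# Hypothetical sessions: session id -> pages in the order they were viewed.
sessions = {
    "s1": ["/home", "/products", "/checkout"],
    "s2": ["/home"],
    "s3": ["/blog/post-1", "/home", "/products"],
    "s4": ["/products"],
}

# Entry page analysis: the first page of each session.
entry_pages = Counter(pages[0] for pages in sessions.values())

# Exit page analysis: the last page of each session.
exit_pages = Counter(pages[-1] for pages in sessions.values())

# Bounce rate (assumed definition): share of sessions with a single page view.
bounces = sum(1 for pages in sessions.values() if len(pages) == 1)
bounce_rate = bounces / len(sessions)

print("entry pages:", entry_pages.most_common())
print("exit pages:", exit_pages.most_common())
print("bounce rate:", bounce_rate)
```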
The above metrics can be broadly classified as counts, ratios, or key performance indicators (KPIs).
   •	 A ratio is a value calculated by dividing a count by another count or by another ratio (e.g. Average Time per Visit).
   •	 A count is not derived from other measures. It is the total number of an attribute or action (e.g. Total Number of Unique
      Visitors).
   •	 A KPI is associated with business targets (e.g. successful conversion rate) and is monitored by management in real
      time to support timely decisions.
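The three categories can be illustrated with a short sketch over hypothetical visit records. The field names, the sample values and the target of 0.4 are assumptions made for illustration, not figures from this paper.

```python
# Hypothetical visit records; field names are illustrative.
visits = [
    {"visitor": "v1", "seconds": 120, "ordered": True},
    {"visitor": "v2", "seconds": 30, "ordered": False},
    {"visitor": "v1", "seconds": 90, "ordered": False},
    {"visitor": "v3", "seconds": 240, "ordered": True},
]

# Counts: totals that are not derived from other measures.
total_visits = len(visits)
unique_visitors = len({visit["visitor"] for visit in visits})

# Ratio: one count divided by another (total time on site / number of visits).
average_seconds_per_visit = sum(visit["seconds"] for visit in visits) / total_visits

# KPI: a ratio compared against a business target (the target is hypothetical).
CONVERSION_TARGET = 0.4
conversion_rate = sum(1 for visit in visits if visit["ordered"]) / total_visits
kpi_met = conversion_rate >= CONVERSION_TARGET

print(total_visits, unique_visitors, average_seconds_per_visit, conversion_rate, kpi_met)
```

Keeping the target separate from the ratio reflects the distinction drawn above: the conversion rate is only a ratio until it is tied to a business target that management monitors.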
The metrics can be presented to analysts at three levels: Unit (e.g. the path of a visitor at a given date and time), Aggregate (e.g. total number of
visitors), and Segregate (e.g. number of visitors by location). Figure 2 illustrates the major steps of the web analysis life
cycle.

[Figure: four steps in sequence]
   1.	 Define Web Analytics Metrics
   2.	 Collect Logs, Page Tags, & Network Data
   3.	 Cleanse, Transform, & Aggregate
   4.	 Reporting & Analysis
Outputs listed in the figure: Page Visit / Session, Unique Visitors, Referral Channel, Entry / Exit Pages, Popular Contents,
Click-Through Rate, Bounce Rates.

Figure 2: Step by Step Approach to Web Analytics

The process of building web analytics comprises the following phases:
   •	 Gather business requirements.
   •	 Define the factors required to derive the metrics.
   •	 Identify the required attributes from the log files, and place appropriate page tags for monitoring visitor navigation.
   •	 Capture network details from the hardware.
The attributes gathered from log files or page tagging are unstructured and therefore unsuitable for direct analysis.
This data should be cleansed, de-duplicated (if required), aggregated, standardized and loaded into the data models. The
resulting dimensional data can then be presented for analysis with the help of a reporting tool or published in the enterprise portal for
management decisions.
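The cleanse, de-duplicate and aggregate steps can be sketched as a small pipeline. The event fields, the normalization rule for page paths and the choice to treat an identical visitor, page and date as a duplicate are simplifying assumptions for illustration; a real system would define these rules from its business requirements.

```python
from collections import defaultdict

# Hypothetical raw events as they might arrive from logs or page tags.
raw_events = [
    {"visitor": "v1", "page": "/Home/", "date": "2010-10-01"},
    {"visitor": "v1", "page": "/Home/", "date": "2010-10-01"},  # exact duplicate
    {"visitor": "v2", "page": "/home", "date": "2010-10-01"},
    {"visitor": "", "page": "/home", "date": "2010-10-01"},      # missing visitor id
    {"visitor": "v3", "page": "/products", "date": "2010-10-02"},
]


def cleanse(event):
    """Drop unusable events and standardize the page path."""
    if not event["visitor"]:
        return None
    # Standardize: lower-case, no trailing slash (the site root stays "/").
    page = event["page"].lower().rstrip("/") or "/"
    return {**event, "page": page}


def load_daily_page_views(events):
    """Cleanse, de-duplicate and aggregate events into views per (date, page)."""
    seen = set()
    totals = defaultdict(int)
    for event in events:
        clean = cleanse(event)
        if clean is None:
            continue
        key = (clean["visitor"], clean["page"], clean["date"])
        if key in seen:  # simplification: same visitor/page/date is a duplicate
            continue
        seen.add(key)
        totals[(clean["date"], clean["page"])] += 1
    return dict(totals)


daily_page_views = load_daily_page_views(raw_events)
print(daily_page_views)
```

The resulting table keyed by date and page is a simple stand-in for the dimensional data that a reporting tool would query.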

Web Analytics Architecture
A typical web analytics system comprises several parts. Figure 3 illustrates the architecture of a typical
real-time web analytics system. The flow through it is as follows:
   •	 The website user accesses the web page through a browser and the server returns the web page to the user.



[Figure: tiers from left to right: Web Browsers, Application Tier, Analytical Tier, Decision Tier. Components shown: Web Server,
Script Tagging & Logging, Message Queue, Analytic Server, Transform, ESB, Web Analytics Datamart, Analytical Layer,
Reporting & Analysis, Decision Maker.]

Figure 3: Architecture of Real-Time Web Analytics


   •	 As the web page loads, the activity is logged on the web server (for log file analysis) or, in the case of page tagging, a separate
      request is sent to the analytics server.
   •	 Once the analytics server receives the request, it returns a small (1x1 pixel) image to the browser along
      with a cookie.
   •	 The request logs are processed by the analytics software, then cleansed, transformed and loaded into the datamart.
   •	 Reports are generated from the datamart and used for decision making.
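The analytics server's reply to a page tag request can be sketched as a plain function that builds the response. This is a minimal illustration, not the architecture's actual implementation: the cookie name `visitor_id`, the function names and the choice of a random identifier are assumptions.

```python
import base64
import uuid

# A 1x1 transparent GIF, decoded once at import time.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)
COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def read_visitor_id(cookie_header):
    """Extract the visitor id from a raw Cookie header, or return None."""
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if name == COOKIE_NAME and value:
            return value
    return None


def build_beacon_response(cookie_header=""):
    """Return (visitor_id, headers, body) for one page tag request."""
    visitor_id = read_visitor_id(cookie_header)
    headers = [
        ("Content-Type", "image/gif"),
        ("Content-Length", str(len(PIXEL_GIF))),
        ("Cache-Control", "no-store"),  # ask caches not to serve the pixel themselves
    ]
    if visitor_id is None:
        # First visit: issue a new identifier in a cookie (assumed scheme).
        visitor_id = uuid.uuid4().hex
        headers.append(("Set-Cookie", f"{COOKIE_NAME}={visitor_id}; Path=/"))
    return visitor_id, headers, PIXEL_GIF


new_id, new_headers, body = build_beacon_response("")
known_id, known_headers, _ = build_beacon_response("theme=dark; visitor_id=abc123")
print(new_id, known_id, len(body))
```

A returning browser sends the cookie back, so the same identifier is reused and no new cookie is set; the request itself would then be written to the request log that the analytics software processes.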

Conclusion
Today, the digital consumer experience is the key driving force behind business decisions, including enterprise-level
competition to provide next-generation services to customers. It has become imperative for enterprises to
understand their customers and their varying behavioral patterns on a regular basis and to react quickly to meet their expectations.
In this context, web analytics and prediction play an important role in helping enterprises convert new
visitors into customers, delight existing customers and thereby increase revenue and profitability. Infosys
has rich experience in providing customized web analytics and prediction solutions for different industry segments using
third-party products, home-grown tools, systems and processes.

About the Author
Balasubramanian M.
Balasubramanian M. is a Delivery Manager in the Communication, Media and Entertainment unit at Infosys. With more than 15
years of experience in this industry, he has played multiple roles for top telecom and media clients across the globe. His areas
of interest include staying in sync with the latest trends in the Cable & Media industry, as well as leading teams in conceptualizing
and building solutions for the next-generation digital experience using technology trends like RIA, Web 2.0,
Interactive TV and Converged experience. He can be reached at M_balasubramanian@Infosys.com.
Sivaprakasam S.R.
Sivaprakasam S.R. is a Principal Technology Architect and mentors the Database and Business Intelligence track in the
Infosys’ Communication, Media, and Entertainment business unit. His interests include Enterprise Data Modeling, Enterprise
Data Integration, Enterprise Data Warehousing, Enterprise Data Quality Management, Enterprise Data Governance,
Enterprise Performance Management and Semantic Data Integration. He can be reached at sivaprakasam_s@infosys.com.
Sudhir Kakkar.
Sudhir Kakkar is a Senior Project Manager and mentors the portal applications and web analytics track in Infosys’
Communication, Media, and Entertainment business unit. His interests include Web 2.0, Rich Internet Applications (RIA),
Adobe Flex, Unified Communications, Presence and Messaging, Performance Engineering and Portal development. He can be
reached at sudhir_kakkar@infosys.com.
  • 1. Web Analytics Driving Enterprise growth through web analytics Abstract Success of an enterprise largely depends on its ability to effectively monitor and control its key performance indicators (KPI) and take appropriate actions to improve them. The web has revolutionized the way business is conducted and has become the paramount channel for driving growth. Organizations across the globe are capitalizing on the power of web through various means like advertising, order management and delivery of products/ services. In order to measure the performance of web based initiatives, the organizations accumulate huge volumes of data. This data, on careful processing, can give useful insights and thus, prove to be beneficial to the organization. Web analytics refers to data analysis for determining the effectiveness of each web based initiative undertaken by the organization. The purpose of this exercise is to make these initiatives more effective. This article illustrates the importance of web analytics, the various tools and techniques used for the purpose, and the key players in the web analytics space. Oct 2010
  • 2. Executive Summary Enterprises implement several web based techniques for increasing revenue, driving growth and staying ahead of competition. Some of the common techniques include launching websites, blogs, multi-channel advertising, targeted campaigns, and referrals through partners / existing customers, paid search management, third party sites sales etc. Web analytics play a vital role in monitoring the performance of these web based efforts. Such analysis can enable an organization in recognizing growth opportunities and turning them into competitive advantage. Web analytics is the technique of collection, transformation and analyzing the user activities on an organization’s web site. It also includes reporting user’s activities for helping the organization understand website’s usage and planning for optimization. The enterprise needs to implement a sound web analytics system that can provide them precise metrics which can be used to make changes to their current business strategies as well as for predictive analysis for planning their future strategies. Business Benefits of Implementing Web Analytics Web Analytics provides a single, consolidated, accurate view of end-to-end usage data which can be used in decision making. The key benefits of web analytics are listed below. 1. Enables cost-benefit analysis of each web based initiative and helps the organization plan future course of action. The organization will also get a view of Return of Investment (ROI) for each initiative. Thus, they can focus on the initiatives that are in a position of giving maximum return. 2. Identifies areas of improvement for each initiative and highlights the ones which should be considered for refinement. 3. Helps in analyzing the navigation patterns of the users. Thus, it can form the basis of search engine optimization (SEO). 4. Indicates the geographies that attract most visitors. 
This can help in planning strategies for sales and delivery as well as activities such as server deployment across geographic locations.
5. Enables enterprises to target new regions, markets, and product categories based on the analysis.
6. Indicates the times of day and the seasons when users are most active. This helps in planning operations and estimating the support required for smooth business.
7. Provides sophisticated reporting for improved decision making.
8. Gathers, profiles, and integrates data from disparate web applications, thereby eliminating data quality issues and exceptions. This, in turn, increases trust in the data.
9. Enables on-demand extraction of complex operational reports.

Evolution of Web Analytics Strategy
From being a tool used only for exception data collection, web analytics has evolved to include features such as analysis of user behavior, advanced reporting, real-time analysis and real-time decision making. The following illustration depicts the history and trends in web analytics.
2 | Infosys – View Point
  • 3. [Figure 1: Evolution of Web Analytics. Vertical axis: Analysis Maturity / Complexity / Cost Involved. Horizontal axis: 1990, 1995, 2000, > 2005. Stages shown, from earliest to most mature:
• Web Analysis – What happened? – Track exceptions and performance of websites
• Web Tracking – COTS products (Omniture, Web Trends, Accrue) – Analyzing user behavior and site traffic – Installed or onsite analytics
• Web Analytics – Web server logs and JavaScript logs – E-commerce and commercial analysis – Hosted or offsite analytics
• Real-Time Web Analytics – What’s happening now? – Campaign and content analysis]

Visitor data is collected from online systems primarily by scanning the log files of the web servers and through page tagging.

Log file analysis: In a web-based application, each user interaction that involves an exchange of information between the application and the server can be recorded in a log file stored on the server. For systems with a large number of users, this logging generates an enormous amount of data. The level of logging, i.e. the amount of detail, can be controlled by the application. The log file contains client-side information such as page tags, web beacons, exception logs, and event logs. Periodically, these log files can be collected, cleansed and transformed, and the resulting data loaded into the analytical database for further processing.

Page Tagging: Log files may contain some inaccurate data due to caching and proxy setups. To overcome this issue, page tags are embedded in each web page. When a visitor accesses a tagged page, the visitor’s activity is communicated to a data collection server. This technique is called page tagging. Processing for page tagging is simpler and faster than for server-side logging, but the technique cannot provide network-related status and statistics.
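As a minimal sketch of the log file analysis described above, the Python snippet below parses web server log lines, drops malformed entries (a simple cleansing step) and counts successful page requests. The article does not name a log format; the Apache/NCSA "combined" format is assumed here, and the function names, sample lines and IP addresses are illustrative only.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: Apache/NCSA "combined" access log format (not stated in the article).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it is malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # cleansing: malformed lines are discarded
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


def page_counts(lines):
    """Count successful (HTTP 200) requests per requested path."""
    counts = Counter()
    for line in lines:
        record = parse_log_line(line)
        if record is not None and record["status"] == 200:
            counts[record["path"]] += 1
    return counts


# Hypothetical sample data, made up for illustration.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2010:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2010:13:56:02 +0000] "GET /products.html HTTP/1.1" 200 5120 "/index.html" "Mozilla/5.0"',
    '198.51.100.23 - - [10/Oct/2010:14:01:15 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Opera/9.80"',
    '198.51.100.23 - - [10/Oct/2010:14:01:40 +0000] "GET /missing.html HTTP/1.1" 404 512 "-" "Opera/9.80"',
    'corrupted entry',
]

if __name__ == "__main__":
    for path, hits in page_counts(SAMPLE_LOG).most_common():
        print(path, hits)
```

In a real deployment the parsed records would be loaded into the analytical database rather than counted in memory; the in-memory counter only keeps the sketch self-contained.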
Unique visitor identification is generally done using cookies or IP addresses. This technique, though widely used, is error-prone: the same user may browse the site with different browsers or from different locations or devices (e.g. from office and home), and several different users may browse the site from the same computer. A brief comparison between these two techniques is illustrated below:
  • 4. To meet network requirements, web traffic analyzers introduced the concept of network data collection. Packet sniffers are placed on the web server or on hardware such as a hub, proxy server or switch to collect the data required for analytical processing. These sniffers can provide information such as server response time and network-related issues, which affect visitor satisfaction. The limitations of this technique are data loss and server load; hence, extra effort is required from the IT side to resolve these issues.
To overcome the data inaccuracies of log file analysis and the search engine limitations of page tagging, enterprises generally use a combination of techniques. The following combinations are predominantly used:
• Logging combined with page tagging is the technique most commonly used by web analyzers. This hybrid technique helps in analyzing user behavior and in collecting data such as status, usage patterns and pages cached. It also provides other granular information such as the IP address and domain name of the requestor, browser type and version, sign-in name, request date and time, request status, requested contents, parameters of dynamic requests, response time, and cookie parameters and associated details.
• Network data collection combined with page tagging provides rich data for analysis from both the user and the network perspective. A packet sniffer must reside on either a web server or dedicated hardware, which requires considerable maintenance effort. Page tagging is being adopted widely because data collection can be outsourced to a third party, who collects, cleanses and transforms the data as required by the website owner. This combination is the best solution for website traffic analysis.
Combining data collection techniques is desirable but complex.
It is recommended that more attention be given to automation, and the enterprise needs to ensure that the technique reaches the right target.

Approach to Website Analysis
Website analysis generally starts at the time of website launch as a one-time activity. Organizations usually outsource this analysis to service providers (SaaS) who specialize in this area. These service providers analyze the data and provide the requisite reports to the organization. This is known as the Hosted approach. In some cases, the organization deploys and manages analysis software for a specific website, which is known as the Installed approach. Most organizations use the hosted approach.
  • 5. Under either approach, the purpose of the analysis can be classified by usage:
• E-Commerce Management, to monitor
  – Conversion rates of orders
  – Revenue rate of visitor
  – Click path analysis
  – Popular page analysis
  – Entry page analysis
  – Exit page analysis
  – Keyword analysis (paid / non-paid keywords)
  – Unique visitor / return visitor frequency analysis
  – Analysis of visit elapsed duration per session
• Campaign Management Sites
  – Campaign tracking & analysis
  – Conversion tracking & analysis
  – Visitor path tracking & analysis
  – Referring channel analysis
  – Visitor demographics analysis
  – Visitor segmentation analysis
  – Click stream conversion rate analysis
  – Funnel analysis to track visitor behavior
• Content Management Sites
  – Referring keywords / pages
  – Search success analysis
  – Popular content analysis
  – Content download volume / traffic analysis
  – Analysis of content download stats
• Support & Service Management Sites
  – Stats average
  – Summary stats
  – Analysis of visitors’ geographical locations
  – Browser stats
  – Operating system stats
  – User access management
  – Bounce rate analysis

The above metrics can be broadly classified into categories such as ratios, counts, and key performance indicators (KPIs).
• A ratio is a calculated value obtained by dividing a count by another count or by another ratio (e.g. Average Visit Time per Visit).
• A count is not derived from other measures. It is the total number of occurrences of an attribute or action (e.g. Total Number of Unique Visitors).
• A KPI is associated with business targets (e.g. successful conversion rates), which management monitors in real time to take appropriate decisions.
The metrics can be displayed to analysts as Unit (e.g. visitor path at a given date and time), Aggregate (e.g. total number of visitors), or Segregate (e.g. number of visitors by location).
The figure below illustrates the major steps of the web analysis life cycle.
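To make the distinction between counts, ratios and KPIs concrete, here is a small Python sketch over hand-made session records. The record fields, the sample values and the single-page definition of a bounce are assumptions made for illustration; they are not taken from the article or from any particular analytics product.

```python
from collections import Counter

# Hypothetical session records; field names are illustrative only.
SESSIONS = [
    {"visitor_id": "c1", "location": "IN", "pages_viewed": 1, "seconds": 20, "ordered": False},
    {"visitor_id": "c1", "location": "IN", "pages_viewed": 5, "seconds": 300, "ordered": True},
    {"visitor_id": "c2", "location": "US", "pages_viewed": 3, "seconds": 120, "ordered": False},
    {"visitor_id": "c3", "location": "US", "pages_viewed": 1, "seconds": 10, "ordered": False},
]


def unique_visitors(sessions):
    """Count: number of distinct visitor identifiers (aggregate view)."""
    return len({s["visitor_id"] for s in sessions})


def average_visit_time(sessions):
    """Ratio: total visit time divided by the number of visits (non-empty input assumed)."""
    return sum(s["seconds"] for s in sessions) / len(sessions)


def bounce_rate(sessions):
    """Ratio: share of visits that viewed a single page (assumed definition of a bounce)."""
    bounces = sum(1 for s in sessions if s["pages_viewed"] == 1)
    return bounces / len(sessions)


def conversion_rate(sessions):
    """KPI candidate: share of visits that ended in an order."""
    return sum(1 for s in sessions if s["ordered"]) / len(sessions)


def visitors_by_location(sessions):
    """Segregated view: unique visitors per location."""
    seen = {(s["visitor_id"], s["location"]) for s in sessions}
    return Counter(location for _, location in seen)


if __name__ == "__main__":
    print("Unique visitors:", unique_visitors(SESSIONS))
    print("Average visit time (s):", average_visit_time(SESSIONS))
    print("Bounce rate:", bounce_rate(SESSIONS))
    print("Conversion rate:", conversion_rate(SESSIONS))
    print("Visitors by location:", dict(visitors_by_location(SESSIONS)))
```

Whether a ratio such as the conversion rate is treated as a KPI depends on whether the business has attached a target to it; the calculation itself is the same.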
  • 6. [Figure 2: Step by Step Approach to Web Analytics. Steps: Define Web Analytics Metrics; Collect Logs, Page Tags, & Network Data; Cleanse, Transform, & Aggregate; Reporting & Analysis. Example metrics shown: Page Visit / Session, Unique Visitors, Referral Channel, Entry / Exit Pages, Popular Contents, Click-Through Rate, Bounce Rates]

The process of building web analytics comprises the following phases:
• Gather business requirements
• Define the factors required to derive the metrics
• Identify the required attributes from the log files by placing appropriate page tags for monitoring visitor navigation
• Capture network details from the hardware
The attributes gathered from log files or page tagging are unstructured and therefore unsuitable for direct analysis. This data should be cleansed, de-duplicated (if required), aggregated, standardized and loaded into the data models. The resulting dimensional data can then be presented for analysis with the help of a reporting tool or published in the enterprise portal for management decisions.

Web Analytics Architecture
A typical web analytics system comprises several parts. Figure 3 illustrates the architecture of a typical real-time web analytics system.
[Figure 3: Architecture of Real-Time Web Analytics. Tiers: Web Browsers, Application Tier, Analytical Tier, Decision Tier. Component labels in the diagram include Web Server, Script Tagging, Logging, ESB, Message Queue, Web Analytics Server, Transform, Analytic Datamart, Reporting & Analysis, and Decision Maker]
• The website user accesses the web page through a browser and the server returns the web page to the user.
  • 7. • As the web page loads, the activity is either logged on the web server (for log file analysis) or sent as a separate request to the analytics server (for page tagging).
• Once the analytics server receives the request, it returns a small (1x1 pixel) image to the browser along with a cookie.
• The request logs are processed by the analytics software: cleansed, transformed and loaded into the datamart.
• Reports are generated from the datamart and used for decision making.

Conclusion
Today the digital consumer experience is the key driving force behind business decisions, including enterprise-level competition to provide next-generation services to customers. It has become imperative for enterprises to understand customers and their varying behavioral patterns on a regular basis and to react quickly to meet their expectations. In this context, web analytics and prediction play an important role in helping enterprises convert new visitors into customers and delight existing customers, thereby increasing the revenue and profitability of the organization.
Infosys has rich experience in providing customized web analytics and prediction solutions for different industry segments using third-party products, home-grown tools, systems and processes.

About the Authors
Balasubramanian M.
Balasubramanian M is a Delivery Manager in the Communication, Media and Entertainment unit at Infosys. With more than 15 years of experience in this industry, he has played multiple roles for top telecom and media clients across the globe. His areas of interest include keeping in sync with the latest trends in the cable and media industry, as well as leading teams in conceptualizing and building solutions for next-generation digital experience using technology trends such as RIA, Web 2.0, interactive TV and converged experience. He can be reached at M_balasubramanian@Infosys.com

Sivaprakasam S.R.
Sivaprakasam S.R.
is a Principal Technology Architect and mentors the Database and Business Intelligence track in Infosys’ Communication, Media, and Entertainment business unit. His interests include Enterprise Data Modeling, Enterprise Data Integration, Enterprise Data Warehousing, Enterprise Data Quality Management, Enterprise Data Governance, Enterprise Performance Management and Semantic Data Integration. He can be reached at sivaprakasam_s@infosys.com.

Sudhir Kakkar
Sudhir Kakkar is a Senior Project Manager and mentors the portal applications and web analytics track in Infosys’ Communication, Media, and Entertainment business unit. His interests include Web 2.0, Rich Internet Applications (RIA), Adobe Flex, Unified Communications, Presence and Messaging, Performance Engineering and Portal development. He can be reached at sudhir_kakkar@infosys.com.