Web stats
The most important thing to know
There are numerous ways to statistically report
on web activity
And …
Web Statistics
They all suck
They suck
● Not really, but you need to be aware of some
limitations
● And know the chosen method’s limitations
● And be careful what you ask of your stats
● And maintain a healthy skepticism of results
Web stats
You never know exactly what they are
measuring
Measurement problems
● Spiders (aka ‘bots’) skew data heavily
● Caching, Ajax, and differing browsing
behavior (clicking a link vs. using back/forward) skew it as well
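One rough way to see how much spiders skew a count is to split hits by user-agent string. The sketch below is a heuristic only: the marker list is a hypothetical starting point, not a complete registry, and bots that disguise their user agent will slip through.

```python
# Hypothetical marker list (an assumption for illustration); real crawlers
# vary widely and some deliberately imitate ordinary browsers.
BOT_MARKERS = ("bot", "spider", "crawl", "slurp")


def looks_like_bot(user_agent: str) -> bool:
    """Return True if the user-agent string contains a common crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def split_hits(user_agents):
    """Split an iterable of user-agent strings into (humans, bots) lists."""
    humans, bots = [], []
    for ua in user_agents:
        (bots if looks_like_bot(ua) else humans).append(ua)
    return humans, bots
```

Comparing the size of the two lists for the same period gives a feel for the skew, though neither number should be treated as exact.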
Additional problems
● The web is a one-way medium
● Links go one way
● Ted Nelson (originator of Project Xanadu)
has been highly critical of this aspect of the
web
To sum up
● If you’re looking at one system over time
○ a healthy skepticism of the numbers is wise
● If you’re looking at two different systems and
comparing their numbers
○ don’t even bother
Overview
● Types of analytics
● What each type is good for
○ and their limitations
● Practical examples and tips
Types
● Web server access logs
● Javascript-based solutions
● Web bugs
Access logs
● Records actual interactions and requests
● Note that these requests are recorded in a
one-way fashion
● Can provide server-centric information
(server errors, complete user-agent info, IP)
as well
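To make the server-centric fields concrete, here is a minimal sketch that pulls them out of one log line. It assumes the widely used Combined Log Format; since access logs are configurable, a given server may write something different. The sample line and its addresses are made up.

```python
import re

# Assumes Combined Log Format; the referer and user-agent fields are optional
# so that Common Log Format lines also match.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Made-up sample line for illustration.
SAMPLE = (
    '203.0.113.7 - - [10/Oct/2013:13:55:36 -0700] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/start" "Mozilla/5.0 (X11; Linux x86_64)"'
)
entry = parse_line(SAMPLE)
```

Each parsed entry carries the IP, status code, and full user-agent string mentioned above.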
Access logs
● Far less likely to be blocked by:
○ User settings or legal mandate
○ Turning off Javascript
○ Disabling cookies (either first party or third
party)
● European countries increasingly hostile to
information collection by US companies
Access Logs
● Configurable
● Won’t go away
○ You own the data
● But these logs give you everything
● Aforementioned bots can really mess with
logs
○ So-called ‘spider-traps’ are an extreme example
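A spider caught in a trap tends to show up in the logs as one client requesting a very large number of distinct URLs. The sketch below flags such clients. The function name and the threshold are hypothetical and would need tuning per site, and a busy proxy or shared IP could trip it too.

```python
from collections import defaultdict


def flag_heavy_clients(hits, max_distinct_paths=1000):
    """Return sorted IPs that requested more distinct paths than the threshold.

    hits: iterable of (ip, path) pairs, e.g. taken from parsed log lines.
    max_distinct_paths: hypothetical cut-off, an assumption to tune per site.
    """
    paths_by_ip = defaultdict(set)
    for ip, path in hits:
        paths_by_ip[ip].add(path)
    return sorted(
        ip for ip, paths in paths_by_ip.items()
        if len(paths) > max_distinct_paths
    )
```

Flagged clients are candidates for exclusion from reports, not proof of a bot.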
Access Logs
● As much as I hate to say it, this stuff is
probably obsolete
● Wheelwright’s Shop
Javascript-based
● Uses a js file to record interactions
● Less reliable but can offer different info
about remote computer and, sometimes,
user
Differences between js-based
systems
● Not all js-based analytics are the same
● Hosted vs third party
● Open vs proprietary
Javascript-based
● Uses a js file to record interactions
● Less reliable but can offer different info
about remote computer and, sometimes, the
visitors.
Javascript-based
‘less reliable’ meaning:
● They rely on technological capabilities and
settings of the viewer’s browser
● Not all interactions are recorded
● But this is potentially a good thing
Javascript-based
● Spiders almost universally won’t be counted
by js-based systems
● Generally easy to set up
● Often have built-in integration with other
products (advertisement networks and
shopping carts.)
Javascript-based
Also:
Often no need to do anything to receive new
features and upgrades
Javascript-based
Hosted vs service
● Hosted:
○ The system is installed on a local machine and
collected data stays local
● Third party service:
○ Data collected and housed by them
Self-hosted JS
Pros:
● Easier to customize
● Generally easier to share (no need to sign
up for 3rd party service)
● If open source, you can tweak to your needs
● Less likely to be blocked
Third party
Pros:
● Typically easy to set up
● Built-in integrations with other products
● Generally no need to do anything to receive
upgrades/new features
Tips: Google
● Sometimes it’s difficult to find the screen with
the desired information
● Analytics support site is very helpful here
● Secondary dimensions are very cool
Google
● FYI: switching to a new type of GA account
○ so these labels might not be 100% correct in a week
Google
The basics
Audience: who the users are
Acquisition: Where they come from
Behavior: Where they go on the site
Google
Behavior > Site content > All pages
● List of most frequently visited pages
Audience > Technology > Browser & OS
● List of browsers used to access site
○ Also notice “Primary Dimension” options
Acquisition > Keyword > Organic
● Note the large percentage of “(not provided)”
Exporting from Google
● Under the page title there is an option to export the
currently viewed report
● Select format from dropdown
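Once a report is exported as CSV it can be processed outside the web interface. The sketch below lists the most viewed pages from such a file. The column names `Page` and `Pageviews`, the leading `#` comment lines, and the sample data are all assumptions for illustration; check them against the file actually exported.

```python
import csv
import io


def top_pages(csv_text, page_col="Page", views_col="Pageviews", limit=3):
    """Return up to `limit` (page, views) pairs sorted by views, highest first."""
    # Assumption: the export may start with blank lines and '#' comment lines.
    lines = [
        ln for ln in csv_text.splitlines()
        if ln.strip() and not ln.startswith("#")
    ]
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    counts = []
    for row in reader:
        # Numbers may be written with thousands separators, e.g. "1,204".
        views = int(row[views_col].replace(",", ""))
        counts.append((row[page_col], views))
    counts.sort(key=lambda pair: pair[1], reverse=True)
    return counts[:limit]


# Made-up export contents for illustration.
SAMPLE_EXPORT = """# hypothetical export header
Page,Pageviews
/,"1,204"
/about,310
/contact,95
"""
```

Calling `top_pages(SAMPLE_EXPORT, limit=2)` on the made-up data returns the two busiest pages with their counts as integers.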
More advanced stuff
● Custom variables
● Custom segments
● These are user-defined and can be quite
powerful
Piwik
● Similar to Google Analytics
● But self-hosted
● And Open Source
○ Can be extended
○ Has a development roadmap
Piwik
Some benefits of Piwik
● Far less likely to be blocked
○ By user settings or legal mandate (EU)
● Configurable
● Won’t go away
Piwik
Page Views:
● Actions > Pages
Site Traffic by hour
● Visitors > Times
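For comparison, a similar by-hour view can be roughed out from access log timestamps. This sketch assumes the usual access log timestamp layout (for example `10/Oct/2013:13:55:36 -0700`) and an English-language locale for month names. It counts raw hits rather than visits, so the numbers will not match a JavaScript-based report.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout, as found between the brackets of a log line.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def hits_by_hour(timestamps):
    """Count hits per hour of day, using the hour as written in each timestamp."""
    counts = Counter()
    for ts in timestamps:
        counts[datetime.strptime(ts, TIME_FORMAT).hour] += 1
    return dict(sorted(counts.items()))
```

The hour is taken in the offset recorded in the log, so logs from servers in different time zones should be normalized before they are combined.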
Piwik
Many other reports
● Generally laid out nicely and intuitively
labelled
