WHAT YOUR TEST TOOL DOESN’T TELL YOU
ME !
ME !
SENJA
SEGLA
SEGLA
EASY
ROUTE
TEST
TOOLS
1.
WHERE TO TEST
15% needed conversion rate uplift
1,000 conversions per month
TEST POWER DETERMINATION
o To determine what pages are
eligible to test on
o Can I run a test on this page
with a Power of >=80%?
“Statistical power is
the likelihood that
an experiment will
detect an effect
when there is an
effect there to be
detected”.
“The higher the
sample size
and/or the
bigger the effect
size the higher
the power”.
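The relation in these two quotes can be made concrete with a normal-approximation power calculation for a one-sided two-proportion test. This is only a sketch, not what any particular test tool does: the function names, the 5% significance level and the example traffic figures are assumptions.

```javascript
// Standard normal CDF via the Abramowitz-Stegun approximation of erf (7.1.26).
function normalCdf(z) {
  const t = 1 / (1 + 0.3275911 * Math.abs(z) / Math.SQRT2);
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
               t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-(z * z) / 2);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Critical value for a 5% one-sided significance level (assumed setting).
const Z_ALPHA = 1.6449;

// Power of a one-sided test (H1: B converts better than A) with
// `visitorsPerVariant` visitors in each group.
function testPower(baselineRate, relativeUplift, visitorsPerVariant) {
  const pA = baselineRate;
  const pB = baselineRate * (1 + relativeUplift);
  const standardError = Math.sqrt(
    (pA * (1 - pA) + pB * (1 - pB)) / visitorsPerVariant
  );
  return normalCdf((pB - pA) / standardError - Z_ALPHA);
}

// Hypothetical page: 5% baseline conversion rate, 15% relative uplift.
console.log(testPower(0.05, 0.15, 5000).toFixed(2));
console.log(testPower(0.05, 0.15, 20000).toFixed(2));
```

A page would count as eligible when the returned value is at least 0.8 for the uplift you realistically expect.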
TEST POWER DETERMINATION
o Determine unique weekly visitors
and conversions per page type
o Determine test duration versus uplift
The smaller the uplift you
want to recognize, the larger
the sample size required to
reach statistical significance.
https://goo.gl/FZBqor
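The trade-off between uplift and test duration can be sketched with the usual normal-approximation sample size formula. The settings (5% one-sided significance, 80% power), the function names and the traffic numbers are assumptions for illustration, not the calculation behind the linked sheet.

```javascript
// Critical values of the standard normal distribution (assumed settings:
// 5% one-sided significance level, 80% power).
const Z_ALPHA = 1.6449;
const Z_POWER = 0.8416;

// Visitors needed per variant to detect `relativeUplift` on `baselineRate`.
function requiredVisitorsPerVariant(baselineRate, relativeUplift) {
  const pA = baselineRate;
  const pB = baselineRate * (1 + relativeUplift);
  const variance = pA * (1 - pA) + pB * (1 - pB);
  return Math.ceil(((Z_ALPHA + Z_POWER) ** 2 * variance) / (pB - pA) ** 2);
}

// Full weeks needed when `weeklyVisitors` are split evenly over A and B.
function requiredWeeks(baselineRate, relativeUplift, weeklyVisitors) {
  const perVariant = requiredVisitorsPerVariant(baselineRate, relativeUplift);
  return Math.ceil((2 * perVariant) / weeklyVisitors);
}

// Hypothetical page: 5% baseline rate, 10,000 unique visitors per week.
for (const uplift of [0.05, 0.1, 0.15, 0.2]) {
  const weeks = requiredWeeks(0.05, uplift, 10000);
  console.log(`${Math.round(uplift * 100)}% uplift: ${weeks} weeks`);
}
```

Running this per page type shows which pages can reach the required sample within a practical number of weeks.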
2.
HOW TO CODE
HOW TO CODE - WYSIWYG
$("<style>.commercial-title { position: absolute;top: 40%;left:0; color: white; text-transform: uppercase; font-size: 2.5em; line-height: 1em; text-align: center; width: 100%; text-shadow: 1px 1px #777; } .commercial-title small { display: block; } .commercial-title .fa-
stack { font-size: 0.5em; } </style><div class='info-section' style='margin:0; padding: 40px 0.9em 0 0.9em;'><h3 style='font-weight: 200; font-size: 2em;'>Gezien op TV</h3></div><div class='row blocks-wrap' style='padding:0;margin:0;'><div class='columns small-
12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=BKK'><img src='//media.tix.nl/home_ BKKv2.jpg'/><div class='commercial-title'>Jakarta<small><div style=text-transform:lowercase>v.a. &euro; 444,-
*</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=LON'><img src='//media.tix.nl/home_ LONv2.jpg'/><div class='commercial-title'>Lissabon<small><div
style=text-transform:lowercase>v.a. &euro; 69,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=MRU'><img src='//media.tix.nl/home_ MRUv2.jpg'/><div
class='commercial-title'>Johannesburg<small><div style=text-transform:lowercase>v.a. &euro; 461,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a
href='https://tix.nl/inspiratie?dest=ROM'><img src='//media.tix.nl/home_ ROMv2.jpg'/><div class='commercial-title'>Barcelona<small><div style=text-transform:lowercase>v.a. &euro; 58,- *</small></div></div></a></div></div><div class='columns small-12
medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=CUR'><img src='//media.tix.nl/home_ CURv2.jpg'/><div class='commercial-title'>Ibiza<small><div style=text-transform:lowercase>v.a. &euro; 69,-
*</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=HKG'><img src='//media.tix.nl/home_ HKGv2.jpg'/><div class='commercial-title'>Bali<small><div style=text-
transform:lowercase>v.a. &euro; 455,- *</small></div></div></a></div></div><div class='columns small-12'><div class='promo-block' style='background-color:white;color:black;font-size:0.9em;'>* Getoonde prijzen zijn exclusief €25
boekingskosten.</div></div></div>").insertAfter($(".blocks-wrap .row"));
▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/99786d9839554ef9b83715e5affd744e.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=JKT"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/d772a16393194d61a85cc97ec6c6d77b.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/ce0b66adf1fd48b1a9434d5687bc7a74.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(3) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/db85accd68ad4335a0cd4b63c61a33d6.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/79ff8c46da7e4a149f27d53a8c9b6e3b.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=LIS"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=DPS"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(3) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BCN"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=CMB"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/2691a454b6004c488e2d56272b5d8c18.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=LON"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/e37f74b693e745f89a57b43ccac6d970.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=ROM"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/3d44b80cfeda4e27b8fb68a1ddc231d6.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BKK"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(4) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/819da47209d5477fb18bf7a60ccbe774.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/f4e923cb5d924c268cc09dd668a3f1d9.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(4) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=DPS"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=JKT"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/177f2e210ca54ead946cdc65c2cbc49b.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/ce2bccae9ec44ab28100f964080d2328.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=JKT"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=PAR"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/dd9b5bd118584e1089248b12f6147f13.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/d1b15fc837ee4d2181ba5f07523fcc76.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(3) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/e7f72d4047f94638a286d72cd70664c7.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(4) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/834ed7638bef4ce286ff098177a9db9b.jpg"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=LIS"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=JNB"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(3) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=TYO"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(4) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=IBZ"});
▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=HAN"});
▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(5) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"http://media.tix.nl/home_ DPS.jpg"});
▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(3) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"http://media.tix.nl/home_ BAR.jpg"});
▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(3) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BKK"});
▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BKK"});
▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=DPS"});
▸$("#content-slider-wrap > a > img:eq(0)").replaceWith("<img src='//media.tix.nl/ingangnaarkaart1.png' alt=''>");
▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(3) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/cc7d8af46e054ab2ae5a1bfa38c411c7.jpg"});
▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(3) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BCN"});
HOW TO CODE - WYSIWYG
o Slows down your site
o Unreliable in different browsers
o Hard to QA
o Limited in what you can test
HOW TO CODE – Javascript / jQuery
$("<style>.commercial-title { position: absolute;top: 40%;left:0; color: white; text-transform: uppercase; font-size: 2.5em; line-height: 1em; text-align: center; width: 100%; text-shadow: 1px 1px #777; } .commercial-title small { display: block; } .commercial-
title .fa-stack { font-size: 0.5em; } </style><div class='info-section' style='margin:0; padding: 40px 0.9em 0 0.9em;'><h3 style='font-weight: 200; font-size: 2em;'>Gezien op TV</h3></div><div class='row blocks-wrap' style='padding:0;margin:0;'><div
class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=JKT'><img src='//media.tix.nl/home_ JKT.jpg'/><div class='commercial-title'>Jakarta<small><div style=text-transform:lowercase>v.a. &euro;
444,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=LIS'><img src='//media.tix.nl/home_ LIS.jpg'/><div class='commercial-
title'>Lissabon<small><div style=text-transform:lowercase>v.a. &euro; 69,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=JNB'><img
src='//media.tix.nl/home_ JNB.jpg'/><div class='commercial-title'>Johannesburg<small><div style=text-transform:lowercase>v.a. &euro; 461,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-
block'><a href='https://tix.nl/inspiratie?dest=BCN'><img src='//media.tix.nl/home_ BCN.jpg'/><div class='commercial-title'>Barcelona<small><div style=text-transform:lowercase>v.a. &euro; 58,- *</small></div></div></a></div></div><div class='columns
small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=IBZ'><img src='//media.tix.nl/home_ IBZ.jpg'/><div class='commercial-title'>Ibiza<small><div style=text-transform:lowercase>v.a. &euro; 69,-
*</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=DPS'><img src='//media.tix.nl/home_ DPS.jpg'/><div class='commercial-title'>Bali<small><div
style=text-transform:lowercase>v.a. &euro; 455,- *</small></div></div></a></div></div><div class='columns small-12'><div class='promo-block' style='background-color:white;color:black;font-size:0.9em;'>* Getoonde prijzen zijn exclusief €25
boekingskosten.</div></div></div>").insertAfter($(".blocks-wrap .row"));
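A hand-coded variation becomes easier to read and QA when the markup is generated from data instead of pasted as one long string. The sketch below is one possible way to do that, not the code used in the test: `promoBlock`, `promoSection`, `linkBase` and `imageBase` are hypothetical names, the example.com URLs are placeholders, and the inline styling is left out for brevity.

```javascript
// Destinations shown in the variation (names, codes and prices from the slide).
const destinations = [
  { name: "Jakarta", code: "JKT", price: "444" },
  { name: "Lissabon", code: "LIS", price: "69" },
  { name: "Johannesburg", code: "JNB", price: "461" },
  { name: "Barcelona", code: "BCN", price: "58" },
  { name: "Ibiza", code: "IBZ", price: "69" },
  { name: "Bali", code: "DPS", price: "455" },
];

// Build one promo block. `linkBase` and `imageBase` are hypothetical
// parameters so that each URL pattern is written in a single place.
function promoBlock(dest, linkBase, imageBase) {
  return (
    "<div class='columns small-12 medium-6 large-4'><div class='promo-block'>" +
    `<a href='${linkBase}?dest=${dest.code}'>` +
    `<img src='${imageBase}${dest.code}.jpg'/>` +
    `<div class='commercial-title'>${dest.name}` +
    `<small>v.a. &euro; ${dest.price},- *</small></div></a></div></div>`
  );
}

// Wrap all promo blocks in the row container used on the page.
function promoSection(dests, linkBase, imageBase) {
  const blocks = dests.map((d) => promoBlock(d, linkBase, imageBase)).join("");
  return `<div class='row blocks-wrap'>${blocks}</div>`;
}

const html = promoSection(
  destinations,
  "https://example.com/inspiratie", // placeholder link base
  "//example.com/home_"             // placeholder image base
);
console.log(html.split("promo-block").length - 1, "promo blocks generated");

// In the browser, with jQuery loaded, the markup could then be inserted with:
// $(html).insertAfter($(".blocks-wrap .row"));
```

Because each destination code appears once in the data, a link and its image can no longer point to different destinations.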
3.
VALIDITY
THREATS
“The extent to which
a conclusion or
measurement is well-founded
and corresponds
accurately to the real world”.
SELECTION BIAS
Make sure the proportion of
each group in the sample is
representative of
the total
population
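A simple way to check representativeness is to compare the share of each group in the sample with its share in the total population. The sketch below assumes device type as the grouping and a tolerance of 5 percentage points; the segment names, numbers and the function name are made up for illustration.

```javascript
// Return the segments whose share in the sample deviates more than
// `tolerance` (absolute difference in share) from the population share.
function representativenessGaps(populationShares, sampleCounts, tolerance = 0.05) {
  const sampleTotal = Object.values(sampleCounts).reduce((s, n) => s + n, 0);
  return Object.keys(populationShares)
    .map((segment) => {
      const sampleShare = (sampleCounts[segment] || 0) / sampleTotal;
      return { segment, sampleShare, gap: sampleShare - populationShares[segment] };
    })
    .filter((row) => Math.abs(row.gap) > tolerance);
}

// Hypothetical numbers: population shares versus visitors in the test.
const population = { desktop: 0.45, mobile: 0.45, tablet: 0.1 };
const sample = { desktop: 6200, mobile: 3000, tablet: 800 };

for (const row of representativenessGaps(population, sample)) {
  console.log(`${row.segment}: sample share ${row.sampleShare.toFixed(2)}`);
}
```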
DAY-OF-THE-WEEK EFFECTS
o Test for full weeks to rule out day-of-the-week effects
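Checking for full weeks is easy to automate. A minimal sketch, assuming start and end dates are given as "YYYY-MM-DD" strings with the end date exclusive; the function names are hypothetical.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between two ISO dates (date-only strings are parsed as UTC).
function testDays(startDate, endDate) {
  return Math.round((Date.parse(endDate) - Date.parse(startDate)) / MS_PER_DAY);
}

// True when the test ran for a positive multiple of seven days.
function coversFullWeeks(startDate, endDate) {
  const days = testDays(startDate, endDate);
  return days > 0 && days % 7 === 0;
}

// Round a planned duration in days up to the next full week.
function roundUpToFullWeeks(days) {
  return Math.ceil(days / 7) * 7;
}

console.log(coversFullWeeks("2024-01-01", "2024-01-15"));
console.log(roundUpToFullWeeks(10));
```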
SEASONAL FLUCTUATION
o Don’t generalize results found in a distinct period (e.g. Christmas, major sale / big campaign)
SHOPPING!
SAMPLE POLLUTION
When people in a test
see both variations
(ABBA-effect) due to
any uncontrolled
external factor.
CAUSES OF SAMPLE POLLUTION
o Cookie deletion
o Incognito browsing
o Cross-device usage
o Cross-browser usage
o Customer journey effects
[Diagram: test groups A and B. In each group, part of the visitors is polluted: they have seen both variations (ABBA).]
Assumption: when visitors have seen A and B, then the conversion rate is the average of A and B.
Result: the polluted visitors pull both groups toward the same average, so the measured uplift is smaller than the actual uplift.
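The averaging assumption can be turned into a small calculation that shows how pollution shrinks the uplift you measure. The rates and the 20% pollution share below are hypothetical, as are the function names.

```javascript
// Conversion rates measured per group when a share `pollution` of the
// visitors in each group has seen both variations and converts at the
// average of A and B (the assumption on the slide).
function measuredRates(rateA, rateB, pollution) {
  const mixed = (rateA + rateB) / 2;
  return {
    a: (1 - pollution) * rateA + pollution * mixed,
    b: (1 - pollution) * rateB + pollution * mixed,
  };
}

function relativeUplift(rateA, rateB) {
  return (rateB - rateA) / rateA;
}

// Hypothetical test: A converts at 5%, B at 5.5%, 20% of visitors polluted.
const actual = relativeUplift(0.05, 0.055);
const polluted = measuredRates(0.05, 0.055, 0.2);
const measured = relativeUplift(polluted.a, polluted.b);

console.log(`actual uplift:   ${(actual * 100).toFixed(1)}%`);
console.log(`measured uplift: ${(measured * 100).toFixed(1)}%`);
```

Under this assumption the absolute difference between the groups shrinks by exactly the pollution share, so a smaller measured effect needs a larger sample to detect.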
The longer the test
duration, the higher
the test power, but…
the longer the test
duration, the higher
the chance of sample
pollution
“How big is each of
these issues on
your site?”
HIGH SAMPLE POLLUTION?
o Test for as few weeks as possible
o Test on 100% of the traffic
o Test with fewer variations (only A vs B)
o Test bolder changes
4.
HOW TO
INTERPRET
STATISTICS
2 TYPES OF STATISTICS
1. Frequentist
statistics
2. Bayesian
statistics
Null hypothesis: Defendant is innocent
Alternative hypothesis: Defendant is guilty
Present the evidence: Collect data
Judge the evidence: “Is there reasonable doubt? Can the defendant still be innocent?”
Yes: Fail to reject H0
No: Reject H0
Null hypothesis: Conversion rate A = B
Alternative hypothesis: Conversion rate A < B
Present the evidence: Run A/B-test
P-value: “Could the data plausibly have happened by chance if the null hypothesis is true?”
Yes: Fail to reject H0
No: Reject H0
INTERPRET STATISTICS
H0 = Variation A and B have the same conversion rate
Conclusion: there’s a 1% chance of
observing a difference of 9.58% or more,
given that there is no difference in
conversion rate between A and B
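To see where such a p-value comes from, here is a sketch of a one-sided pooled two-proportion z-test. It is a common textbook method and not necessarily the one your test tool applies; the visitor and conversion counts are hypothetical.

```javascript
// Standard normal CDF via the Abramowitz-Stegun approximation of erf (7.1.26).
function normalCdf(z) {
  const t = 1 / (1 + 0.3275911 * Math.abs(z) / Math.SQRT2);
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
               t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-(z * z) / 2);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// One-sided p-value for H0: rate A = rate B against H1: rate A < rate B.
function pValue(a, b) {
  const rateA = a.conversions / a.visitors;
  const rateB = b.conversions / b.visitors;
  // Under H0 both groups share one conversion rate, so pool the data.
  const pooled = (a.conversions + b.conversions) / (a.visitors + b.visitors);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / a.visitors + 1 / b.visitors)
  );
  return 1 - normalCdf((rateB - rateA) / standardError);
}

// Hypothetical test result.
const p = pValue(
  { visitors: 20000, conversions: 1000 },
  { visitors: 20000, conversions: 1100 }
);
console.log(p.toFixed(3), p <= 0.05 ? "reject H0" : "fail to reject H0");
```

The p-value answers "how surprising is this data if A and B are equal?", not "how likely is it that B is better?".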
BINARY OUTCOME
P <= 0.05 P > 0.05
BAYESIAN STATISTICS abtestguide.com/bayesian/
NO BINARY OUTCOME
89.2%
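A Bayesian result of this kind is a probability that B beats A. One way to approximate it is sketched below, assuming a uniform prior on each conversion rate and numerical integration on a grid. This is an illustration with hypothetical counts and function names, not the method behind the calculator at abtestguide.com.

```javascript
// Posterior of a conversion rate on a grid of candidate rates, assuming a
// uniform prior: proportional to x^conversions * (1 - x)^(non-conversions).
function posteriorOnGrid(conversions, visitors, grid) {
  const logDensity = grid.map(
    (x) => conversions * Math.log(x) + (visitors - conversions) * Math.log(1 - x)
  );
  // Subtract the maximum before exponentiating to avoid underflow.
  const maxLog = logDensity.reduce((m, v) => Math.max(m, v), -Infinity);
  const weights = logDensity.map((v) => Math.exp(v - maxLog));
  const total = weights.reduce((s, w) => s + w, 0);
  return weights.map((w) => w / total);
}

// P(rate B > rate A), summed over the grid; ties on a grid cell are split.
function probabilityBBeatsA(a, b, steps = 20000) {
  const grid = Array.from({ length: steps }, (_, i) => (i + 0.5) / steps);
  const postA = posteriorOnGrid(a.conversions, a.visitors, grid);
  const postB = posteriorOnGrid(b.conversions, b.visitors, grid);
  let cdfA = 0;
  let probability = 0;
  for (let i = 0; i < steps; i++) {
    probability += postB[i] * (cdfA + postA[i] / 2);
    cdfA += postA[i];
  }
  return probability;
}

// Hypothetical test result.
const chance = probabilityBBeatsA(
  { visitors: 20000, conversions: 1000 },
  { visitors: 20000, conversions: 1100 }
);
console.log(`${(chance * 100).toFixed(1)}% chance that B is better than A`);
```

Unlike the p-value, this number can be read directly as the chance that implementing B is the right call.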
IMPLEMENT OR NOT?
o Depends on how much risk the business is willing to take
o Depends on the type of test: how invasive (in terms of resources) is the test?
RISK ASSESSMENT
IMPLEMENT B        PROBABILITY    EFFECT ON REVENUE*
Expected risk      10.8%          - € 204,400
Expected uplift    89.2%          + € 647,150
Contribution                      € 599,333
* Based on 6 months and an average order value of € 175
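The risk assessment weighs each revenue effect by its probability. The sketch below recomputes that weighting from the figures on the slide; the function names are hypothetical, and treating the contribution as a probability-weighted sum is an assumption about how the table was built.

```javascript
// Figures from the slide: probability and six-month revenue effect in euros.
const outcomes = [
  { label: "Expected risk", probability: 0.108, effect: -204400 },
  { label: "Expected uplift", probability: 0.892, effect: 647150 },
];

// Probability-weighted sum of the revenue effects, signs included.
function expectedContribution(rows) {
  return rows.reduce((sum, r) => sum + r.probability * r.effect, 0);
}

// The same weighting applied to the absolute amounts (signs ignored).
function weightedMagnitudes(rows) {
  return rows.reduce((sum, r) => sum + r.probability * Math.abs(r.effect), 0);
}

console.log(Math.round(expectedContribution(outcomes))); // 555183
console.log(Math.round(weightedMagnitudes(outcomes)));   // 599333
```

With the signs as printed, the weighted sum is about € 555,183. The € 599,333 on the slide equals the two weighted amounts added without the minus sign, so it is worth checking which definition your own calculation uses.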
TAKE AWAYS
1. Determine where you can test (Power >= 80%)
2. Don’t use the WYSIWYG editor
3. Check the representativeness of your sample
4. Run tests for full weeks
5. Research how big sample pollution is on your site
6. Make sure you can interpret the statistics correctly
THANK YOU!
@AM_Klaassen
a.klaassen@tix.nl
nl.linkedin.com/in/amklaassen

What your testtool doesn't tell you

  • 10. 15% needed conversion rate uplift 1,000conversions per month
  • 11. TEST POWER DETERMINATION o To determine what pages are eligible to test on o Can I run a test on this page with a Power of >=80%?
  • 12. “Statistical power is the likelihood that an experiment will detect an effect when there is an effect there to be detected”.
  • 13. “The higher the sample size and/or the bigger the effect size the higher the power”.
  • 14. TEST POWER DETERMINATION o Determine unique weekly visitors and conversions per page type o Determine test test duration <> uplift
  • 15. The smaller the uplift you want to recognize, the larger the sample size required to reach statistical significance. https://goo.gl/FZBqor
  • 18. HOW TO CODE - WYSIWYG $("<style>.commercial-title { position: absolute;top: 40%;left:0; color: white; text-transform: uppercase; font-size: 2.5em; line-height: 1em; text-align: center; width: 100%; text-shadow: 1px 1px #777; } .commercial-title small { display: block; } .commercial-title .fa- stack { font-size: 0.5em; } </style><div class='info-section' style='margin:0; padding: 40px 0.9em 0 0.9em;'><h3 style='font-weight: 200; font-size: 2em;'>Gezien op TV</h3></div><div class='row blocks-wrap' style='padding:0;margin:0;'><div class='columns small- 12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=BKK'><img src='//media.tix.nl/home_ BKKv2.jpg'/><div class='commercial-title'>Jakarta<small><div style=text-transform:lowercase>v.a. &euro; 444,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=LON'><img src='//media.tix.nl/home_ LONv2.jpg'/><div class='commercial-title'>Lissabon<small><div style=text-transform:lowercase>v.a. &euro; 69,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=MRU'><img src='//media.tix.nl/home_ MRUv2.jpg'/><div class='commercial-title'>Johannesburg<small><div style=text-transform:lowercase>v.a. &euro; 461,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=ROM'><img src='//media.tix.nl/home_ ROMv2.jpg'/><div class='commercial-title'>Barcelona<small><div style=text-transform:lowercase>v.a. 
&euro; 58,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=CUR'><img src='//media.tix.nl/home_ CURv2.jpg'/><div class='commercial-title'>Ibiza<small><div style=text-transform:lowercase>v.a. &euro; 69,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=HKG'><img src='//media.tix.nl/home_ HKGv2.jpg'/><div class='commercial-title'>Bali<small><div style=text- transform:lowercase>v.a. &euro; 455,- *</small></div></div></a></div></div><div class='columns small-12'><div class='promo-block' style='background-color:white;color:black;font-size:0.9em;'>* Getoonde prijzen zijn exclusief €25 boekingskosten.</div></div></div>").insertAfter($(".blocks-wrap .row")); ▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/99786d9839554ef9b83715e5affd744e.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=JKT"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/d772a16393194d61a85cc97ec6c6d77b.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/ce0b66adf1fd48b1a9434d5687bc7a74.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(3) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/db85accd68ad4335a0cd4b63c61a33d6.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/79ff8c46da7e4a149f27d53a8c9b6e3b.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > 
a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=LIS"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=DPS"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(3) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BCN"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=CMB"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/2691a454b6004c488e2d56272b5d8c18.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=LON"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/e37f74b693e745f89a57b43ccac6d970.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=ROM"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/3d44b80cfeda4e27b8fb68a1ddc231d6.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BKK"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(4) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/819da47209d5477fb18bf7a60ccbe774.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/f4e923cb5d924c268cc09dd668a3f1d9.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(4) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=DPS"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > 
a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=JKT"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/177f2e210ca54ead946cdc65c2cbc49b.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/ce2bccae9ec44ab28100f964080d2328.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(0) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=JKT"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=PAR"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/dd9b5bd118584e1089248b12f6147f13.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/d1b15fc837ee4d2181ba5f07523fcc76.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(3) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/e7f72d4047f94638a286d72cd70664c7.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(4) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/834ed7638bef4ce286ff098177a9db9b.jpg"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(1) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=LIS"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(2) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=JNB"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(3) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=TYO"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(4) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=IBZ"}); ▸$(".blocks-wrap > .blocks-wrap > div:eq(5) > div:eq(0) > 
a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=HAN"}); ▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(5) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"http://media.tix.nl/home_ DPS.jpg"}); ▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(3) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"http://media.tix.nl/home_ BAR.jpg"}); ▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(3) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BKK"}); ▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BKK"}); ▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(5) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=DPS"}); ▸$("#content-slider-wrap > a > img:eq(0)").replaceWith("<img src= "//media.tix.nl/ingangnaarkaart1.png " alt= " ">"); ▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(3) > div:eq(0) > a:eq(0) > img:eq(0)").attr({"src":"//cdn.optimizely.com/img/593520515/cc7d8af46e054ab2ae5a1bfa38c411c7.jpg"}); ▸$(".medium-12 > .blocks-wrap > div:eq(2) > div:eq(3) > div:eq(0) > a:eq(0)").attr({"href":"https://tix.nl/inspiratie?dest=BCN"});
  • 19. o Slows down your site o Unreliable in different browsers o Hard to QA o Limited in what you can test HOW TO CODE - WYSIWYG
  • 20. HOW TO CODE – JavaScript / jQuery $("<style>.commercial-title { position: absolute;top: 40%;left:0; color: white; text-transform: uppercase; font-size: 2.5em; line-height: 1em; text-align: center; width: 100%; text-shadow: 1px 1px #777; } .commercial-title small { display: block; } .commercial-title .fa-stack { font-size: 0.5em; } </style><div class='info-section' style='margin:0; padding: 40px 0.9em 0 0.9em;'><h3 style='font-weight: 200; font-size: 2em;'>Gezien op TV</h3></div><div class='row blocks-wrap' style='padding:0;margin:0;'><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=JKT'><img src='//media.tix.nl/home_ JKT.jpg'/><div class='commercial-title'>Jakarta<small><div style=text-transform:lowercase>v.a. &euro; 444,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=LIS'><img src='//media.tix.nl/home_ LIS.jpg'/><div class='commercial-title'>Lissabon<small><div style=text-transform:lowercase>v.a. &euro; 69,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=JNB'><img src='//media.tix.nl/home_ JNB.jpg'/><div class='commercial-title'>Johannesburg<small><div style=text-transform:lowercase>v.a. &euro; 461,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=BCN'><img src='//media.tix.nl/home_ BCN.jpg'/><div class='commercial-title'>Barcelona<small><div style=text-transform:lowercase>v.a. &euro; 58,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=IBZ'><img src='//media.tix.nl/home_ IBZ.jpg'/><div class='commercial-title'>Ibiza<small><div style=text-transform:lowercase>v.a. &euro; 69,- *</small></div></div></a></div></div><div class='columns small-12 medium-6 large-4'><div class='promo-block'><a href='https://tix.nl/inspiratie?dest=DPS'><img src='//media.tix.nl/home_ DPS.jpg'/><div class='commercial-title'>Bali<small><div style=text-transform:lowercase>v.a. &euro; 455,- *</small></div></div></a></div></div><div class='columns small-12'><div class='promo-block' style='background-color:white;color:black;font-size:0.9em;'>* Getoonde prijzen zijn exclusief €25 boekingskosten.</div></div></div>").insertAfter($(".blocks-wrap .row"));
  • 22. “The extent to which a conclusion or measurement is well-founded and corresponds accurately to the real world”
  • 23. SELECTION BIAS Make sure the proportion of each group in the sample is representative of the total population
  • 24. DAY-OF-THE-WEEK EFFECTS o Test for full weeks to rule out day-of-the-week effects
  • 25. o Don’t generalize results found in a distinct period (e.g. Christmas, major sale / big campaign) SEASONAL FLUCTUATION SHOPPING!
  • 26. SAMPLE POLLUTION When people in a test see both variations (ABBA-effect) due to any uncontrolled external factor.
  • 27. CAUSES OF SAMPLE POLLUTION o Cookie deletion o Incognito browsing o Cross-device usage o Cross-browser usage o Customer journey effects
  • 30. [Diagram: samples A and B, each with a polluted “ABBA” share] Assumption: when visitors have seen A and B, then the conversion rate is the average of A and B
  • 31. [Same diagram, highlighting the measured uplift] Assumption: when visitors have seen A and B, then the conversion rate is the average of A and B
  • 32. [Same diagram, highlighting the actual uplift] Assumption: when visitors have seen A and B, then the conversion rate is the average of A and B
  • 33. The longer the test duration, the higher the test power, but… the longer the test duration, the higher the chance of sample pollution
  • 34. “How big is each of these issues on your site?”
  • 35. HIGH SAMPLE POLLUTION? o Test for as few weeks as possible o Test on 100% of the traffic o Test with fewer variations (only A vs B) o Test bolder changes
  • 37. 2 TYPES OF STATISTICS 1. Frequentist statistics 2. Bayesian statistics
  • 38. 2 TYPES OF STATISTICS 1. Frequentist statistics 2. Bayesian statistics
  • 39. Null hypothesis: defendant is innocent. Alternative hypothesis: defendant is guilty. Present the evidence: collect data. Judge the evidence: “Is there reasonable doubt? Can the defendant still be innocent?” Yes: fail to reject H0. No: reject H0.
  • 40. Null hypothesis: conversion rate A = B. Alternative hypothesis: conversion rate A < B. Present the evidence: run A/B-test. P-value: “Could the data plausibly have happened by chance if the null hypothesis is true?” Yes: fail to reject H0. No: reject H0.
  • 42. INTERPRET STATISTICS H0 = Variations A and B have the same conversion rate. Conclusion: there’s a 1% chance of observing a difference of 9.58% or more, given that there is no difference in conversion rate between A and B
  • 43. BINARY OUTCOME P <= 0.05 P > 0.05
  • 44. 2 TYPES OF STATISTICS 1. Frequentist statistics 2. Bayesian statistics
  • 47. IMPLEMENT OR NOT? o Depends on how much risk the business is willing to take o Depends on the type of test : how invasive (in terms of resources) is the test?
  • 48. RISK ASSESSMENT: IMPLEMENT B (probability, effect on revenue *) Expected risk: 10.8%, - € 204,400. Expected uplift: 89.2%, + € 647,150. Contribution: € 599,333. * Based on 6 months and an average order value of € 175
  • 49. TAKEAWAYS 1. Determine where you can test (Power >=80%) 2. Don’t use the WYSIWYG editor 3. Check the representativeness of your sample 4. Run tests for full weeks 5. Research how big sample pollution is 6. Make sure you can interpret the statistics correctly

Editor's Notes

  • #2: First of all, I’d like to thank Jackie for inviting me to this Conversion Elite conference in Manchester.
  • #3: For everyone who doesn’t know me, I’ll start with a short introduction. I can keep it really short, because Eline did most of the work for me. As said, I’ve worked in the field of web analytics for over 7 years now and have worked for several companies. I like the travel industry the most, which is why I recently switched jobs. Right now, I’m Conversion Manager at Tix.nl and am responsible for the A/B-testing program. I love running experiments and optimizing websites based on data.
  • #4: As said, the travel industry really appeals to me. And that’s not just work related. My life motto is basically: work, save, travel, repeat. The thing that makes me happiest and most thrilled is wandering the world. Right now I’m in the work phase, but tomorrow it’s travel time. I have extended my stay until this weekend, so if you have any tips for me, please let me know during drinks.
  • #5: Last summer we went to Northern Norway. We visited Tromso and the Lofoten, but also went to a lesser-known island called Senja. There were no travel guides about this area, so we googled to find out what we could do. We found a hike up one of the highest mountains of Senja, which was supposed to have an amazing view. It was supposed to be a 2-hour hike, so very doable.
  • #6: The trail started off rather easy and we were optimistic and enthusiastic about reaching the top. But our enthusiasm soon declined. The trail got steeper and steeper, and several times I wanted to call it a day and return to the car. My limbs were sore and I was totally out of breath.
  • #7: But we persevered and made it to the top. The view was to die for. And the whole journey towards the top was totally worth it.
  • #8: It would have been so much easier if there had been an easy route, like a cable car. But then we wouldn’t have overcome the challenges and learned to persevere, and we wouldn’t have been as thrilled to have made it to the top.
  • #9: For me, this hike stands for the way we look at A/B-testing as well. We are inclined to keep only the end goal in mind and take the easy route. We buy a testing tool and immediately expect amazing outcomes. We let the marketer run the test program. The test tool can build the variation, so we don’t need developers; the test tool will tell us when the test is cooked, so we don’t need data scientists; and the test tool will tell us whether it’s a winner, so we don’t have to do the analysis ourselves. Easy does it! But you can’t run a proper CRO program by skipping the hiking trail and taking the cable car to the top. You need to put time and effort into it, because there are important things our testing tool doesn’t tell us.
  • #10: The first thing test tools don’t tell you is what’s worth testing. Where should you start? You can set up a test on a page with 50 visits per day, but such a test would run for ages and the results would still be questionable! So how do you determine where you should A/B-test on your site and where you shouldn’t?
  • #11: Well, there’s a rough rule of thumb to keep in mind. Basically, if you have fewer than 1,000 conversions per month, it’s not worth the trouble, because in order to find a significant effect between A and B you will need a conversion rate uplift of around 15%. And these kinds of uplifts are not easy to accomplish. So if a page has fewer than 1,000 conversions per month, you shouldn’t spend your testing resources on it, but find other ways to validate your idea (with other, more qualitative types of research). But this benchmark of 1,000 conversions per month isn’t specific enough.
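The 15% rule of thumb can be roughly reproduced with a back-of-the-envelope calculation. The sketch below is not the speaker's own calculator; it assumes a one-sided test at 95% confidence, 80% power, a 50/50 traffic split, a normal approximation, and a baseline conversion rate of 2.5%. The function name `minDetectableUplift` is made up for illustration.

```javascript
// Rough minimum detectable relative uplift for an A/B test.
// Assumptions (not from the talk): one-sided test at 95% confidence,
// 80% power, 50/50 split, normal approximation of the conversion rates.
const Z_ALPHA = 1.645; // one-sided 95% confidence
const Z_BETA = 0.8416; // 80% power

function minDetectableUplift(conversionsPerMonth, baselineRate) {
  // With a 50/50 split, half of the conversions land in each variation.
  const conversionsPerVariation = conversionsPerMonth / 2;
  // Simplification: both variations are given the baseline variance.
  return (
    (Z_ALPHA + Z_BETA) *
    Math.sqrt((2 * (1 - baselineRate)) / conversionsPerVariation)
  );
}

// 1,000 conversions in a one-month test at an assumed 2.5% baseline rate.
const uplift = minDetectableUplift(1000, 0.025);
console.log((uplift * 100).toFixed(1) + "% relative uplift needed");
```

Under these assumptions the result lands at roughly 15 to 16%, in line with the rule of thumb; with more conversions per month the detectable uplift shrinks.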
  • #12: Before I start an optimization project, I first do a test power determination. This is basically a fancy term for determining whether or not you can run a test on a specific page with a reasonable chance of finding a winner. This reasonable chance of detecting a winner is called ‘statistical power’.
  • #13: I know this sounds a bit cryptic, but what you should remember is
  • #14: you need a certain number of people in your test, and the change should make an impact on behavior, to be able to prove your hypothesis. The more people in your test, the higher the power, and the bigger the difference between A and B, the higher the power.
  • #15: For each page template you want to test on, you should determine the unique weekly visitors and the unique weekly buyers. Given the 80% power threshold and a pre-determined confidence level, you can calculate the needed sample size and the corresponding effect size. Suppose you have 32,000 weekly visitors and 800 buyers. These are nice numbers to run an A/B test on, right?
  • #16: Well, with a power of at least 80% you can detect uplifts of 9% and higher in 4 weeks’ time. If you expect that your variation will only result in a 5% uplift, then your sample size needs to be far larger to guarantee 80% power: you would need to run the test for 13 weeks. And this is not something you want to do. The advised maximum test duration is 4 weeks. If you ran the test for less than those 13 weeks, the chance of detecting a winner would be far lower than 80%. You need to be aware of the effect you need from your test and how long the test needs to run. If you know what impact you need to make, you can design your test accordingly. If you know you need an uplift of 10%, you will probably test bolder changes.
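The relationship between uplift and test duration can be sketched with the standard sample-size formula for comparing two proportions. This is an illustration, not the calculator used for the talk: it assumes a one-sided test at 95% confidence, 80% power and a 50/50 split, and the function names are made up. With the 32,000 weekly visitors and 800 weekly buyers from the example, these assumptions happen to give the same 4 and 13 weeks.

```javascript
// Sample size and test duration for a two-variation A/B test.
// Assumptions (not from the talk): one-sided test at 95% confidence,
// 80% power, 50/50 traffic split, normal approximation.
const Z_ALPHA = 1.645; // one-sided 95% confidence
const Z_BETA = 0.8416; // 80% power

function sampleSizePerVariation(baselineRate, relativeUplift) {
  const p1 = baselineRate;
  const p2 = baselineRate * (1 + relativeUplift);
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  const delta = p2 - p1;
  return Math.ceil(((Z_ALPHA + Z_BETA) ** 2 * variance) / delta ** 2);
}

function weeksNeeded(weeklyVisitors, weeklyBuyers, relativeUplift) {
  const baselineRate = weeklyBuyers / weeklyVisitors;
  const totalSample = 2 * sampleSizePerVariation(baselineRate, relativeUplift);
  // Round up to full weeks so day-of-the-week effects are covered.
  return Math.ceil(totalSample / weeklyVisitors);
}

console.log(weeksNeeded(32000, 800, 0.09)); // 4
console.log(weeksNeeded(32000, 800, 0.05)); // 13
```

Halving the uplift roughly quadruples the required sample, which is why a 5% uplift needs more than three times as many weeks as a 9% uplift.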
  • #17: The 2nd thing your testing tool doesn’t tell you is how you should code your variations.
  • #18: A couple of months ago we did an A/B test where we showed the destinations highlighted in our TV commercial on the homepage. The code for these 6 blocks was written by our front-end developer, but during the test the visuals and prices needed to be updated. My colleague was responsible for this and used the WYSIWYG editor to do it.
  • #19: The result after 4 weeks of testing was this monstrosity of code.
  • #20: This is really dirty code, and front-end developers will probably hate you when you show them this. It doesn’t only look ugly, it is ugly as well. The longer the code, the slower it will load. And it’s unreliable in different browsers: some adjustments will work perfectly fine in Chrome, but will not work in Internet Explorer. With this kind of code it also becomes hard to QA. And the last drawback of the WYSIWYG editor is that it’s rather limited. You cannot add new functionality or completely change the layout.
  • #21: For easy tests you can code this yourself using HTML/CSS, JavaScript and jQuery. You can quickly learn the basics on Codecademy.com. For the more complicated tests you really need development resources. There’s no way around it.
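One way to keep hand-written variation code maintainable is to separate the data from the markup, so a price or image update is a one-line change rather than a chain of generated selectors. The sketch below is a simplified illustration, not the code used in the test: the `destinations` array, the helper names and the two base-path constants are assumptions modelled on the markup shown on the slide.

```javascript
// Destination data kept apart from the markup (values taken from the slide).
const destinations = [
  { code: "JKT", name: "Jakarta", price: "444" },
  { code: "LIS", name: "Lissabon", price: "69" },
  { code: "BCN", name: "Barcelona", price: "58" },
];

// Assumed base paths, modelled on the URLs in the slide's markup.
const LINK_BASE = "/inspiratie?dest=";
const IMAGE_BASE = "//media.tix.nl/home_";

// Build the markup for a single promo block.
function promoBlock({ code, name, price }) {
  return (
    "<div class='columns small-12 medium-6 large-4'>" +
    "<div class='promo-block'>" +
    `<a href='${LINK_BASE}${code}'>` +
    `<img src='${IMAGE_BASE}${code}.jpg'/>` +
    `<div class='commercial-title'>${name}` +
    `<small style='text-transform:lowercase'>v.a. &euro; ${price},- *</small>` +
    "</div></a></div></div>"
  );
}

// Wrap all promo blocks in one row.
function buildPromoHtml(items) {
  return (
    "<div class='row blocks-wrap'>" + items.map(promoBlock).join("") + "</div>"
  );
}

const html = buildPromoHtml(destinations);
console.log(html);

// In the variation code of the test tool (browser, jQuery loaded):
// $(html).insertAfter(".blocks-wrap .row");
```

Updating a price during the test then means editing one entry in `destinations`, and the generated markup stays short enough to QA by eye.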
  • #22: The 3rd thing testing tools don’t address is the possibility of validity problems.
  • #23: Validity is the extent to which a conclusion or measurement is well-founded and corresponds accurately to the real world. If you have run a valid experiment, the results can be generalized for the whole population, but there are a couple of checks you need to do.
  • #24: You need to make sure that each group in the sample is representative of the total population. This can be an issue especially if you have a smaller sample size and when you have groups in your population with very different conversion rates. For example, new visitors normally have a far lower conversion rate than returning visitors. This means that when you have relatively more returning visitors in one of your variations, the measured uplift may well be caused by the difference in composition of the samples. So you need to check the representativeness of each sample before drawing conclusions. This is something a test tool doesn’t do for you.
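A simple way to check representativeness is to compare the mix of new and returning visitors between the two variations with a chi-square test on a 2x2 table. This is one possible check, not necessarily the one used by the speaker; the visitor counts below are hypothetical, and the 3.841 threshold corresponds to a 5% significance level with one degree of freedom.

```javascript
// Chi-square test on a 2x2 table:
// rows = variations A and B, columns = new and returning visitors.
const CHI2_CRITICAL_95 = 3.841; // 1 degree of freedom, alpha = 0.05

function chiSquare2x2(newA, returningA, newB, returningB) {
  const total = newA + returningA + newB + returningB;
  const rowA = newA + returningA;
  const rowB = newB + returningB;
  const colNew = newA + newB;
  const colReturning = returningA + returningB;
  const observed = [newA, returningA, newB, returningB];
  // Expected counts if both variations had the same visitor mix.
  const expected = [
    (rowA * colNew) / total,
    (rowA * colReturning) / total,
    (rowB * colNew) / total,
    (rowB * colReturning) / total,
  ];
  return observed.reduce(
    (sum, count, i) => sum + (count - expected[i]) ** 2 / expected[i],
    0
  );
}

function compositionDiffers(newA, returningA, newB, returningB) {
  return chiSquare2x2(newA, returningA, newB, returningB) > CHI2_CRITICAL_95;
}

// Hypothetical counts: variation B received noticeably more returning visitors.
console.log(compositionDiffers(6000, 4000, 5700, 4300)); // true
// Hypothetical counts: the mix is practically identical.
console.log(compositionDiffers(6000, 4000, 5990, 4010)); // false
```

When the check flags a difference, comparing conversion rates per segment (new versus returning) is a safer basis for a conclusion than the overall uplift.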
  • #25: All the testing tools in the market will call a winning result regardless of how long the test has run. Of course they take into account sample size, but they don’t take into account day-of-the-week effects. If you take a look at your data over a longer period of time and big differences appear in conversion rates per day, you should always test for full weeks. Because a winning test based on only weekdays, may well underperform in the weekend. Again: these results cannot be generalized over all the days of the week.
  • #26: And another important one: you cannot generalize results found in a period where user behavior is different than normal. If you found a winning result around Christmas, you should at least re-test this in another time of the year. Or wait for the next Christmas to implement the winner.
  • #27: Another threat to running a successful testing program is the possibility of sample pollution. We wrongfully assume that we can build the perfect scientific experiment online.
  • #28: There are several reasons why unique users might end up in both variations of an experiment. First of all, visitors can delete their cookies or browse incognito. When these visitors return to the site, they have a 50% chance (when you just run A vs B) of seeing the other variation. Secondly, more and more visitors use more than 1 browser or device in their quest for your product. Cross-device usage especially is a big polluter of your sample. And lastly, if the orientation phase for the product you are selling is long, then visitors will probably have seen the original page before they ended up in your test variation.
  • #29: So suppose you did an experiment and you have found that B has a higher conversion rate than A
  • #30: Now you know a proportion of both the samples is polluted. Some visitors in A have also seen B and vice versa.
  • #31: We don’t know what the conversion rate of the ABBA group is, but we make the assumption that their conversion rate is probably the average of A and B. This means that the actual conversion rate of A is even lower (because it is positively influenced by the ABBA group). And the actual conversion rate of B is probably higher, because it’s negatively influenced by the ABBA group.
  • #32: So you measured this uplift in your test
  • #33: But the actual uplift is far higher! This means that if you have a lot of pollution in your test, you will have a hard time finding the effect.
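The averaging assumption can be turned into a small correction. If a share f of each sample is polluted and the polluted visitors convert at the average of A and B, the measured difference equals (1 - f) times the actual difference, while the average of the two rates is unchanged. The sketch below works under exactly that assumption, with the same polluted share in both samples; the rates and the 20% share are hypothetical.

```javascript
// Undo the dilution caused by visitors who saw both variations.
// Model (an assumption, following the slide): polluted visitors convert
// at the average of the actual rates of A and B, and both samples
// contain the same polluted share.
//   measuredA = (1 - f) * actualA + f * (actualA + actualB) / 2
//   measuredB = (1 - f) * actualB + f * (actualA + actualB) / 2
function correctForPollution(measuredA, measuredB, pollutedShare) {
  const mean = (measuredA + measuredB) / 2; // unchanged by pollution
  const actualDiff = (measuredB - measuredA) / (1 - pollutedShare);
  const actualA = mean - actualDiff / 2;
  const actualB = mean + actualDiff / 2;
  return {
    actualA,
    actualB,
    measuredUplift: measuredB / measuredA - 1,
    actualUplift: actualB / actualA - 1,
  };
}

// Hypothetical test: 2.50% vs 2.65% measured, 20% of each sample polluted.
const corrected = correctForPollution(0.025, 0.0265, 0.2);
console.log((corrected.measuredUplift * 100).toFixed(1) + "% measured");
console.log((corrected.actualUplift * 100).toFixed(1) + "% actual");
```

The higher the polluted share, the more the measured uplift understates the actual uplift, which is why heavy pollution makes real effects harder to detect.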
  • #35: Especially if your win ratio is at the lower end (say, lower than 20%), you need to research how big these issues are.
  • #36: If you have found out that the pollution rate is very high, you need to adjust your testing strategy.
  • #37: And the last thing I want to address is that you need a basic understanding of the statistics that are used by your test tool. Because every A/B-testing tool uses its own stats engine, you need to be aware of what this means when drawing conclusions.
  • #38: There are 2 types of statistics that have been used in A/B-testing: frequentist and Bayesian.
  • #39: Historically most testtools used frequentist statistics, but over the last couple of years more and more tools switched to Bayesian statistics. And this is not without reason. Using frequentist statistics has a couple of challenges. I’ll explain this.
  • #40: Frequentist testing can be compared with a court trial in the US. The null hypothesis says that the defendant is innocent. The defendant is innocent until proven guilty! And the alternative hypothesis says that the defendant is guilty. We then present evidence or, in other words, collect data. Then we judge this evidence and ask ourselves the question: could the evidence have happened by chance if the defendant is innocent? Is there reasonable doubt?
  • #41: This principle is used in frequentist statistics as well. You have a null hypothesis stating that there is no difference in conversion rate, and you try to disprove this claim. You run your test and you judge the results based on the p-value. You try to answer the question: could the data plausibly have happened if the null hypothesis is true? If there still is doubt, then you fail to reject the null hypothesis and cannot conclude that there is a difference. But if there’s no doubt (or very little), then you reject the null hypothesis.
  • #42: Snoop Dogg has an excellent line for this: if the p is low, the ho must go
  • #43: I will give an example of how this translates to an A/B-test. So, suppose you did an experiment and the p-value of that test was 0.01. You remember: the p-value is very low, so the H0 needs to go. But what is the exact conclusion of this test? With frequentist statistics you can only conclude how surprising the results are, based on the hypothesis that A and B perform exactly the same. I don’t know about you, but this confuses the hell out of me! This is really hard to explain, not only to fellow optimizers but mainly to your boss. And besides the confusion, I’m actually not interested in “how unlikely it is that I found these results.” I just want to know whether variation B is better than A. Frequentist statistics are counterintuitive.
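To make the p-value concrete, here is a minimal sketch of a one-sided two-proportion z-test, matching the alternative hypothesis "conversion rate A < B". It is an illustration, not the stats engine of any particular test tool: the visitor and conversion counts are hypothetical, and the normal CDF uses the well-known Abramowitz and Stegun approximation of the error function.

```javascript
// Standard normal CDF via the Abramowitz-Stegun approximation of erf
// (formula 7.1.26, absolute error below about 1.5e-7).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly =
    t *
    (0.254829592 +
      t *
        (-0.284496736 +
          t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// One-sided p-value: the chance of seeing a difference at least this
// large in favor of B if A and B truly convert at the same rate (H0).
function oneSidedPValue(visitorsA, conversionsA, visitorsB, conversionsB) {
  const rateA = conversionsA / visitorsA;
  const rateB = conversionsB / visitorsB;
  const pooled = (conversionsA + conversionsB) / (visitorsA + visitorsB);
  const standardError = Math.sqrt(
    pooled * (1 - pooled) * (1 / visitorsA + 1 / visitorsB)
  );
  const z = (rateB - rateA) / standardError;
  return 1 - normalCdf(z);
}

// Hypothetical test: 64,000 visitors per variation.
const pValue = oneSidedPValue(64000, 1600, 64000, 1753);
console.log(pValue < 0.05 ? "Reject H0" : "Fail to reject H0");
```

Note what the number does and does not say: it describes how surprising the data would be under H0, not the probability that B is better than A.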
  • #44: The other challenge with using frequentist statistics is that an A/B-test can only have 2 outcomes: you either have a winner or no winner. In other words, you can either reject the null hypothesis or fail to reject it. And there is no wiggle room. If you take a look at this test result, you would conclude that there is no winner, that it mustn’t be implemented and that the measured uplift in conversion rate wasn’t enough. So you will see this as a loser and move on to another test idea. However, there seems to be a positive movement (the measured uplift is 5%), but it isn’t big enough to be recognized as a significant winner.
  • #45: The alternative to using frequentist statistics, is Bayesian statistics. And as said most test tools have switched to using Bayesian (or using flavors of Bayesian). And that’s not without reason: Bayesian statistics makes more sense, since it far better suits the underlying business question.
  • #46: When you use Bayesian statistics to evaluate your A/B-test, there is no difficult statistical terminology involved anymore. There’s no null hypothesis, no p-value or z-value, et cetera. It just shows you the measured uplift and answers the question of what the chance is that B is better than A. Easy, right? Everyone can understand this. Based on the same numbers of the A/B-test I showed you earlier, you have an 89.2% chance that B will actually be better than A. Probably every manager would understand this and like these odds.
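The "chance that B is better than A" can be approximated in a few lines. The sketch below is one simple way to do it, not how any specific test tool computes it: it assumes a uniform Beta(1, 1) prior for both variations and approximates the difference between the two Beta posteriors with a normal distribution, which is reasonable for the large samples typical of A/B-tests. The counts are hypothetical.

```javascript
// Standard normal CDF via the Abramowitz-Stegun approximation of erf.
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly =
    t *
    (0.254829592 +
      t *
        (-0.284496736 +
          t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Mean and variance of the Beta posterior under a uniform Beta(1, 1) prior.
function posterior(visitors, conversions) {
  const alpha = 1 + conversions;
  const beta = 1 + visitors - conversions;
  const sum = alpha + beta;
  return {
    mean: alpha / sum,
    variance: (alpha * beta) / (sum * sum * (sum + 1)),
  };
}

// Probability that B's true conversion rate exceeds A's
// (normal approximation of the difference between the posteriors).
function probabilityBBeatsA(visitorsA, conversionsA, visitorsB, conversionsB) {
  const a = posterior(visitorsA, conversionsA);
  const b = posterior(visitorsB, conversionsB);
  return normalCdf((b.mean - a.mean) / Math.sqrt(a.variance + b.variance));
}

// Hypothetical test with a measured uplift of 5%.
const chance = probabilityBBeatsA(64000, 1600, 64000, 1680);
console.log((chance * 100).toFixed(1) + "% chance that B is better than A");
```

The output is a probability between 0 and 1 rather than a winner / no winner verdict, which is what makes the follow-up question about risk possible.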
  • #47: When using a Bayesian A/B-test evaluation method, you no longer have a binary outcome like the one the t-test gives. A test result won’t tell you winner / no winner, but gives a percentage between 0 and 100% for the chance that the variation performs better than the original. In this example: 89.2%. The question that remains is: is this enough to implement?
  • #48: Well, that depends on a couple of things. If you would implement a test variation with a probability of 51%, then you’re not doing much better than just flipping a coin. The risk of implementing a losing variation is quite high. Depending on the type of business, you may be more or less willing to take risks. If you are a start-up you might want to take more risk than a full-grown business, but still, we don’t really like the chance of losing money, so what we see with our clients is that most need at least a probability of 75%. But it also depends on the type of test. If you only changed a headline, then the risk is lower than when you need to implement new functionality on the page. The latter will consume far more resources. Hence, you will need a higher probability. This deliberation isn’t presented in your test tool; you need to make this decision yourself. For some tests you need 95% probability, whereas for others you are happy with 75%.
  • #49: What you can do is make a risk assessment. You can calculate what the results mean in terms of revenue. When the client decides to implement the variation, they have a 10.8% chance of a drop in revenue of around € 200,000 in 6 months’ time (at an average order value of € 175). But on the other hand, they also have an 89.2% chance that the variation is actually better and brings in nearly € 650,000. You can show this table to your boss and ask whether he would place the bet. So it’s not the testing tool that decides whether it’s a winner or not; it’s you (or your boss).
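The risk assessment boils down to weighting each outcome by its probability. The sketch below uses the figures from the slide; the function name and the way the contribution is netted are assumptions, since the talk does not spell out the formula. Netting the weighted loss against the weighted gain gives roughly € 555,183, whereas the € 599,333 on the slide matches the sum of the two weighted amounts, so the original calculation may define contribution differently.

```javascript
// Probability-weighted revenue effect of implementing variation B.
// Assumption: net contribution = weighted gain minus weighted loss.
function riskAssessment({ probabilityWorse, expectedLoss, probabilityBetter, expectedGain }) {
  const weightedLoss = probabilityWorse * expectedLoss;
  const weightedGain = probabilityBetter * expectedGain;
  return {
    weightedLoss,
    weightedGain,
    netContribution: weightedGain - weightedLoss,
  };
}

// Figures from the slide (6 months, average order value of EUR 175).
const result = riskAssessment({
  probabilityWorse: 0.108,
  expectedLoss: 204400,
  probabilityBetter: 0.892,
  expectedGain: 647150,
});

console.log(Math.round(result.weightedGain)); // 577258
console.log(Math.round(result.weightedLoss)); // 22075
console.log(Math.round(result.netContribution)); // 555183
```

Either way, the table gives the decision maker a bet expressed in money rather than in p-values, which is the point of the exercise.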
  • #50: So, to sum up there are a lot of things your testing tool doesn’t tell you: - you need to: 1,2,3,4,5,6…. It’s not an easy path and there will be times you want to call it a day, but if you persevere you will make it to the top and fully enjoy what you have accomplished.