Mark Thomas
@SearchMATH
BOTIFY
Why auditing your rel=canonical configuration is a shrewd move
http://www.slideshare.net/MarkThomas114
TECHNICAL SEO CONTENT REAL RANKINGS & CTR
The first unified suite to drive SEO success in each phase of the
organic search process
I would also like to add to this conversation
that we have learned the hard way that if we
use canonicals for pages that aren’t
duplicates or near-duplicates, we have
no impact at best and a ranking
drop at worst. Please don’t get clever
with canonicals in a market that I need to meet
my targets.
45 million URLs are being tagged with an incorrect alternative canonical, which confuses Google and forces it to crawl 45M unnecessary URLs.
When a query URL contains a space, the canonical rule replaces the "+" character with the encoded character "%20":
https://www.example.com/a/audi-a4?query=audi+a4
(which has the internal links)
will canonicalize to
https://www.example.com/a/audi-a4?query=audi%20a4
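A mismatch like this can be caught by comparing each URL with its declared canonical after decoding both. The following is a minimal sketch, not the tooling used for the audit; the function names are hypothetical, and it assumes that "+" and "%20" should both be read as a space in the query string.

```python
from urllib.parse import urlsplit, parse_qsl


def normalise(url: str) -> tuple:
    """Decode the query string so that '+' and '%20' compare as the same space."""
    parts = urlsplit(url)
    # parse_qsl decodes both '+' and '%20' to a literal space.
    query = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path, tuple(query))


def encoding_only_mismatch(page_url: str, canonical_url: str) -> bool:
    """True when the canonical differs from the page URL only by encoding."""
    return page_url != canonical_url and normalise(page_url) == normalise(canonical_url)


page = "https://www.example.com/a/audi-a4?query=audi+a4"
canonical = "https://www.example.com/a/audi-a4?query=audi%20a4"
print(encoding_only_mismatch(page, canonical))  # True
```

URLs flagged this way are self-referencing in intent, so the canonical rule (or the internal links) can be changed to use one encoding consistently.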
Why Audit Your Canonical Set Up?
1) Get More Inventory Ranking
2) Get The Right Inventory Ranking
Agenda
rel=canonical
Issues
Detect
Fix
Canonicalization
…a process for converting data that has more than one possible representation into a "standard", "normal", or canonical form.
[Diagram: a canonical page, a duplicate page with a 98% match, and a partner site]
Including a rel=canonical link in
your webpage is a strong hint to
search engines about your
preferred version to index
among duplicate pages on the web.
rel=canonical Fact File
An “Element” rather than a “Tag”
rel=canonical is a hint, not a directive
rel=‘canonical’ or rel=“canonical” are fine when placed in the <head>
Google processes rel=canonical as a 2nd/3rd step – not during crawl
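The placement rule above can be checked per page. This is a minimal sketch using Python's standard `html.parser`; the class and function names are hypothetical, and it inspects the raw HTML only, so it will not see canonicals injected by JavaScript.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects every <link rel="canonical"> and whether it sits inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, was_in_head)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            self.in_head = False  # tolerate a missing </head>
        elif tag == "link":
            attributes = dict(attrs)
            rel_values = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.found.append((attributes.get("href"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def canonical_issues(html: str) -> list:
    """Return a list of problems: 'missing', 'multiple', 'outside_head'."""
    finder = CanonicalFinder()
    finder.feed(html)
    issues = []
    if not finder.found:
        issues.append("missing")
    if len(finder.found) > 1:
        issues.append("multiple")
    if any(not in_head for _, in_head in finder.found):
        issues.append("outside_head")
    return issues
```

Both quoting styles mentioned above (`rel='canonical'` and `rel="canonical"`) are parsed the same way by this sketch.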
What does rel=canonical offer?
• Circumvents Duplicate Content
• Avoids diluting Link Authority
• Avoids Content Cannibalisation in SERPs
Specific cases for rel=canonical
Uppercase/lowercase URL paths, Session IDs, Tracking Codes
Product review pages with /review/product/list/
Multiple versions of category pages derived from dynamic filters
Product Page: Multiple versions
‘Show all’ category pages: if a different URL
Content Syndication
The cleaner you can make
your signals, the more
likely we'll use them.
John Mueller
Reddit AMA, April 2018
Google chose different canonical
than user – There are many cases
where Google simply gets this wrong.
Are there any methods that would
force Google to honor the canonical
specified by the webmaster?
• Redirect to your preferred version
• Make internal links, hreflang, rel=next/prev/etc.
point to the preferred version
• Put it into a sitemap file, etc.
10 Common Issues
Abundance: Too many pages canonicalizing to a single page
Code: rel=canonical in <body>, multiple declarations, etc.
Content: Lack of parity between canonical and canonicalized
Duplication: Too little canonicalization
hreflang: Canonicalizing pages in a hreflang cluster to one language variant
HTTP Codes: Canonicalizing to non-200 HTTP status codes
Linking: More links to canonicalized page rather than canonical
Noindex: Noindex present on a canonicalized page
Pagination: Canonicalizing component pages to the first page in a paginated set
Tracking: Parameters generating duplicate URLs
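The "Abundance" issue can be surfaced by counting how many pages point at each canonical target. A minimal sketch, assuming crawl data is available as a mapping of page URL to declared canonical; the function names and the default threshold of 5 are illustrative assumptions.

```python
from collections import Counter


def incoming_canonical_counts(canonical_map: dict) -> Counter:
    """Count pages canonicalizing to each target, ignoring self-references."""
    return Counter(
        target
        for source, target in canonical_map.items()
        if target and target != source
    )


def heavy_targets(canonical_map: dict, threshold: int = 5) -> dict:
    """Targets receiving more than `threshold` incoming canonicals."""
    counts = incoming_canonical_counts(canonical_map)
    return {url: n for url, n in counts.items() if n > threshold}


# Hypothetical crawl extract: six filter pages all point at one category page.
crawl = {f"https://www.example.com/shoes?f={i}": "https://www.example.com/shoes" for i in range(6)}
crawl["https://www.example.com/shoes"] = "https://www.example.com/shoes"
print(heavy_targets(crawl))  # {'https://www.example.com/shoes': 6}
```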
Non-canonical Gaining Impressions
11 Step Canonical Audit
Review GSC Index Coverage Report
Build Data Warehouse Including: Simulated Web Crawl, Logs, JS, GSC, GA/Adobe
Review Duplicate Content Situation
Assess Crawl Budget Impact
Assess Canonical Content Similarity
Check Internal Linking Signals (canonical should receive most internal links)
XML Sitemap Check (should only contain canonical URLs)
Check URLs with canonicals pointing to a 404 or noindex
Check URLs missing a canonical element
Check paginated URLs have a self-referencing canonical
Check hreflang clusters have a self-referencing canonical
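Several of these checks reduce to simple lookups once crawl data is in one place. The following is a rough sketch of three of them (missing canonical, canonical pointing to a non-200 or noindex URL, and non-canonical URLs in the XML sitemap); the `Page` record and its field names are assumptions made for illustration, not a real crawler's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    url: str
    status: int
    canonical: Optional[str] = None  # declared canonical URL, if any
    noindex: bool = False


def audit(pages: list, sitemap_urls: set) -> dict:
    """Group indexable (200) pages by the canonical problem they show."""
    by_url = {p.url: p for p in pages}
    findings = {
        "missing_canonical": [],
        "canonical_to_bad_target": [],
        "non_canonical_in_sitemap": [],
    }
    for page in pages:
        if page.status != 200:
            continue
        if page.canonical is None:
            findings["missing_canonical"].append(page.url)
            continue
        if page.canonical != page.url:
            target = by_url.get(page.canonical)
            # Only judge targets that were actually crawled.
            if target is not None and (target.status != 200 or target.noindex):
                findings["canonical_to_bad_target"].append(page.url)
            if page.url in sitemap_urls:
                findings["non_canonical_in_sitemap"].append(page.url)
    return findings
```

Canonical targets that were never crawled are skipped here rather than flagged; a fuller audit would fetch them before drawing conclusions.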
Review GSC Index Coverage Report
Use Google’s Index Coverage Report
Valid - Indexed; consider marking as canonical: The URL was indexed. Because it has duplicate URLs, we recommend explicitly marking this URL as canonical.
Excluded - Duplicate page without canonical tag: This page has duplicates, none of which is marked canonical. We think this page is not the canonical one. You should explicitly mark the canonical for this page.
Google chose different canonical than user: This page is marked as canonical for a set of pages, but Google thinks another URL makes a better canonical. Google has indexed the page we consider canonical rather than this one. We recommend that you explicitly mark this page as a duplicate of the canonical URL.
Submitted URL not selected as canonical: The difference between this status and "Google chose different canonical than user" is that, in this case, you explicitly requested indexing.
Data Warehouse
Build Data Warehouse Including: Simulated Web Crawl, Logs, JS, GSC, GA/Adobe
Assess Crawl Budget Impact
Canonical Conversion
Canonical Similarity
Assess Canonical Content Similarity
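One way to approximate content similarity between a canonicalized page and its canonical is a token-level diff ratio. This is only a sketch: the similarity measure behind the figures in this deck is not described, so `difflib.SequenceMatcher` is a stand-in, and the 50% threshold simply mirrors the "less than 50% similar" column used in the comparison table.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Ratio in [0, 1] of matching word sequences between two page texts."""
    return SequenceMatcher(None, text_a.split(), text_b.split()).ratio()


def weak_canonicals(pairs: list, threshold: float = 0.5) -> list:
    """Return (page_url, canonical_url, score) where the content parity is low.

    `pairs` holds (page_url, page_text, canonical_url, canonical_text) tuples.
    """
    flagged = []
    for page_url, page_text, canonical_url, canonical_text in pairs:
        score = similarity(page_text, canonical_text)
        if score < threshold:
            flagged.append((page_url, canonical_url, score))
    return flagged
```

Pages flagged here are the ones where the canonical hint is most likely to be ignored, since the two URLs are not duplicates or near-duplicates.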
How do sites compare?
Industry | URLs Crawled | Known URLs | Number of Compliant URLs Crawled | Canonical Not Equal Volume | Meta Noindex + Canonical Not Equal or Bad Status Code | Total Canonical Not Equal Volume | % Canonical Not Equal | Duplicate Content: No. of Pages with Similarity > 90% | % of Pages with Similarity > 90% | Pages Less Than 50% Similar to Canonical | % of Canonicalised Pages Less Than 50% Similar to Canonical
Travel | 4616 | 4616 | 3376 | 513 | 2 | 515 | 11% | 419 | 12% | 150 | 29%
Retail | 22161 | 22161 | 8010 | 8890 | 262 | 9152 | 41% | 1386 | 17% | 36 | 0%
Retail | 25720 | 25270 | 19216 | 2130 | 0 | 2130 | 8% | 2770 | 14% | 57 | 3%
Retail | 43499 | 43499 | 39663 | 3751 | 16 | 3767 | 9% | 1112 | 3% | 3465 | 92%
Classified | 123336 | 123336 | 122085 | 34 | 0 | 34 | 0% | 7716 | 6% | 20 | 59%
Publishing | 316487 | 316487 | 220597 | 50278 | 4519 | 54797 | 17% | 4391 | 2% | 5824 | 11%
Travel | 366068 | 366068 | 171113 | 71425 | 69354 | 140779 | 38% | 24256 | 14% | 4927 | 3%
Travel | 421166 | 421166 | 115144 | 43855 | 72 | 43927 | 10% | 33293 | 29% | 2649 | 6%
Retail | 1141182 | 1141182 | 654029 | 142920 | 21480 | 164400 | 14% | 53924 | 8% | 120734 | 73%
Retail | 2798951 | 2798951 | 727844 | 911137 | 701798 | 1612935 | 58% | 15786 | 2% | 280547 | 17%
Flagged: >0; >10%; >10%; >10%
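The percentage columns in this table appear to be simple ratios of the count columns. The sketch below recomputes them for three of the rows; the denominators are inferred from the figures rather than stated on the slide (they do reproduce its rounded values), and the function and key names are hypothetical.

```python
def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number, 0 if the base is empty."""
    return round(100 * part / whole) if whole else 0


# (industry, URLs crawled, compliant URLs crawled, total canonical not equal,
#  pages with similarity > 90%, pages < 50% similar to canonical)
rows = [
    ("Travel", 4616, 3376, 515, 419, 150),
    ("Retail", 22161, 8010, 9152, 1386, 36),
    ("Retail", 43499, 39663, 3767, 1112, 3465),
]


def summarise(row: tuple) -> dict:
    industry, crawled, compliant, not_equal, dup_90, weak_50 = row
    return {
        "industry": industry,
        # Assumed base: all crawled URLs.
        "pct_canonical_not_equal": pct(not_equal, crawled),
        # Assumed base: compliant URLs only.
        "pct_similarity_over_90": pct(dup_90, compliant),
        # Assumed base: URLs whose canonical is not equal to themselves.
        "pct_under_50_similar": pct(weak_50, not_equal),
    }


for row in rows:
    print(summarise(row))
```

Read this way, the 92% figure for the third site means that almost all of its canonicalized pages point at content that is less than 50% similar, which is the "lack of parity" issue rather than a duplication issue.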
How do sites compare?
Industry | Number of URLs Crawled by Botify & Google | Number of Compliant Pages Crawled by Botify & Google | % of Compliant Pages Crawled by Botify & Google | Number of URLs Crawled >80% | Number of Compliant URLs Crawled >80% | % of Compliant Pages Crawled >80% | Number of URLs Crawled 20%-79% | No. of Incoming Canonical Tags >5 | No. of Incoming Canonical Tags >10 | No. of Incoming Canonical Tags >50 | Canonical Not Equal but Present in Sitemap
Travel | 1061 | 850 | 25% | 186 | 161 | 5% | 570 | 2 | 1 | 0 | 0
Retail | 10884 | 5830 | 73% | 1223 | 1099 | 14% | 2868 | 681 | 138 | 0 | 0
Retail | 10681 | 9969 | 52% | 151 | 149 | 1% | 2060 | 87 | 48 | 4 | 0
Retail | 29328 | 29169 | 74% | 468 | 468 | 1% | 3083 | 132 | 103 | 13 | 0
Classified | 103098 | 102492 | 84% | 6445 | 6442 | 5% | 28623 | 0 | 0 | 0 | 0
Publishing | 132887 | 123466 | 56% | 8607 | 8603 | 4% | 38415 | 358 | 243 | 84 | 40
Travel | 163586 | 94749 | 55% | 6528 | 6376 | 4% | 41905 | 205 | 100 | 27 | 3
Travel | 115783 | 69293 | 60% | 6312 | 6244 | 5% | 40874 | 297 | 181 | 1 | 3
Retail | 318000 | 232201 | 36% | 8708 | 8392 | 1% | 26106 | 70 | 58 | 46 | 730
Retail | 961585 | 704797 | 97% | 170270 | 166363 | 23% | 479940 | 67966 | 40101 | 0 | 0
Flagged: <80%; <20%
Fix Upstream Where Possible
rel=canonical Fixes
TECHNIQUE: DETAIL
Upstream
Standardised URLs: Get shot of event tracking, session IDs, query strings, etc.
Consistent Internal Links: Use 301 if absolutely necessary
Robots.txt: Disallow: /*?query=
GSC: Exclude Parameters
Downstream
rel=canonical: Consistent Signals
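The "Standardised URLs" fix can be applied wherever internal links are generated. A minimal sketch, assuming a hypothetical list of parameters to strip (`utm_*` and `sessionid` are illustrative names, not taken from the deck); which parameters are safe to remove depends on the site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that only track or identify sessions.
STRIP_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def standardise(url: str) -> str:
    """Lowercase scheme and host, drop tracking/session parameters and fragments."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in STRIP_PARAMS
    ]
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


print(standardise("https://WWW.Example.com/shoes?utm_source=news&colour=red&sessionid=abc"))
# https://www.example.com/shoes?colour=red
```

Using one such function for internal links, sitemap entries and the canonical element keeps the three signals pointing at the same URL, so rel=canonical is left as the downstream safety net.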
Conclusion
Be consistent:
“Consistency is the
mother of all good SEO.”
Matt Cutts 2009 | John Mueller 2016
@SearchMATH
http://www.slideshare.net/MarkThomas114
