Using Python and Data Science Practices in SEO Analysis of Data
Benj Arriola
85SIXTY
Speakerdeck.com/benjarriola
@benjarriola
Without Python and Data Science Practices
● SEO Tools
  ● Crawling Tools, Audit Tools, Keyword Research, Content Analysis, Crawlability, Indexability, Rank Tracking, Web Analytics, Social Listening, Backlink Analysis, etc.
● Analysis of Data
  ● Data often ends up in a spreadsheet.
  ● You sort, filter, join with VLOOKUPs, summarize with pivot tables, and visualize with charts and graphs.
● Recommended Action Items
  ● Update content
  ● Update code
  ● Update server settings
  ● Try to get other sites to update their sites
● Implementation
  ● SEO / Marketers / Content Managers / Writers with CMS Access
  ● Developers with Backend Code Access
  ● IT / Network Admin / Server Admin with Server Access
With Python and Data Science Practices
● SEO Tools
  ● Crawling Tools, Audit Tools, Keyword Research, Content Analysis, Crawlability, Indexability, Rank Tracking, Web Analytics, Social Listening, Backlink Analysis, etc.
● Analysis of Data
  ● Data often ends up in a spreadsheet.
  ● You sort, filter, join with VLOOKUPs, summarize with pivot tables, and visualize with charts and graphs.
● Recommended Action Items
  ● Update content
  ● Update code
  ● Update server settings
  ● Try to get other sites to update their sites
● Implementation
  ● SEO / Marketers / Content Managers / Writers with CMS Access
  ● Developers with Backend Code Access
  ● IT / Network Admin / Server Admin with Server Access
● Export as Spreadsheets, or Use APIs
● Export Results of Analysis
● Integrate with CMS for Automated Updates
● Converting your manual analysis into steps your Python script will do
Typically, the biggest challenges are (1) converting your manual analysis into distinct steps and (2) turning these steps into Python instructions.
Python analysis of data and integration with CMSs for automated updates may take time to set up, but once completed, everything will be faster.
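As a rough illustration of turning spreadsheet habits into distinct Python steps, the sketch below mirrors sort, filter, VLOOKUP-style join, and pivot-style summary using only the standard library. The rows, column names, URLs, and the 1,000-search threshold are made-up assumptions, not data from the talk.

```python
from collections import defaultdict

# Hypothetical rows, as they might come from a keyword tool export.
keywords = [
    {"keyword": "running shoes", "volume": 9000, "url": "/shoes"},
    {"keyword": "trail running shoes", "volume": 2500, "url": "/shoes"},
    {"keyword": "running socks", "volume": 800, "url": "/socks"},
]
# Hypothetical second sheet: revenue per landing page.
revenue_by_url = {"/shoes": 12000.0, "/socks": 900.0}

# Sort: highest search volume first.
by_volume = sorted(keywords, key=lambda row: row["volume"], reverse=True)

# Filter: keep keywords with at least 1,000 searches (assumed threshold).
popular = [row for row in by_volume if row["volume"] >= 1000]

# Join (the VLOOKUP step): attach page revenue to each keyword row.
joined = [{**row, "revenue": revenue_by_url.get(row["url"], 0.0)} for row in popular]

# Summarize (the pivot-table step): total search volume per URL.
volume_per_url = defaultdict(int)
for row in keywords:
    volume_per_url[row["url"]] += row["volume"]
```

In pandas, the same four steps would typically map to `sort_values`, boolean filtering, `merge`, and `pivot_table`.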
Why Use Python and Data Science Practices?
● Handling Large Amounts of Data
● Repetitive Processes
● Faster Data Gathering, Analysis & Implementation
Sources of Data
● Crawling / Audit Tools
● Keyword Research
● Content Analysis
● Crawlability / Indexability
● Rank Tracking
● Web Analytics
● Social Listening
● Backlink Analysis
Vector Embeddings
● Some NLP Vector Embedding Factors
  ● Semantic Similarity
  ● Gender and Role Relationships
  ● Tense and Parts of Speech
  ● Polarity and Sentiment
  ● Formal vs. Informal Language
  ● Topical Relevance
  ● Lexical Hierarchy
  ● Contextual Dependency
  ● Sentential Structure
  ● Emotional Tone or Connotation
● SEO Tools Known to Use Vector Embeddings
  ● Screaming Frog
  ● inLinks
  ● WordLift
  ● MarketMuse
  ● MarketBrew
Example Applications
● Reporting Dashboards and Analysis from Various Tools
● Determining Revenue Per Keyword
● Quantifying the Effect of Pagespeed Updates on Traffic and Revenue
● Keyword Research
● Targeting the Best Keywords
● Vector Embedding
● Finding Duplicate Content
● Identifying Internal Linking Opportunities
● Updating Image Alt Text at Scale
● Writing Title Tags and Meta Descriptions for Thousands or Millions of Pages
Keyword Research: Python & Data Science Use Case
Common Keyword Research Process
● Objective
  ● Determine the best keywords to target and optimize
● General Process
  ● Start with some initial seed words
  ● Use keyword research tools to get more keyword ideas
  ● Analyze the data to determine which keywords to target
  ● Finalize the primary target keywords
Process flow: Seed Words → Keyword Exploration → Keyword Analysis → Primary Target Keywords
Common Keyword Research Process
● Seed Words
  ● Common Sense
  ● Current Ranking Keywords (GSC, SEMRush)
  ● Main Navigation
● Keyword Exploration
  ● Competitor Keywords – SEMRush
● Keyword Analysis
  ● Python Script
● Target Keywords
  ● Exported from Python Script
Keyword Analysis
● Analysis Stage: Converting your manual analysis into steps your Python script will do
  ● More popular industries
  ● More competitors
  ● More current ranking keywords
  ● More website pages
  ● More product lines / services
All of these mean more keywords to analyze.
Break the analysis down into steps that narrow hundreds or millions of keywords down to a smaller number of keywords to target.
Target Keywords – Narrow Down
● Narrow Down Rules
  ● A good balance of long-tail and head terms
    ● Why? Get the low-hanging fruit and monitor the journey to the general head terms
  ● A balance of transactional, informational, and navigational keywords
    ● Why?
      ● Transactional, for more sales
      ● Informational, for new customer discovery and link bait purposes
      ● Navigational, for customer support and ORM (online reputation management)
  ● A balance of high KEI (Keyword Effectiveness Index) and high search volume
    ● Why? Low-hanging fruit brings benefits and sales sooner, while you work on the higher search volume keywords in the process
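The rules above need each keyword labeled before anything can be balanced. The sketch below is one possible way to do that labeling, not the speaker's script: the three-word long-tail cutoff, the trigger-word lists, and the KEI formula (search volume squared divided by competition, one commonly cited definition) are all assumptions.

```python
# Hypothetical trigger words; a real project would tune these lists.
TRANSACTIONAL = {"buy", "price", "cheap", "deal", "coupon"}
INFORMATIONAL = {"how", "what", "why", "guide", "best"}


def classify_length(keyword, long_tail_min_words=3):
    """Label a keyword 'long tail' or 'head' by word count (an assumed rule)."""
    return "long tail" if len(keyword.split()) >= long_tail_min_words else "head"


def classify_intent(keyword, brand_terms=()):
    """Label search intent from simple word matches (an assumed heuristic)."""
    words = set(keyword.lower().split())
    if words & set(brand_terms):
        return "navigational"
    if words & TRANSACTIONAL:
        return "transactional"
    if words & INFORMATIONAL:
        return "informational"
    return "unclassified"


def kei(search_volume, competition):
    """Assumed KEI definition: search volume squared over competition."""
    return search_volume ** 2 / competition if competition else 0.0
```

For example, `classify_intent("acme login", brand_terms=("acme",))` returns `"navigational"`, where `acme` stands in for your own brand name.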
Target Keywords – Narrow Down
● Narrow Down Rules
  ● Common keywords between competitors
    ● Why?
      ● Quicker way to narrow down industry-relevant keywords.
      ● Quicker to exclude competitor brands.
  ● A good balance of different topics
    ● Why? High search volume or high KEI can make keywords too focused on one thing. It might be one product, one service, or one product trend.
  ● Include and exclude list filter
  ● Long-tail keyword pattern analysis
Whether this is 1,000 keywords or 1,000,000 keywords, the Python script will run all rules and narrow the list down to whatever number of keywords you set.
Different SEOs may have different processes. As long as the process can be articulated as distinct steps, it can be transformed into Python code.
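To make the idea of "rules as distinct steps" concrete, here is a minimal sketch that chains three of the rules: shared competitor keywords, an exclude list, and a cap per topic to keep topics balanced. The function name, field names, sample data, and thresholds are hypothetical; the talk does not show the actual script.

```python
from collections import defaultdict


def narrow_down(rows, competitor_sets, exclude_terms, per_topic, min_competitors=2):
    """Apply the rules in order and return the surviving keyword rows."""
    kept = []
    for row in rows:
        kw = row["keyword"]
        # Rule: keyword is shared by at least `min_competitors` competitors.
        shared = sum(kw in s for s in competitor_sets)
        if shared < min_competitors:
            continue
        # Rule: exclude-list filter (for example, competitor brand names).
        if any(term in kw.split() for term in exclude_terms):
            continue
        kept.append(row)
    # Rule: balance topics by keeping the top `per_topic` keywords per topic.
    by_topic = defaultdict(list)
    for row in sorted(kept, key=lambda r: r["volume"], reverse=True):
        if len(by_topic[row["topic"]]) < per_topic:
            by_topic[row["topic"]].append(row)
    return [row for topic_rows in by_topic.values() for row in topic_rows]


# Hypothetical sample data.
rows = [
    {"keyword": "running shoes", "volume": 9000, "topic": "shoes"},
    {"keyword": "trail running shoes", "volume": 2500, "topic": "shoes"},
    {"keyword": "acme running shoes", "volume": 4000, "topic": "shoes"},
    {"keyword": "running socks", "volume": 800, "topic": "socks"},
    {"keyword": "shoe laces", "volume": 300, "topic": "laces"},
]
competitor_a = {"running shoes", "trail running shoes", "acme running shoes", "running socks"}
competitor_b = competitor_a | {"shoe laces"}

targets = narrow_down(rows, [competitor_a, competitor_b],
                      exclude_terms={"acme"}, per_topic=1)
```

The same function runs unchanged on a thousand or a million rows; only `per_topic` and the lists change.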
Implementation
● Having the targeted keywords alone is good enough for keyword research.
● Implementation:
  ● Use keywords in key areas of a page.
    ● Title Tag
    ● Meta Description
    ● Heading Tags
    ● Main Content
    ● Image Alt Text
    ● URLs
    ● Schema Markup
    ● Etc.
Implementation Automation
● Categorize Targeted Keywords
  ● Use the ChatGPT API
  ● Cost for running this for 100 keywords
    OpenAI's pricing for the GPT-4 model is $0.03 per 1,000 prompt tokens and $0.06 per 1,000 completion tokens.
    ● Prompt Cost: (150 tokens / 1,000) * $0.03 = $0.0045
    ● Completion Cost: (500 tokens / 1,000) * $0.06 = $0.03
    ● Total Cost: $0.0045 + $0.03 = $0.0345
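The arithmetic above can be wrapped in a small helper so the estimate is easy to rerun with other token counts or prices. The function name is hypothetical, and the default prices are simply the figures quoted on the slide; current prices may differ.

```python
def request_cost(prompt_tokens, completion_tokens,
                 prompt_price_per_1k=0.03, completion_price_per_1k=0.06):
    """Estimate the cost in dollars from token counts and per-1,000-token prices."""
    prompt_cost = prompt_tokens / 1000 * prompt_price_per_1k
    completion_cost = completion_tokens / 1000 * completion_price_per_1k
    return prompt_cost + completion_cost


# Token counts from the slide: 150 prompt tokens, 500 completion tokens.
cost = request_cost(150, 500)
print(f"${cost:.4f}")  # prints $0.0345
```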
Implementation Automation
● Assigning a category and subcategory to a page
  ● Uses Screaming Frog's vector embeddings
  ● Each page is assigned the category and subcategory with the highest similarity match.
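A "highest similarity match" can be sketched as follows, assuming the page and each category already have embedding vectors and that similarity means cosine similarity. Both are assumptions: the talk does not say how the Screaming Frog embeddings are exported or compared, and the three-dimensional vectors below are toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_category(page_vector, category_vectors):
    """Return the category name whose vector is most similar to the page vector."""
    return max(category_vectors,
               key=lambda name: cosine_similarity(page_vector, category_vectors[name]))


# Toy embeddings; real ones have hundreds or thousands of dimensions.
category_vectors = {
    "shoes": [0.9, 0.1, 0.0],
    "socks": [0.1, 0.9, 0.0],
    "laces": [0.0, 0.2, 0.9],
}
page_vector = [0.8, 0.3, 0.1]
assigned = best_category(page_vector, category_vectors)
```

The same function can be run a second time over subcategory vectors to pick the subcategory.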
Implementation Automation
● Rewriting title tags if the main target keyword is not used
  ● For each category and subcategory, find the highest search volume keyword.
  ● If the keyword is not used in the title tag, use the ChatGPT API to rewrite the title so that it uses the keyword.
  ● Similar rules can be applied to the meta description.
● A manual QA check is a smart thing to do in the initial runs.
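The title-tag rule can be sketched as two steps: pick the highest-volume keyword per category, then flag titles that do not contain it. All names and data here are hypothetical, and `placeholder_rewrite` is only a stand-in for the ChatGPT API call so the example runs offline.

```python
def top_keyword_per_category(rows):
    """Map each category to its highest search volume keyword."""
    best = {}
    for row in rows:
        current = best.get(row["category"])
        if current is None or row["volume"] > current["volume"]:
            best[row["category"]] = row
    return {category: row["keyword"] for category, row in best.items()}


def titles_to_rewrite(pages, top_keywords, rewrite):
    """Return {url: new_title} for pages whose title lacks the target keyword."""
    updates = {}
    for page in pages:
        keyword = top_keywords.get(page["category"])
        if keyword and keyword.lower() not in page["title"].lower():
            updates[page["url"]] = rewrite(page["title"], keyword)
    return updates


def placeholder_rewrite(title, keyword):
    # Stand-in for the ChatGPT API call; swap in a real client here.
    return f"{keyword.title()} | {title}"


# Hypothetical sample data.
keyword_rows = [
    {"keyword": "running shoes", "volume": 9000, "category": "shoes"},
    {"keyword": "trail running shoes", "volume": 2500, "category": "shoes"},
    {"keyword": "running socks", "volume": 800, "category": "socks"},
]
pages = [
    {"url": "/shoes", "title": "Running Shoes for Every Runner", "category": "shoes"},
    {"url": "/socks", "title": "Our Sock Collection", "category": "socks"},
]

top_keywords = top_keyword_per_category(keyword_rows)
updates = titles_to_rewrite(pages, top_keywords, placeholder_rewrite)
```

Returning the proposed titles as a dictionary, rather than writing them straight to the CMS, leaves room for the manual QA pass before anything is published.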
Getting Started & Learning More: Other Information & Resources
Potential Challenges
● The Learning Curve
● Not everyone learns at the same pace
● The Technical SEO team does it for the rest of the team
● Great internal training program
● Create an application with a user-friendly interface (Streamlit, Dash, Flask, Django)
● Hire people
● Partner with other companies
● Tool Ownership
● You or your employer?
Other Resources
● Learning Python and pandas Basics
● YouTube
● Coursera
● Udemy
● ChatGPT
● SEO Application
● Jacky Chou - Indexy
● Greg Bernhardt – ImportSEM
● JC Chouinard
Welcome to San Diego!
Thanks
Benj Arriola
Has spoken at 48 conferences in 21 cities across 4 countries since 2007.
Won major prizes in 7 international SEO competitions from 2005 to 2009, including cash and a brand-new car!
Python
Usage
Questions?
