Doing More with Less: Automated, High-Quality Content Generation
HAMLET BATISTA
CEO AND FOUNDER, RANKSENSE INC.
@hamletbatista #SEJLive
You are Leveraging AI in Your SEO Work Already
Using Google Docs or Gmail Smart Compose?
Facing Writer’s Block?
Start typing and hit the Tab key for full-sentence ideas!
https://transformer.huggingface.co/doc/distil-gpt2
How to Go Deeper with Keyword Research
“We are in the era where intent-based searches are more important to us than pure volume.”
“You should take the extra step to learn the questions customers are asking and how they describe their problems.”
“Go from keywords to questions”
-Mindy Weinstein
SEJ / May 9, 2020
https://www.searchenginejournal.com/deep-keyword-research-process/214096/
What is the Opportunity?
1. Search engines are answer engines these days.
2. One effective way to write original and popular content is to answer your target audience’s most important questions.
3. FAQ search snippets take up more real estate in the SERPs.
4. Doing this manually is expensive and time-consuming.
5. Let’s automate it by leveraging AI and existing content assets.
Leveraging Existing Knowledge
1. Most established businesses have valuable knowledge bases.
2. Much of that knowledge is not yet publicly available (support emails, chats, internal wikis).
Open Source AI + Proprietary Knowledge
Using a technique called Transfer Learning, we can produce original, quality content by combining proprietary knowledge bases and public deep learning models and datasets.
AGENDA
We are going to review automated question and answer generation approaches:
1. We will source popular questions using online tools
2. We will answer them using two NLG approaches:
   a. A span search approach
   b. A “closed book” approach
3. We will add FAQ schema and validate it using the Structured Data Testing Tool (SDTT)
4. Resources to learn more
SOURCING POPULAR QUESTIONS
● Answer the Public
● Question Analyzer by BuzzSumo
● AlsoAsked.com
QUESTION ANSWERING SYSTEM
Papers with Code
Question Answering is an area of very active research
https://paperswithcode.com/task/question-answering
Stanford Question Answering Dataset
A SPAN SEARCH APPROACH
In Just 3 Lines of Python Code
# Install the library (notebook shell command)
!pip install transformers

from transformers import pipeline

# Allocate a pipeline for question-answering
nlp = pipeline('question-answering')

# Ask a question against a context that contains the answer
nlp({
    'question': 'What is the name of the repository ?',
    'context': 'Pipeline have been included in the huggingface/transformers repository'
})

# Output:
{'answer': 'huggingface/transformers',
 'end': 59,
 'score': 0.5135626548884602,
 'start': 35}
A SPAN SEARCH APPROACH
● Load the Transformers NLP library
  https://github.com/huggingface/transformers
● Allocate a Question Answering pipeline
  https://huggingface.co/transformers/usage.html#extractive-question-answering
● Provide the question and context (content/text most likely to include the answer)
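To make the term "span search" more concrete, here is a minimal sketch of how an extractive model's output can be turned into an answer. It assumes the model emits one start score and one end score per context token, and that the answer is the span with the highest combined score. The function name `best_span`, the `max_len` cap, and the score values are hypothetical and only for illustration; the Transformers pipeline handles this step internally and its actual implementation may differ.

```python
from typing import List, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 15) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring token span."""
    if len(start_scores) != len(end_scores) or not start_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    best = (0, 0, float("-inf"))
    for i, start_score in enumerate(start_scores):
        # The end token may not precede the start token, and the span
        # is capped at max_len tokens.
        for j in range(i, min(i + max_len, len(end_scores))):
            score = start_score + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


context = "Pipeline have been included in the huggingface/transformers repository"
tokens = context.split()

# Hypothetical per-token scores (a real model would produce these).
start = [0.1, 0.0, 0.0, 0.2, 0.1, 0.3, 4.0, 0.5]
end = [0.0, 0.1, 0.0, 0.1, 0.0, 0.2, 3.5, 1.0]

i, j, score = best_span(start, end)
answer = " ".join(tokens[i:j + 1])
print(answer)
```

Because the answer is always a slice of the context, this approach can only return text that already exists in the content you provide.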
How to Get the Context
# Install the library (notebook shell command)
!pip install requests-html

from requests_html import HTMLSession

session = HTMLSession()
url = "https://www.searchenginejournal.com/uncover-powerful-data-stories-phyton/328471/"
# CSS selector for the article body
selector = "#post-328471 > div:nth-child(2) > div > div > div.sej-article-content.gototop-pos"

# Fetch the page and extract the text of the first matching element
with session.get(url) as r:
    post = r.html.find(selector, first=True)
    text = post.text
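Once you have a list of sourced questions and a context, the two can be combined in a loop. The sketch below is one possible way to do that, not the method from the talk: `answer_questions`, the `min_score` cutoff of 0.3, and the `fake_qa` stand-in are all assumptions made for illustration. In a notebook you would pass the real question-answering pipeline (called with `question=` and `context=` keyword arguments) instead of the stub.

```python
from typing import Callable, Dict, List


def answer_questions(questions: List[str], context: str,
                     qa_fn: Callable[..., Dict],
                     min_score: float = 0.3) -> List[Dict]:
    """Run qa_fn on every question and keep only confident answers."""
    kept = []
    for question in questions:
        result = qa_fn(question=question, context=context)
        # Low scores usually mean the context does not contain the answer.
        if result["score"] >= min_score:
            kept.append({
                "question": question,
                "answer": result["answer"],
                "score": result["score"],
            })
    return kept


def fake_qa(question: str, context: str) -> Dict:
    """Hypothetical stand-in for a question-answering pipeline."""
    if "repository" in question:
        return {"answer": "huggingface/transformers", "score": 0.51,
                "start": 35, "end": 59}
    return {"answer": "", "score": 0.01, "start": 0, "end": 0}


context = "Pipeline have been included in the huggingface/transformers repository"
questions = [
    "What is the name of the repository ?",
    "Who founded the company ?",
]
faqs = answer_questions(questions, context, fake_qa)
print(faqs)
```

The threshold is a judgment call: a higher value yields fewer but more reliable answers, and every kept answer should still be reviewed before publishing.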
SOMETHING A BIT MORE AMBITIOUS
Exploring the Limits of NLG with T5 and Turing-NLG
Google’s T5 (11-billion parameter model) and Microsoft’s Turing-NLG (17-billion parameter model) are able to answer questions without being given any context! 🤯🤯🤯
Open Book vs Closed Book Question Answering
Closed Book Trivia Challenge with T5
Google’s T5 team went head-to-head with the 11-billion parameter model in a pub trivia challenge and lost! 😅
https://t5-trivia.glitch.me/
LET’S TRAIN, FINE-TUNE and LEVERAGE T5
Here is T5 Answering Arbitrary Questions
We are going to train the 3-billion parameter model using a free Google Colab TPU.
These are some example predictions.
HERE IS THE TECHNICAL PLAN
● Copy the example Colab notebook to your Google Drive
● Change the runtime environment to Cloud TPU
● Create a Google Cloud Storage bucket (use the free $300 in credits)
● Provide the bucket path to the notebook
● Select the 3-billion parameter model
● Run the remaining cells up to the prediction step
We won’t need to write any Python code 😞
Copy the Colab Notebook to Your Google Drive
Change the Runtime Environment to Cloud TPU
Create a Google Cloud Storage Bucket
Provide the Bucket Path to the Notebook
Select the 3-Billion Parameter Model
Run the Remaining Cells up to the Prediction Step
FINE-TUNING TO ADD PROPRIETARY KNOWLEDGE
Add New Proprietary Training Datasets
1. Preprocess your proprietary knowledge base into a format that can work with T5
2. Adapt the existing code for this purpose (Natural Questions, TriviaQA)
Add New Proprietary Training Datasets (continued)
1. Extract
2. Transform
3. Load
https://www.searchenginejournal.com/machine-learning-practical-introduction-seo-professionals/366304/
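The "transform" and "load" steps can be sketched roughly as follows. T5 is a text-to-text model, so each training example is an input string paired with a target string. This sketch assumes your knowledge base has already been extracted into question/answer pairs, that the training code reads tab-separated files with one pair per line, and that a task prefix such as "trivia question: " is prepended to the input. The prefix, the function names, and the sample pair are assumptions for illustration; check the notebook you are adapting for the exact format it expects.

```python
import csv
import io
from typing import Dict, Iterable, Tuple


def to_tsv(pairs: Iterable[Tuple[str, str]]) -> str:
    """Serialize (question, answer) pairs as one tab-separated line each."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for question, answer in pairs:
        # Collapse tabs and newlines so each example stays on one line.
        writer.writerow([" ".join(question.split()), " ".join(answer.split())])
    return buf.getvalue()


def to_t5_example(question: str, answer: str,
                  prefix: str = "trivia question: ") -> Dict[str, str]:
    """Build a text-to-text example; the prefix is an assumed task label."""
    return {
        "inputs": prefix + question.lower().strip(),
        "targets": answer.strip(),
    }


# Hypothetical pair, e.g. taken from a support email thread.
pairs = [("How do I reset\nmy password?", "Use the reset link on the login page.")]

tsv = to_tsv(pairs)
example = to_t5_example("How do I reset my password?",
                        "Use the reset link on the login page.")
print(tsv)
print(example)
```

The resulting file can then be uploaded to the Cloud Storage bucket so the notebook can read it alongside the public datasets.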
ADDING FAQ SCHEMA
https://developers.google.com/search/docs/data-types/faqpage
https://www.searchenginejournal.com/introduction-modern-javascript-for-seo/347836/
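As a rough illustration of the markup step, the sketch below builds FAQPage structured data (schema.org `FAQPage`, `Question`, and `Answer` types) from a list of question/answer pairs and wraps it in a JSON-LD script tag. The function names and the sample pair are hypothetical. How you inject the tag into the page is up to you, and the output should still be checked in a structured data testing tool.

```python
import json
from typing import Dict, List


def faq_jsonld(faqs: List[Dict[str, str]]) -> str:
    """Build FAQPage JSON-LD from dicts with 'question' and 'answer' keys."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": faq["question"],
                "acceptedAnswer": {"@type": "Answer", "text": faq["answer"]},
            }
            for faq in faqs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in a script tag, escaping '</' so it cannot close the tag."""
    safe = jsonld.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + safe + "\n</script>"


# Hypothetical generated pair, for illustration only.
faqs = [{
    "question": "What is the name of the repository?",
    "answer": "huggingface/transformers",
}]

markup = faq_jsonld(faqs)
tag = script_tag(markup)
print(tag)
```

Only mark up questions and answers that are actually visible on the page.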
RESOURCES TO LEARN MORE
1. Introduction to Python for SEOs
2. Introduction to Machine Learning for SEOs
3. Leverage SOTA models with one line of code
4. Exploring Transfer Learning with T5
5. Deep Learning on Steroids with the Power of Knowledge Transfer!
6. MarketMuse First Draft
@RankSense#SEJLive
About RankSense
Automate tedious SEO tasks in Google Sheets.
Import the sheets and deploy them as experiments to Cloudflare.
Learn which changes are effective.
https://www.ranksense.com

