Utilizing The Natural
Language Toolkit for
Keyword Classification &
Cluster Analysis
Miracle Inameti-Archibong | Erudite
@mira_inam
How to do
keyword
research and
keep the will
to live
BrightonSEO Summer 2021
Why keyword research?
Keyword research is the process of identifying which terms or phrases your target audience uses to search for your goods and services.
Why do we
need it?
It enables you to analyse user behaviour in order to evaluate whether your content satisfies the searcher’s intent.
Discover new opportunities.
Related products.
Should filtered pages be optimised and indexable?
How do your consumers research?
Create a search
and user-
friendly
information
architecture.
What do we
need for
keyword
research?
Keywords and search
volume
*Other keyword research tools exist
https://erudite.agency/insights/how-to-conduct-an-in-depth-
Analysing the
data.
Categories.
Use case.
Intent.
@mira_inam #BrightonSEO
Manual keyword data sorting.
Using Excel, look for terms or phrases that occur frequently and tag them with the desired category.
Steps: cleaning, categorisation, analysis.
Keywords: 10,000
Clocked hours: 28:45:00
Automating frequency tagging with the Natural Language Toolkit (NLTK).
NLTK is a leading
platform for building
Python programs to work
with human language
data.
Source - https://www.nltk.org/
Natural language processing with Python
Classification
Tokenization
Stemming
Tagging
Parsing
Semantic reasoning
Natural language processing with Python
Classification and tokenization: pull out the individual tokens from the text and classify them by part of speech (nouns, verbs).
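As a rough illustration of the tokenization step, the sketch below splits a keyword phrase into lower-case word tokens. It deliberately uses only Python's standard `re` module so it runs without any downloads; this is a stand-in, not the deck's script. NLTK provides its own `word_tokenize` and `pos_tag` functions for tokenizing and for labelling tokens as nouns or verbs, but those need NLTK's data packages installed, so they are not shown here. The function name `tokenize` and the sample phrase are made up for the example.

```python
import re


def tokenize(keyword):
    """Lower-case a keyword phrase and split it into word tokens."""
    # \w+ matches runs of letters, digits and underscores,
    # so punctuation such as hyphens acts as a separator.
    return re.findall(r"\w+", keyword.lower())


tokens = tokenize("Best Coffee-Machines for home")
print(tokens)  # ['best', 'coffee', 'machines', 'for', 'home']
```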
Natural language processing with Python
Instruction: determine categories based on word frequency.
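Frequency-based categorisation could look something like the sketch below. It is an assumption about the general idea, not the script from this deck: it counts how many keywords each term appears in, then tags every keyword with its most frequent term. It uses `collections.Counter` from the standard library rather than NLTK. The stop-word list, the `min_count` threshold and the sample keywords are all hypothetical.

```python
from collections import Counter

# Hypothetical list of common words to ignore when choosing a category.
STOP_WORDS = {"for", "the", "a", "of", "in", "to", "and"}


def term_frequencies(keywords):
    """Count how many keywords each term appears in."""
    counts = Counter()
    for keyword in keywords:
        # set() so a term repeated inside one keyword is counted once.
        counts.update(set(keyword.lower().split()) - STOP_WORDS)
    return counts


def tag_keywords(keywords, min_count=2):
    """Tag each keyword with its most frequent term, or 'other'."""
    counts = term_frequencies(keywords)
    tagged = {}
    for keyword in keywords:
        # Stop words have a count of 0, so they never qualify.
        terms = [t for t in keyword.lower().split() if counts[t] >= min_count]
        # Highest count first; ties are broken alphabetically.
        terms.sort(key=lambda t: (-counts[t], t))
        tagged[keyword] = terms[0] if terms else "other"
    return tagged


keywords = ["coffee beans", "coffee machine", "summer dresses", "red dresses", "tea pot"]
for keyword, category in tag_keywords(keywords).items():
    print(f"{keyword} -> {category}")
```

Raising `min_count` makes the tagging stricter, which may suit larger keyword lists where many terms occur only a handful of times.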
Everyday applications of NLP
Chatbots
Voice commands on various devices
Google’s knowledge graph – evolved
from keyword lookup to language parsing
Getting started.
Download and install Python.
python.org/downloads/
(preferably version 3.6.4)
You also need the following Python libraries.

pandas 1.2.4
Powerful data structures for data analysis, time series, and statistics.
https://pypi.org/project/pandas/

TextBlob
A simple API for common natural language processing tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.
https://textblob.readthedocs.io/en/dev/

xlrd & openpyxl
Both are libraries for reading data and formatting information from Excel files (xlrd for .xls, openpyxl for .xlsx).
https://xlrd.readthedocs.io/en/latest/
https://openpyxl.readthedocs.io/en/stable/
How to pip install the Python libraries
Pip install these libraries.
Download the requirement(1).txt file and save it on your computer (note the file path).
Open the command prompt.
❑ Open the Windows menu
❑ Type "command prompt"
❑ In the Command Prompt, copy and paste: pip install -r C:\some\path\to\requirements.txt
Paste the instruction and press Enter.
pip install -r C:\some\path\to\requirements\requirements.txt
Highlighted in pink is the path where your file is saved.
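Once pip has finished, one way to confirm the libraries are available is a quick check like the sketch below. It is not part of the deck's materials. It uses the standard library's `importlib.util.find_spec` to see whether each import name can be found; the list of names is taken from the libraries mentioned on the slides, and the function name `missing_packages` is made up for the example.

```python
import importlib.util

# Import names of the libraries listed on the slides.
REQUIRED = ["pandas", "textblob", "xlrd", "openpyxl"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Still to install:", ", ".join(missing))
else:
    print("All required libraries are available.")
```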
The analysis script.
Download and save the Python script.
Prepare your keywords list.
❑ Ensure your keywords list is clean, with no stray spaces or empty rows
❑ The column heading must be keywords
❑ The file must be .xlsx, not CSV
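If the list is large, the cleaning can be done in code before saving it as .xlsx. The sketch below works on a plain Python list of strings and is only an illustration: trimming spaces and dropping empty rows follows the checklist above, while lower-casing and removing duplicates are extra assumptions that may or may not suit your data. The function name `clean_keywords` is hypothetical.

```python
def clean_keywords(rows):
    """Trim whitespace, drop empty rows and remove exact duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        # Collapse runs of whitespace, trim both ends, and lower-case.
        keyword = " ".join(str(row).split()).lower()
        # Skip empty rows and anything already kept (assumed behaviour).
        if keyword and keyword not in seen:
            seen.add(keyword)
            cleaned.append(keyword)
    return cleaned


raw = ["  coffee beans ", "", "Coffee  Beans", "red dresses", "   "]
print(clean_keywords(raw))  # ['coffee beans', 'red dresses']
```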
Running the analysis.
❑ Open a command prompt
❑ Run the script
⮚ python C:\some\path\to\keywords.py C:\some\path\to\keywordlist.xlsx C:\some\path\to\test_output.xls
Replace the first two highlighted paths with where your files are saved:
C:\some\path\to\keywords.py
C:\some\path\to\keywordlist.xlsx
Replace the last highlighted path with where you want your test output file to be saved:
C:\some\path\to\test_output.xls
Run the script.
python C:\Users\mirac\Downloads\keywords.py C:\Users\mirac\Desktop\test.xlsx C:\Users\mirac\Downloads\test_output.xls
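The command passes two paths after the script name: the input keyword list and the output file. The contents of keywords.py are not shown in the slides, so the sketch below is only a guess at how a script of this kind might read those two arguments, using the standard library's `argparse`. The argument names `input_path` and `output_path` and the function `parse_paths` are hypothetical.

```python
import argparse


def parse_paths(argv=None):
    """Read the input and output paths from the command line."""
    parser = argparse.ArgumentParser(description="Tag a keyword list by frequency.")
    parser.add_argument("input_path", help="keyword list, e.g. an .xlsx file")
    parser.add_argument("output_path", help="where to write the tagged output")
    # argv=None makes argparse fall back to the real command line.
    args = parser.parse_args(argv)
    return args.input_path, args.output_path


# Passing a list here stands in for what would be typed at the prompt.
input_path, output_path = parse_paths(["keywordlist.xlsx", "test_output.xls"])
print("Reading from:", input_path)
print("Writing to:", output_path)
```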
Output.
Sort the data to identify trends.
Spot topics & themes from the data.
Running the script on Google Colab
Make a copy and save it on your Google Drive.
http://bit.ly/keywordclassification
Click on the folder icon to open the upload data prompt.
Upload your keywords data.
Upload the file you want to run.
The same rules about uploading a clean file still apply (see slide 31).
Very important – Keep all naming
conventions the same.
Keywords list – test.xlsx
Click run all or Ctrl+F9.
Run all
Download the output file.
Right-click the three dots and download the output file.
Download the output file.
It can be a bit buggy so if you
don’t see your output file wait
a few minutes or re-run.
Cluster classification
Pulls out the most common clusters by frequency in a histogram.
Example clusters: Coffee, Dresses
Pulls out word clouds.
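The slides do not show which clustering method the notebook uses, so the sketch below is a simple stand-in rather than the notebook's code: it groups keywords under the most frequent terms in the list and prints a text histogram of cluster sizes. The `n_clusters` parameter mirrors the idea of choosing how many clusters to keep. All function names and the sample keywords are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_by_top_terms(keywords, n_clusters=2):
    """Group keywords under the n_clusters most frequent terms."""
    counts = Counter(term for kw in keywords for term in set(kw.lower().split()))
    # Highest count first; ties are broken alphabetically for a stable result.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    top_terms = ranked[:n_clusters]
    clusters = defaultdict(list)
    for kw in keywords:
        terms = set(kw.lower().split())
        # The first matching top term wins; everything else is unclustered.
        label = next((t for t in top_terms if t in terms), "unclustered")
        clusters[label].append(kw)
    return dict(clusters)


def text_histogram(clusters):
    """Return one line per cluster, largest first, with a bar of '#'."""
    rows = sorted(clusters.items(), key=lambda item: (-len(item[1]), item[0]))
    return [f"{label:<12} {'#' * len(members)} {len(members)}" for label, members in rows]


keywords = [
    "coffee beans", "coffee machine", "coffee grinder",
    "red dresses", "summer dresses", "tea pot",
]
clusters = cluster_by_top_terms(keywords, n_clusters=2)
print("\n".join(text_histogram(clusters)))
```

With a larger `n_clusters`, fewer keywords fall into the unclustered group, at the cost of smaller and noisier clusters.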
Make a copy and save it on your Google Drive.
http://bit.ly/keywordcluster
How to use.
Running the data.
Upload the data and run the notebook the same way as on slides 32-35.
Increase the number of clusters
depending on the size of the data.
Click on cluster
analysis to view
the code.
Increase or
decrease the
number of clusters
No download.
Scroll to the bottom of the script
to use images of the cluster.
Feedback.
❑ What additional functions would you like to see?
❑ Send me a message @mira_inam.
Thanks to
Eniola Olaleye @galileoeni
Mahdi Mairza
View my slides
here in full.
erudite.agency/brighton-seo
Erudite Agency Ltd 2010-2020 ©
contact@erudite.agency
Let’s start your
growth journey.
If you would prefer our help prioritising and consulting on your audit, or for more about how we can help you increase your visibility and exposure, please email or call:
+44 (0) 1256 384890
