INTRODUCTION TO
WEB SCRAPING USING
PYTHON
Tushar Mittal
@techytushar
AGENDA
What we’ll do
What is Web Scraping?
Need for Web Scraping.
Real-Life Use Cases.
Workflow and Libraries used.
Demo (Scrape a Website)
Rules of Web Scraping.
WEB SCRAPING
What is it?
Web Scraping is a technique to fetch data and
information from websites.
Everything you see on a webpage can be
scraped.
It can be done in most programming languages;
we’ll use Python (because it’s a Python meetup :p).
NEED FOR WEB SCRAPING
But I Can Just Copy/Paste the Data
What about a thousand webpages, or even more?
When no API is provided, or the API allows only a
limited number of requests.
Online tools offer limited customization.
Learn something new and be your own boss!
USAGE
Real-Life Use Cases
Web crawlers.
E-commerce price comparison.
Preparing a dataset for your ML model.
Scraping Social Media Profiles.
Weather Data.
(Sky’s the limit)
WORKFLOW & LIBRARIES
Steps and Tools Involved
Send a request and load the webpage.
(Requests, urllib, httplib; httplib is named http.client in Python 3)
Parse the content for the desired data.
(Beautiful Soup, re, Scrapy)
Store the data the way you want.
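
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the demo from the talk: it assumes the bs4 (Beautiful Soup) package is installed, and the function names, the User-Agent string, and the sample HTML are all made up for the example. Fetching uses urllib, parsing uses Beautiful Soup, and storing uses the standard csv module.

```python
import csv
import sys
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4


def fetch_html(url, timeout=10):
    """Step 1: send a request and load the webpage as text."""
    # The User-Agent value is a placeholder chosen for this sketch.
    req = Request(url, headers={"User-Agent": "demo-scraper/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def parse_links(html):
    """Step 2: parse the content and keep only the data we want (links)."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"text": a.get_text(strip=True), "href": a["href"]}
        for a in soup.find_all("a", href=True)  # skip <a> tags without href
    ]


def save_csv(rows, fileobj):
    """Step 3: store the data, here as CSV written to any file-like object."""
    writer = csv.DictWriter(fileobj, fieldnames=["text", "href"])
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical page content, so the sketch runs without network access.
SAMPLE_HTML = """
<html><body>
  <a href="/docs">Docs</a>
  <a href="/blog"> Blog </a>
  <a>No link here</a>
</body></html>
"""

if __name__ == "__main__":
    save_csv(parse_links(SAMPLE_HTML), sys.stdout)
```

To run it against a live page, replace SAMPLE_HTML with the result of fetch_html(url) for a site you are allowed to scrape.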
LET’S SCRAPE SOME DATA
RULES OF WEB SCRAPING
Beware!
Don’t crawl at a disruptive rate.
Read the site’s Terms & Conditions of use.
Data is valuable; use it wisely.
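
One possible way to act on the first rule is to pause between requests. The sketch below also checks robots.txt rules, which the slides do not mention; that check is an added assumption based on common practice and is no substitute for reading the site's terms. The helper names, the bot name, and the one-second default delay are hypothetical. Only the standard library is used.

```python
import time
from urllib import robotparser


def allowed_by_robots(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines permit fetching the URL."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_lines)  # parse rules already downloaded as a list of lines
    return rp.can_fetch(user_agent, url)


def polite_crawl(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # throttle so the site is not flooded
        results.append(fetch(url))
    return results


if __name__ == "__main__":
    # Hypothetical robots.txt content for illustration.
    rules = ["User-agent: *", "Disallow: /private/"]
    for page in ("https://example.com/public/page",
                 "https://example.com/private/page"):
        print(page, allowed_by_robots(rules, "demo-bot", page))
```

The sleep parameter is injectable only so that the delay can be replaced in tests; in normal use the default time.sleep is enough.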

Introduction to Web Scraping using Python and Beautiful Soup
