Deep web
Presented by:
Animesh Pramanik
Prasanjit Mondal
(IT 3rd Year)
Jalpaiguri Government Engineering College (Autonomous)
 A large portion of the data available on the web resides
in the so-called deep web.
 It is World Wide Web content that is not part of the
Surface Web and is not indexed by standard search engines.
 It is also called the Invisible Web or Hidden
Web.
 It is the fastest-growing category of new information on the
internet.
 For example, web pages behind private user
accounts are in the deep web (private information).
 The Deep Web makes up the majority of online content,
estimated to be 400-550 times larger than the Surface
Web.
 Early days: static HTML pages, which crawlers can easily reach.
 Mid-1990s: introduction of dynamic pages, which are
generated as the result of a query.
 Jill Ellsworth used the term “Invisible Web” in 1994 to
refer to websites that were not registered with any
search engine.
 Another early use of the term “Invisible Web” was by Bruce
Mount and Matthew B. Koll of Personal Library
Software in 1996.
 The first use of the specific term “Deep Web”, now
generally accepted, occurred in the
2001 Bergman study.
 “… when you can measure what you are speaking
about, and express it in numbers, you know
something about it…” – Lord Kelvin
 First attempt at measurement: Bergman (2000)
 Size of the Surface Web: around 19 TB
 Size of the Deep Web: around 7,500 TB
 The Deep Web is therefore nearly 400 times larger than the
Surface Web.
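The "nearly 400 times" figure follows directly from the two sizes quoted above. A minimal sketch of the arithmetic, using only the slide's numbers (the variable names are illustrative):

```python
# Sizes quoted on the slide (Bergman's estimate), in terabytes.
surface_web_tb = 19
deep_web_tb = 7500

# How many times larger the Deep Web is than the Surface Web.
ratio = deep_web_tb / surface_web_tb

# Fraction of the combined total that the Deep Web accounts for.
deep_share = deep_web_tb / (deep_web_tb + surface_web_tb)

print(f"Deep Web / Surface Web = {ratio:.0f}x")      # 395x
print(f"Deep Web share of total = {deep_share:.1%}")  # 99.7%
```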
 In 2004 Mites classified the deep web more
accurately.
 Most HTML forms are found either on the first
hop or the second hop from the home page.
 Unstructured: data objects as unstructured media
(text, images, audio, video).
 Structured: data objects as structured “relational”
records with attribute-value pairs.
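The difference between the two kinds of data object can be sketched in a few lines. The listing text, the attribute names and the `matches` helper below are all hypothetical, chosen only to illustrate the idea:

```python
# Unstructured: the object is a blob of media; any meaning has to be
# extracted from the content itself.
unstructured = "Used 2012 hatchback, blue, about 60,000 km, asking 4500."

# Structured: the same object as a "relational" record of
# attribute-value pairs (attribute names are made up for this example).
structured = {
    "year": 2012,
    "body": "hatchback",
    "colour": "blue",
    "mileage_km": 60000,
    "price": 4500,
}

def matches(record, **conditions):
    """Return True if every named attribute has the requested value."""
    return all(record.get(attr) == value for attr, value in conditions.items())

# A structured record can be queried attribute by attribute...
print(matches(structured, colour="blue", year=2012))

# ...while unstructured text only supports crude substring checks.
print("blue" in unstructured)
```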
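[Figure: architecture of a form-based deep web crawler. Components: Task-Specific Database, LVS Manager, Label Value Set (LVS), Form Analyzer, Parser, Form Processor, URL List, Crawl Manager, Response Analyzer. The crawler sends form submissions to WWW data sources, and the responses are fed back.]
[Figure: example search form with a Keywords field and SUBMIT and CLEAR buttons.]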
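The roles of the Form Analyzer and the Label Value Set (LVS) named in the figure can be sketched roughly as follows. This is an illustration, not the crawler's actual implementation: the sample page, the LVS contents and the rule of matching form fields to LVS labels by exact field name are all simplifying assumptions.

```python
from html.parser import HTMLParser
from itertools import product

class FormAnalyzer(HTMLParser):
    """Collect the names of fillable fields found inside <form> tags."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
        elif self.in_form and tag in ("input", "select") and "name" in attrs:
            # Buttons carry no query value, so skip them.
            if attrs.get("type", "text") not in ("submit", "reset", "button"):
                self.fields.append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False

# Hypothetical Label Value Set: candidate values for each known label.
LVS = {"make": ["ford", "honda"], "city": ["kolkata", "delhi"]}

def candidate_submissions(html, lvs):
    """Return one dict of field values per combination the LVS can fill."""
    analyzer = FormAnalyzer()
    analyzer.feed(html)
    known = [field for field in analyzer.fields if field in lvs]
    if not known:
        return []
    return [dict(zip(known, values))
            for values in product(*(lvs[field] for field in known))]

# Hypothetical search page with two fillable fields and two buttons.
PAGE = """
<form action="/search">
  <input type="text" name="make">
  <select name="city"></select>
  <input type="submit" value="SUBMIT">
  <input type="reset" value="CLEAR">
</form>
"""

submissions = candidate_submissions(PAGE, LVS)
print(len(submissions))   # 4
print(submissions[0])     # {'make': 'ford', 'city': 'kolkata'}
```

Each dictionary stands for one form submission the crawler could send. A Response Analyzer would then inspect each result page and feed useful values back into the LVS.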
Surface Web
 Entries are statically generated.
 Linked content (reached by web crawlers).
 Readily accessible through any browser or search engine, unlike the Deep Web, which requires special search engines, browsers, and proxies to access.
Deep Web
 Entries are dynamically generated (in response to a submitted query or accessed via a form).
 Unlinked content
 Contextual web
 Private web
 Scripted content
 Non-HTML content
 Limited-access content (behind anti-robot measures such as CAPTCHA)
Advantages of Invisible Web Content
 Specialized content focus – large amounts of information
focused on an exact subject
 Contains information that might not be available on the
visible web
 Allows a user to find a precise answer to a specific question
 Allows a user to find webpages from a specific date or time
 Search engines construct a database of the web by using
programs called spiders or web crawlers that begin with a
list of known web pages.
 The spider gets a copy of each page and indexes it, storing
useful information that will let the page be quickly
retrieved again later.
 Any hyperlinks to new pages are added to the list of pages
to be crawled.
 Eventually all reachable pages are indexed, unless the
spider runs out of time or disk space.
 The collection of reachable pages defines the Surface Web.
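The crawl loop described above can be sketched as a breadth-first traversal. To keep the example self-contained, it crawls a small hypothetical in-memory "web" (a dict from URL to HTML) rather than fetching pages over HTTP, and the `max_pages` limit stands in for running out of time or disk space.

```python
from collections import deque
from html.parser import HTMLParser

class PageParser(HTMLParser):
    """Extract hyperlinks and lower-cased words from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.words.extend(data.lower().split())

# Hypothetical site. The results page exists, but no hyperlink points to it:
# it is only produced when the search form is submitted.
WEB = {
    "/home": '<a href="/about">About</a> <a href="/search">Search</a> welcome',
    "/about": '<a href="/home">Home</a> about us',
    "/search": '<form action="/results"><input name="q"></form> search form',
    "/results?q=tor": "results for tor",
}

def crawl(seeds, fetch, max_pages=100):
    """Index every page reachable by hyperlinks from the seed list."""
    frontier = deque(seeds)      # list of pages still to be crawled
    seen = set(seeds)
    index = {}                   # url -> set of words on that page
    while frontier and len(index) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:         # unknown or dead link
            continue
        parser = PageParser()
        parser.feed(html)
        index[url] = set(parser.words)
        for link in parser.links:
            if link not in seen:  # new hyperlinks join the crawl list
                seen.add(link)
                frontier.append(link)
    return index

index = crawl(["/home"], WEB.get)
print(sorted(index))   # ['/about', '/home', '/search']
```

The form-generated results page never enters the index, because the spider only follows hyperlinks. That gap is what separates the Surface Web from the deep web.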
 The web that the vast majority of internet users are
accustomed to.
 Accessible in any nation that does not block internet
access, even places like China and Egypt.
 Social media sites like Facebook, informational
websites like Wikipedia, general websites, etc.
 The layer of the Surface Web that is blocked in some
nations. Some other information is only accessible
through illegal means.
 Google locked results
 Recently web crawled old content
 Pirated Media
 Requires a proxy or two (namely Tor) to access.
 Contains most of the archived web pages of the 1990s
Web that did not renew their domain names and such.
 Government/Business/Collegiate Research.
 Hackers/Script Kiddies/Virus Information.
 Illegal and Obscene Content (CP, Gore, Suicides, etc.)
 Like the Regular Deep Web, but harder to get into and
more illegal content.
 Advanced covert government research.
 Most of the internet black market (run on bitcoins)
 Human/Arms/Drug/Rare Animal Trafficking.
 Assassination networks, bounty hunters, illegal game
hunting, line of blood locations, etc.
 More banned obscene content like CP, Gore, etc.
 Lowest known level of the Deep Web.
 Named after the Spanish technician who created it.
 Extremely difficult to access; users say it is the safest
part of the internet because of how private it is.
 Julian Assange and other top-level WikiLeaks members
are believed to have access.
 You may wonder how any money-related
transaction can happen when sellers and buyers
can’t identify each other.
 That’s where Bitcoin comes in.
 Bitcoin is basically a cryptographic digital
currency.
 Like regular cash, Bitcoin can be used for
transactions of all kinds and, notably, it also
allows a degree of anonymity: payments are tied to
addresses rather than names, which makes it hard to link
a purchase, illegal or otherwise, to a person.
 When paired properly with Tor, it’s perhaps
the closest thing to a foolproof way to buy and
sell on the web.
 Tor is software, typically used through the Tor Browser,
that sets up the specific connections you need to
access dark web sites.
 Critically, it is free software enabling online anonymity and
censorship resistance.
 Onion routing wraps internet communication in several layers
of encryption; each relay along the path removes one layer,
similar to peeling back the layers of an onion.
 Using Tor makes it more difficult to trace internet activity,
including “visits to web sites, online posts, instant
messages, and other communication forms”, back to the
user.
 It is intended to protect the personal privacy of users, as
well as their freedom and ability to communicate, by keeping
their internet activity from being monitored.
 Instead of domains that end in .com, .in or .org,
these hidden sites end in .onion.
 The FBI eventually captured Ross Ulbricht, who operated
Silk Road, but copycat sites like Black Market Reloaded are
still readily available.
 Tor is the result of research done by the U.S. Naval
Research Laboratory, which created Tor for political
dissidents and whistleblowers, allowing them to
communicate without fear of reprisal.
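The "peeling an onion" idea behind onion routing can be illustrated with a toy sketch. It is not Tor's actual protocol or cryptography: the cipher below is a deliberately simple XOR keystream that is not secure, and the relay names, keys and `next hop|payload` layout are assumptions made for illustration.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from a key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_layer(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR with the keystream. Applying it twice undoes it."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def wrap(message: bytes, path, destination: bytes = b"destination") -> bytes:
    """Build the onion inside out: the exit relay's layer is added first."""
    next_hop, onion = destination, message
    for name, key in reversed(path):
        onion = xor_layer(next_hop + b"|" + onion, key)
        next_hop = name
    return onion

def peel(onion: bytes, key: bytes):
    """Remove one layer; the relay learns only the next hop."""
    next_hop, _, inner = xor_layer(onion, key).partition(b"|")
    return next_hop, inner

# Hypothetical three-relay path, listed from entry to exit.
path = [(b"entry", b"key-1"), (b"middle", b"key-2"), (b"exit", b"key-3")]

onion = wrap(b"hello", path)
hops = []
for name, key in path:
    next_hop, onion = peel(onion, key)   # each relay peels its own layer
    hops.append(next_hop)

print(hops)    # [b'middle', b'exit', b'destination']
print(onion)   # b'hello'
```

Each relay can read only the name of the next hop. Only the last layer reveals the message, and no single relay sees both the sender and the final destination.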
