Data Leakage Detection
Submitted by:
Name: Gaurav Kumar
Sic:15cs0967
Branch: CSE
Mob.No 9572343425
Contents
• Introduction
• How data leakage takes place
• Biggest data breaches of the 21st century
• Existing data leakage detection techniques
• Disadvantages of existing techniques
• Future scope
• Applications
• Conclusion
Introduction
DATA LEAKAGE is the unauthorized transmission of sensitive data or
information from within an organization to an external destination or
recipient.
SENSITIVE DATA of companies and organizations includes
• Intellectual property,
• Financial information,
• Patient information,
• Personal credit card data,
and other information, depending on the business and the industry.
How data leakage takes place
• In the course of doing business, data must sometimes be handed
over to trusted third parties for enhancement or other
operations.
• Sometimes these trusted third parties act as points of data
leakage.
• Examples:
A. A hospital may give patient records to researchers who will devise
new treatments.
B. A company may have partnerships with other companies that
require sharing of customer data.
C. An enterprise may outsource its data processing, so data must be
given to various other companies.
• The owner of the data is termed the distributor, and the third
parties are called the agents.
• When data is leaked, the distributor must assess the
likelihood that the leaked data came from one or more agents,
as opposed to having been independently gathered by other
means.
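The assessment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method given in these slides: each leaked object is assumed to have been either guessed independently with probability p, or leaked by exactly one of the agents holding it, with every holder equally likely. The function name, the `allocations` mapping and the default `p=0.2` are hypothetical.

```python
from math import prod


def guilt_probability(agent, leaked, allocations, p=0.2):
    """Estimate the probability that `agent` leaked at least one object.

    allocations maps each agent name to the set of objects it received.
    p is the ASSUMED probability that an object could have been obtained
    independently, without any agent leaking it.
    """
    # Only leaked objects that this agent actually holds can implicate it.
    held = allocations[agent] & leaked
    factors = []
    for obj in held:
        # Number of agents that received this object.
        holders = sum(1 for objs in allocations.values() if obj in objs)
        # Probability that this agent did NOT leak this particular object.
        factors.append(1 - (1 - p) / holders)
    # The agent is guilty unless it leaked none of the objects it holds.
    return 1 - prod(factors)


allocations = {"A": {"t1", "t2"}, "B": {"t1"}}
leaked = {"t1", "t2"}
for name in allocations:
    print(name, round(guilt_probability(name, leaked, allocations), 2))
```

Under these assumptions an agent that alone holds a leaked object scores much higher than one that shares every leaked object with others, which is the intuition behind comparing agents rather than looking at one in isolation.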
Biggest data breaches of the 21st century
1. Yahoo
Date: September 2016
Impact: 3 billion user accounts
Details: In September 2016, the once dominant Internet
giant, while in negotiations to sell itself to Verizon,
announced it had been the victim of the biggest data breach
in history, likely by “a state-sponsored actor,” in 2014. The
attack compromised the real names, email addresses, dates
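of birth and telephone numbers of 500 million users. The
company said the "vast majority" of the passwords involved
had been hashed using the robust bcrypt algorithm. The
3 billion figure comes from a separate 2013 breach that Yahoo
disclosed in December 2016 and later revised to cover all of
its user accounts.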
2. eBay
Date: May 2014
Impact: 145 million users compromised
Details: The online auction giant reported a cyberattack in
May 2014 that it said exposed names, addresses, dates of
birth and encrypted passwords of all of its 145 million users.
The company said hackers got into the company network
using the credentials of three corporate employees, and had
complete inside access for 229 days, during which time they
were able to make their way to the user database.
3. Uber
Date: Late 2016
Impact: Personal information of 57 million Uber users and 600,000
drivers exposed.
4. Election Systems & Software
Date: August 2017
Impact: 1.8 million accounts
Details: IT security experts discovered an open Amazon Web
Services (AWS) cloud storage bucket. It contained a backup copy of data
from Election Systems & Software (ES&S), a company that
manufactures voting machines and election management
systems. The data contained a total of almost 2 million accounts
with names, addresses, dates of birth, and party affiliations of
Illinois residents. By default, access to AWS buckets is possible only
after authentication; however, for some unknown reason, the
settings on this bucket were misconfigured, and that made it
accessible to the public.
5. Facebook
Impact: Over 50 million Facebook profiles
Details: An app harvested data from over 50 million Facebook
profiles and then passed the information on to Cambridge
Analytica. "We have a responsibility to protect your data, and if
we can't then we don't deserve to serve you,"
Zuckerberg said in a statement on his Facebook page.
DATA LEAKAGE DETECTION
• To detect whether data has been leaked by agents.
• To prevent data leakage.
Existing data leakage detection techniques
1. Watermarking
2. Steganography
1. Watermarking:
A unique code is embedded in each distributed copy. If
that copy is later discovered in the hands of an
unauthorized party, the leaker can be identified. The
watermark is difficult for an attacker to remove, even
when several individuals conspire together with
independently watermarked copies of the data.
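As a rough illustration of the idea, the sketch below gives each agent a copy carrying its own code and later matches a leaked copy against the known codes. It is a deliberately simple least-significant-bit scheme chosen for brevity, not the robust, collusion-resistant watermark described above, and a recipient could erase it easily. The function names and the sample values are hypothetical.

```python
def embed_code(pixels, code_bits):
    """Return a copy of `pixels` whose first values carry `code_bits` in their lowest bit."""
    marked = list(pixels)
    for i, bit in enumerate(code_bits):
        # Clear the least significant bit, then set it to the code bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_code(pixels, n_bits):
    """Read the lowest bit of the first `n_bits` values."""
    return [value & 1 for value in pixels[:n_bits]]


def identify_leaker(leaked_pixels, agent_codes):
    """Return the agent whose code matches the leaked copy, or None."""
    n_bits = len(next(iter(agent_codes.values())))
    found = extract_code(leaked_pixels, n_bits)
    for agent, code in agent_codes.items():
        if code == found:
            return agent
    return None


# Hypothetical codes and pixel values, for illustration only.
agent_codes = {"agent_a": [0, 1, 0, 1], "agent_b": [1, 1, 0, 0]}
original = [120, 33, 250, 7, 64, 91]
copy_for_b = embed_code(original, agent_codes["agent_b"])
print(identify_leaker(copy_for_b, agent_codes))
```

Each value changes by at most 1, so the marked copy stays visually close to the original while still pointing back to one agent.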
HISTORY:
The term "digital watermark" was coined by Andrew Tirkel and Charles Osborne
in December 1992. The first successful embedding and extraction of a
watermark was demonstrated in 1993 by Andrew Tirkel, Charles Osborne and
Gerard Rankin.
General water-marking procedure
Water-marking program
% Water marking: add a binary mark to the DCT coefficients of each colour channel
clear all; close all
x=double(imread('greens.jpg'));      % cover image
figure; imshow(x/255);
y=x;                                 % y will hold the water-marked image
a=zeros(300,500);                    % binary mark
a(100:250,100:350)=1;
figure; imshow(a);
save m.dat a -ascii
x1=x(:,:,1);                         % split into the three colour channels
x2=x(:,:,2);
x3=x(:,:,3);
dx1=dct2(x1); dx11=dx1;              % discrete cosine transform (dx11 keeps the unmarked copy)
dx2=dct2(x2); dx22=dx2;
dx3=dct2(x3); dx33=dx3;
load m.dat                           % loads the mark into variable m
g=10;                                % water-marking strength
[rm,cm]=size(m);
dx1(1:rm,1:cm)=dx1(1:rm,1:cm)+g*m;   % additive embedding in each channel
dx2(1:rm,1:cm)=dx2(1:rm,1:cm)+g*m;
dx3(1:rm,1:cm)=dx3(1:rm,1:cm)+g*m;
figure,imshow(dx1);
figure,imshow(dx2);
figure,imshow(dx3);
y1=idct2(dx1);                       % back to the pixel domain
y2=idct2(dx2);
y3=idct2(dx3);
y(:,:,1)=y1;
y(:,:,2)=y2;
y(:,:,3)=y3;
figure,imshow(y1/255);
figure,imshow(y2/255);
figure,imshow(y3/255);
figure;imshow(y/255);
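The program above only embeds the mark. The following Python sketch shows the same additive idea on a short one-dimensional signal and adds the extraction step, assuming the distributor still has the original to compare against (non-blind detection). It uses a hand-written orthonormal DCT so that it needs no external libraries; the function names, the sample signal and the threshold of g/2 are illustrative assumptions rather than part of the original program.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(x))
        out.append(scale * total)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        total = sum(math.sqrt((1 if k == 0 else 2) / n) * ck
                    * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for k, ck in enumerate(c))
        out.append(total)
    return out


def embed(signal, mark, g=10.0):
    """Add g * mark to the first DCT coefficients and return the marked signal."""
    coeffs = dct(signal)
    for k, bit in enumerate(mark):
        coeffs[k] += g * bit
    return idct(coeffs)


def extract(marked, original, n_bits, g=10.0):
    """Recover the mark by comparing DCT coefficients with the original's."""
    diff = [a - b for a, b in zip(dct(marked), dct(original))]
    return [1 if d > g / 2 else 0 for d in diff[:n_bits]]


signal = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical sample values
mark = [1, 0, 1, 1]
marked = embed(signal, mark)
print(extract(marked, signal, len(mark)))
```

Because the transform is linear, the difference between the marked and original coefficients is g wherever the mark bit is 1 and 0 elsewhere, so thresholding at g/2 recovers the mark as long as the copy has not been altered much.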
Image before & after water-marking
DRAWBACKS OF WATERMARKING
• It involves some modification of the data: embedding the
mark alters attributes of the data.
• The watermark can sometimes be destroyed if the
recipient is malicious.
2. Steganography:
Steganography is a technique for hiding a secret message
within a larger one in such a way that others can’t discern
the presence or contents of the hidden message.
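One simple way to illustrate this is to hide a short message inside ordinary text using zero-width characters, which most renderers do not display. The sketch below is an assumption chosen for brevity, not a technique named in these slides; it offers no secrecy against anyone who inspects the raw characters. The function names and sample strings are hypothetical.

```python
ZERO = "\u200b"  # zero-width space, stands for bit 0
ONE = "\u200c"   # zero-width non-joiner, stands for bit 1


def hide(cover, secret):
    """Append the bits of `secret` to `cover` as invisible characters."""
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    return cover + "".join(ONE if bit == "1" else ZERO for bit in bits)


def reveal(stego):
    """Collect the invisible characters and decode them back to text."""
    bits = "".join("1" if ch == ONE else "0"
                   for ch in stego if ch in (ZERO, ONE))
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


stego = hide("Meeting at noon.", "key=42")
print(reveal(stego))
```

The visible part of the text is unchanged, so a casual reader sees only the cover message.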
Future scope
• Investigating agent guilt models that capture leakage
scenarios not yet considered.
• Extending data allocation strategies so that they
can handle agent requests in an online fashion.
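To make the idea of online allocation concrete, the sketch below serves requests one at a time and, for each, hands out the objects that the fewest agents already hold, so that a later leak points to as few agents as possible. This greedy rule is an illustrative assumption, not a strategy specified in these slides; the function name and the request format (agent name, number of objects) are hypothetical.

```python
from collections import Counter


def allocate_online(requests, objects):
    """Serve (agent, m) requests in arrival order, preferring least-shared objects."""
    times_given = Counter({obj: 0 for obj in objects})
    allocations = {}
    for agent, m in requests:
        # Stable sort: objects held by fewer agents come first,
        # ties keep their original order.
        ranked = sorted(objects, key=lambda obj: times_given[obj])
        chosen = ranked[:m]
        for obj in chosen:
            times_given[obj] += 1
        allocations[agent] = set(chosen)
    return allocations


objects = ["t1", "t2", "t3", "t4"]
requests = [("A", 2), ("B", 2), ("C", 1)]
print(allocate_online(requests, objects))
```

Each decision uses only the requests seen so far, which is what distinguishes an online strategy from one that plans the whole distribution in advance.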
APPLICATIONS OF DATA LEAKAGE DETECTION
• It helps detect whether the distributor's sensitive
data has been leaked by trusted or authorized
agents.
• It helps identify the agents who leaked the data.
• It helps reduce cybercrime.
• Copy prevention and control.
• Source tracking.
Conclusion
• In an ideal scenario there would be no need to hand over
sensitive data to agents that may unknowingly or
maliciously leak it.
• However, in many cases, we must indeed work with
agents that may not be 100 percent trusted, and we
may not be certain whether a leaked object came from an
agent or from some other source.
• We can secure our data during its distribution or
transmission, and we can even detect whether it has
been leaked, by using data leakage detection
techniques.
Submitted by:
Name: GAURAV KUMAR
Branch: CSE
Email-id:-
Gaurav.kumar8462@gmail.com
