Data leakage detection

A SEMINAR ON DATA LEAKAGE DETECTION

PRESENTED BY:
Sankhadip Kundu (14501212023)
Saurabh Hazra (14501212020)
Shubham Seal (14501212009)

INTRODUCTION
 Data leakage is the accidental or unintentional distribution
of private or sensitive data to an unauthorized entity.
 Data leakage poses a serious issue for companies as the number of
incidents and the cost to those experiencing them continue to increase.
 Data leakage is made worse by the fact that transmitted data, including
emails, instant messages, website forms, and file transfers, is largely
unregulated and unmonitored on its way to its destination.

OBJECTIVE
 A data distributor has given sensitive data to a set of
supposedly trusted agents (third parties).
 Some of the data is leaked and found in an unauthorized place
(e.g., on the web or somebody’s laptop).
 The distributor must assess the likelihood that the leaked data
came from one or more agents, as opposed to having been
independently gathered by other means.
 We propose data allocation strategies (across the agents) that
improve the probability of identifying leakages.
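The likelihood assessment described above can be sketched in code. The following is a minimal sketch, not the presenters' implementation: it assumes one common formulation of the guilt model, in which each leaked object was either obtained independently with probability p or leaked by one of the agents holding it, each equally likely. The function name, the parameter p, and its default value are illustrative assumptions.

```python
from math import prod


def guilt_probability(leaked, agent_data, agent, p=0.2):
    """Estimate the probability that `agent` leaked at least one object.

    leaked     -- set of objects found in the unauthorized place
    agent_data -- dict mapping agent name -> set of objects given to it
    p          -- ASSUMED probability that an object was gathered
                  independently rather than leaked by an agent
    """
    shared = leaked & agent_data[agent]
    factors = []
    for obj in shared:
        # Number of agents that were given this object.
        holders = sum(1 for objs in agent_data.values() if obj in objs)
        # Probability that this agent did NOT leak this particular object.
        factors.append(1 - (1 - p) / holders)
    # Guilty if the agent leaked at least one of the shared objects.
    return 1 - prod(factors)


agent_data = {"A": {1, 2, 3}, "B": {3, 4}, "C": {5}}
leaked = {1, 2, 3}
for name in agent_data:
    print(name, round(guilt_probability(leaked, agent_data, name), 3))
```

With these hypothetical inputs, agent A (who holds all three leaked objects, two of them exclusively) scores about 0.976, agent B (who shares only one object with A) scores 0.4, and agent C scores 0 because none of its data appears in the leak.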

EXISTING SYSTEM
 Traditionally, leakage detection is handled by watermarking,
e.g., a unique code is embedded in each distributed copy.
 If that copy is later discovered in the hands of an unauthorized
party, the leaker can be identified.
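As a rough illustration of the watermarking idea, the sketch below embeds an agent-specific code in every record of a distributed copy and later looks that code up in a leaked copy. It is a simplified assumption of how such a scheme might look, not a production watermarking method: the secret key, the `_wm` field name, and all function names are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"distributor-secret"  # hypothetical key known only to the distributor


def watermark_for(agent_id):
    """Derive a short, agent-specific code from the distributor's secret."""
    digest = hmac.new(SECRET_KEY, agent_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


def make_copy(records, agent_id):
    """Return a copy of the records with the agent's code embedded in each."""
    mark = watermark_for(agent_id)
    return [dict(record, _wm=mark) for record in records]


def identify_leaker(leaked_copy, agent_ids):
    """Return the agent whose code appears in the leaked copy, or None."""
    marks = {watermark_for(agent): agent for agent in agent_ids}
    for record in leaked_copy:
        agent = marks.get(record.get("_wm"))
        if agent is not None:
            return agent
    return None


copy_for_agent = make_copy([{"patient": "P-001"}], "agent-1")
print(identify_leaker(copy_for_agent, ["agent-1", "agent-2"]))
```

Note that the embedded code changes the distributed data, and a recipient who deletes the `_wm` field makes `identify_leaker` return None.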

Disadvantages of Existing Systems
 Watermarks can be very useful in some cases, but they involve
some modification of the original data.
 Furthermore, watermarks can sometimes be destroyed if the data
recipient is malicious.
 Sharing sensitive data is common: for example, a hospital may give
patient records to researchers who will devise new treatments.
 Similarly, a company may have partnerships with other
companies that require sharing customer data.
 Another enterprise may outsource its data processing, so data
must be given to various other companies.
 We call the owner of the data the distributor and the supposedly
trusted third parties the agents.

PROPOSED SYSTEM
 Our goal is to detect when the distributor's sensitive data has
been leaked by agents, and if possible to identify the agent that
leaked the data.
 Perturbation is a very useful technique where the data is
modified and made "less sensitive" before being handed to agents.
 We develop unobtrusive techniques for detecting leakage of a set
of objects or records.
 We develop a model for assessing the "guilt" of agents.
 We also present algorithms for distributing objects to agents, in
a way that improves our chances of identifying a leaker.
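One way to picture an allocation strategy is a greedy heuristic that keeps the overlap between agents small, so that a leaked set points to fewer candidates. The sketch below is an illustrative assumption, not the algorithm the presenters refer to: it supposes each agent asks for a number of objects without caring which ones, and the function and parameter names are hypothetical.

```python
def allocate(objects, requests):
    """Give each agent the objects that are currently shared the least.

    objects  -- list of object identifiers held by the distributor
    requests -- dict mapping agent name -> number of objects requested
    """
    holders = {obj: 0 for obj in objects}  # how many agents hold each object
    allocation = {}
    for agent, count in requests.items():
        if count > len(objects):
            raise ValueError(f"{agent} requested more objects than exist")
        # Least-shared objects first; sorted() is stable, so ties keep
        # the original object order.
        ranked = sorted(objects, key=lambda obj: holders[obj])
        chosen = ranked[:count]
        for obj in chosen:
            holders[obj] += 1
        allocation[agent] = set(chosen)
    return allocation


result = allocate([1, 2, 3, 4], {"A": 2, "B": 2})
print(result)
```

In this small example the two agents receive disjoint sets, so any leaked object can only have come from one of them. Overlap appears only once the total number of requested objects exceeds the number of objects available.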

Types of employees that put our company at risk
 The security illiterate
 The unlawful residents
 The malicious/disgruntled employees

IMPACT ON ORGANIZATIONS
 Financial & reputational loss
 Small leaks accumulate to big loss
 Loss of customer & employee private information
 Loss of competitive position
 Lawsuits or regulatory consequences

MODULES
Admin Module
 The administrator has to log on to the system.
 Admin can add/view/delete/edit the user details.
User Module
 A user must log in to use the services.
 A user can accept/reject data sharing requests from other users.

DATA LOSS PREVENTION
 To protect against confidential data theft and loss, a multi-layered security
foundation is needed:
 Control/limit access to the data: firewalls, remote access controls, network
access controls, physical security controls
 Secure information from threats: protect the perimeter and endpoints from
malware, botnets, viruses, DoS, etc. with security technology
 Control use of sensitive data once access is granted: policy-based content
inspection, acceptable use, encryption
 Cisco’s Solution for Data Loss Prevention:
 Build a secure foundation with a Self-Defending Network
 Integrate DLP controls into security devices to protect data and increase
visibility

CONCLUSION
 In an ideal world there would be no need to hand over sensitive data to
agents who may unknowingly or maliciously leak it.
 Although leakers can sometimes be identified using the traditional technique
of watermarking, certain data cannot admit watermarks.
 In spite of these difficulties, it is possible to assess the likelihood that
an agent is responsible for a leak, based on the overlap of the agent's data
with the leaked data.

