International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1187
Block-Level Message-Locked Encryption for Secure Large File De-duplication
Bhagyashree Bhoyane1, Snehal Kalbhor2, Sneha Chamle3, Sandhya Itkapalle4, P. M. Gore5
1,2,3,4 Student, Computer Department, Padmabhushan Vasantdada Patil Institute of Technology, Pune, Maharashtra
5 Professor, Computer Department, Padmabhushan Vasantdada Patil Institute of Technology, Pune, Maharashtra
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract: To reduce the burden of maintaining big
data, more and more enterprises and organizations have
chosen to outsource data storage to cloud storage providers.
This makes data management a critical challenge for those
providers. Cloud computing is the long-dreamed vision
of computing as a utility. Alongside its benefits, the
security of stored data must be considered
when sensitive data is placed in the cloud: cloud users cannot rely
solely on the cloud service provider to protect their sensitive data.
To achieve optimal usage of storage resources,
many cloud storage providers perform de-duplication, which
exploits data redundancy and avoids storing duplicated data
from multiple users. Although a Message-Locked Encryption (MLE)
scheme can be extended to obtain secure de-duplication for large
files, it requires a lot of metadata to be maintained by the end
user and the cloud server. A Third Party Auditor (TPA) checks the
integrity of the data stored on the cloud on behalf of the data owner.
KEYWORDS: NLP (Natural language processing),
Sentiment Analysis, synsets, Word Net
1. INTRODUCTION
To reduce the burden of maintaining big data, more and more
enterprises and organizations have chosen to outsource data
storage to cloud storage providers. This makes data
management a critical challenge for the cloud storage
providers. To achieve optimal usage of storage resources,
many cloud storage providers perform deduplication, which
exploits data redundancy and avoids storing duplicated data
from multiple users.
In terms of deduplication granularity, there are two
main deduplication strategies. File-level deduplication: data
redundancy is exploited at the file level, and thus only a
single copy of each file is stored on the server. Block-level
deduplication: each file is divided into blocks, and the server
exploits data redundancy at the block level and hence
performs a more fine-grained deduplication.
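The difference between the two granularities can be illustrated with a minimal Python sketch. It is not the paper's implementation: the tiny 4-byte block size, the use of SHA-256 as the fingerprint, and the function names `file_level_store` and `block_level_store` are assumptions chosen only to keep the example small.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # assumed, deliberately tiny so the example is easy to follow


def fingerprint(data: bytes) -> str:
    """Content fingerprint used to detect duplicates (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def file_level_store(files: List[bytes]) -> Dict[str, bytes]:
    """Keep a single copy of each distinct file."""
    store: Dict[str, bytes] = {}
    for content in files:
        store.setdefault(fingerprint(content), content)
    return store


def block_level_store(files: List[bytes]) -> Dict[str, bytes]:
    """Split every file into fixed-size blocks and keep one copy per distinct block."""
    store: Dict[str, bytes] = {}
    for content in files:
        for i in range(0, len(content), BLOCK_SIZE):
            block = content[i:i + BLOCK_SIZE]
            store.setdefault(fingerprint(block), block)
    return store


# Three uploads: two identical files and one that shares its first two blocks.
files = [b"AAAABBBBCCCC", b"AAAABBBBDDDD", b"AAAABBBBCCCC"]
file_store = file_level_store(files)
block_store = block_level_store(files)
stored_file_bytes = sum(len(v) for v in file_store.values())
stored_block_bytes = sum(len(v) for v in block_store.values())
print(stored_file_bytes, stored_block_bytes)
```

File-level deduplication removes only the repeated third upload, while block-level deduplication also shares the blocks that the two non-identical files have in common, so it stores fewer bytes.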
Traditional encryption, which provides data confidentiality,
conflicts with deduplication at both the file level and the
block level: when users encrypt identical data under different
keys, the resulting ciphertexts differ and the server cannot
detect the redundancy. File-level deduplication eliminates
duplicate copies of the same file, whereas block-level
deduplication also eliminates duplicate blocks of data that
occur in non-identical files.
2. RELATED WORK
[1] Deduplication is a popular technique widely used to save
storage space in the cloud. To achieve secure deduplication
of encrypted files, Bellare et al. formalized a new cryptographic
primitive named Message-Locked Encryption (MLE) at
Eurocrypt 2013. Although an MLE scheme can be extended to
obtain secure deduplication for large files, it requires a lot of
metadata to be maintained by the end user and the cloud server.
The authors propose a new approach to achieve more
efficient deduplication for (encrypted) large files. Their
approach, named Block-Level Message-Locked Encryption
(BL-MLE), can achieve file-level and block-level deduplication,
block key management, and proof of ownership
simultaneously using a small set of metadata. They also show
that the BL-MLE scheme can be easily extended to support
proof of storage, which makes it multi-purpose for secure
cloud storage.
[2] With the continuous and exponential increase in the
number of users and the size of their data, data deduplication
becomes more and more a necessity for cloud storage
providers. By storing a single unique copy of duplicate
data, cloud providers greatly reduce their storage and data
transfer costs. The advantages of deduplication unfortunately
come with a high cost in terms of new security and privacy
challenges. The authors propose ClouDedup, a secure and efficient
storage service which assures block-level deduplication
and data confidentiality at the same time. Although based on
convergent encryption, ClouDedup remains secure thanks to
the definition of a component that implements an additional
encryption operation and an access control mechanism.
[3] This paper describes a component, called Cloudkey, that sits
between the internal enterprise application and the public cloud
storage platform, close to the client, and is
designed to take responsibility for enterprise data
backup. Cloudkey stores data persistently in a cloud storage
provider such as Amazon S3 or Windows Azure, allowing
users to take advantage of the reliability and large storage
capacity of cloud providers while avoiding the need for
dedicated server hardware. Clients access the storage
through Cloudkey running on-site, which provides
lower-latency responses and additional opportunities for
optimization through data caching.
[4] Data deduplication is one of the most important data
compression techniques, used for removing identical copies of
repetitive data. An authorized deduplication system is used to
reduce data duplication. When a user uploads a file to the
cloud, the file is split into a number of blocks, each
4 KB in size. Each block is encrypted using a convergent
key, and a token is then generated for it by a
token generation algorithm. After encrypting the data with the
convergent key, users retain the key before sending the
ciphertext to the cloud. Because the encryption is
deterministic, identical data copies produce the same
convergent keys and the same ciphertext,
which is what makes deduplication of encrypted data possible. Each block is then
compared with the cloud database. If a
match is found in the cloud database, only the metadata of
the block is stored in the DB profiler. The paper also prevents
unauthorized access by using a secure proof-of-ownership
protocol, which uses an authorized duplicate check for a
hybrid cloud architecture.
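The convergent-encryption flow described in [4] can be sketched as follows. This is only an illustration under stated assumptions: the key is taken to be the SHA-256 hash of the block, the token is taken to be the SHA-256 hash of the ciphertext, and the cipher is a toy hash-based keystream used as a stand-in so that the example needs only the standard library. A real system would use a vetted block cipher such as AES, and the cited paper's token generation algorithm may differ.

```python
import hashlib

BLOCK_SIZE = 4096  # 4 KB blocks, as in the description above


def convergent_key(block: bytes) -> bytes:
    """The key is derived from the block content itself (assumed: SHA-256)."""
    return hashlib.sha256(block).digest()


def keystream(key: bytes, length: int) -> bytes:
    """Toy deterministic keystream, SHA-256(key || counter). NOT a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_block(block: bytes):
    key = convergent_key(block)
    ciphertext = bytes(a ^ b for a, b in zip(block, keystream(key, len(block))))
    token = hashlib.sha256(ciphertext).hexdigest()  # hypothetical duplicate-check token
    return key, ciphertext, token


def decrypt_block(key: bytes, ciphertext: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(ciphertext, keystream(key, len(ciphertext))))


def upload(data: bytes, cloud: dict) -> list:
    """Store only blocks whose token is new; return the (key, token) pairs the user keeps."""
    retained = []
    for i in range(0, len(data), BLOCK_SIZE):
        key, ciphertext, token = encrypt_block(data[i:i + BLOCK_SIZE])
        if token not in cloud:
            cloud[token] = ciphertext
        retained.append((key, token))
    return retained


cloud = {}
alice = upload(b"x" * 4096 + b"y" * 4096, cloud)
bob = upload(b"x" * 4096 + b"z" * 4096, cloud)
print(len(cloud))  # number of distinct encrypted blocks held by the cloud
```

Because the key depends only on the block content, the shared first block of both uploads yields the same ciphertext and token, so the cloud stores it once even though it never sees the plaintext.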
3. PROPOSED SYSTEM
The Data Owner uploads the document, metadata, and checksum to the cloud
after encryption using keys from the Data Owner and the Cloud
Service Provider. A copy of the metadata and checksum is also
sent to the Auditor.
Registered users send an access request and receive the encrypted
file if authorized. The user calculates a checksum to compare with the
original and reports to the Data Owner if a checksum mismatch
occurs.
Fig-1: Architecture of Block-Level Message-Locked
Encryption for Secure Large File De-duplication
Duplicate detection is performed at two levels:
a. File Level
b. Block Level
The system maintains checksums of the file data and of each block of file data and
compares them at the time of file upload to avoid storing duplicates.
The Auditor receives the metadata after upload. It performs periodic or
on-demand integrity checks by sending challenges to the Cloud
Service Provider. On receiving the response from the Cloud Service Provider,
the Auditor verifies the response and reports the status to the Data Owner.
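The Auditor's challenge-response check can be sketched as below. The paper does not specify the challenge format, so everything here is an assumption made for illustration: the metadata is taken to be a list of per-block SHA-512 checksums, a challenge is a random sample of block indices, and the names `CloudProvider`, `make_metadata`, and `audit` are hypothetical. Practical proof-of-storage protocols use stronger constructions than returning plain hashes.

```python
import hashlib
import hmac
import random

BLOCK_SIZE = 1024  # assumed block size for this sketch


def split_blocks(data: bytes):
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def make_metadata(data: bytes):
    """Data Owner side: per-block SHA-512 checksums that are sent to the Auditor."""
    return [hashlib.sha512(block).digest() for block in split_blocks(data)]


class CloudProvider:
    """Hypothetical provider that answers a challenge by hashing the requested blocks."""

    def __init__(self, data: bytes):
        self.blocks = split_blocks(data)

    def respond(self, indices):
        return [hashlib.sha512(self.blocks[i]).digest() for i in indices]


def audit(provider, metadata, sample_size, rng) -> bool:
    """Auditor side: challenge random block indices and compare with stored checksums."""
    indices = rng.sample(range(len(metadata)), sample_size)
    response = provider.respond(indices)
    return all(hmac.compare_digest(metadata[i], digest)
               for i, digest in zip(indices, response))


data = bytes(range(256)) * 16           # 4096 bytes, i.e. four blocks
metadata = make_metadata(data)
provider = CloudProvider(data)
rng = random.Random(0)

ok_before = audit(provider, metadata, sample_size=4, rng=rng)
provider.blocks[2] = b"\x00" * BLOCK_SIZE   # simulate silent corruption at the provider
ok_after = audit(provider, metadata, sample_size=4, rng=rng)
print(ok_before, ok_after)
```

Sampling fewer blocks than the file contains makes each audit cheaper but only detects corruption with some probability; the example challenges all four blocks so that the outcome is deterministic.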
4. PROPOSED METHOD
Algorithm Used:
1. AES Algorithm
a. Encryption
The AES encryption steps for a 128-bit block (with a 128-bit key) are
as follows:
1. Derive the set of round keys from the cipher key.
2. Initialize the state array with the block data
(plaintext).
3. Add the initial round key to the starting state array.
4. Perform nine rounds of state manipulation.
5. Perform the tenth and final round of state
manipulation.
6. Copy the final state array out as the encrypted data
(ciphertext).
Each round of the encryption process requires a series of
steps to alter the state array.
These steps involve four types of operations:
1. Sub-Bytes
2. Shift-Rows
3. Mix-Columns (omitted in the final round)
4. Xor-Round Key
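Two of the four operations are simple enough to sketch directly on the 4x4 state array. The Python fragment below is an illustration only and is not a complete or secure AES implementation: Sub-Bytes and Mix-Columns are left out, and the helper names are assumptions. It follows the FIPS 197 convention that the state is filled column by column.

```python
def bytes_to_state(block: bytes):
    """Fill the 4x4 state column by column: state[r][c] = block[r + 4*c]."""
    assert len(block) == 16
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state) -> bytes:
    """Read the state back out column by column."""
    return bytes(state[r][c] for c in range(4) for r in range(4))


def shift_rows(state):
    """Shift-Rows: row r is rotated left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state):
    """Inverse of Shift-Rows: row r is rotated right by r positions."""
    return [row[4 - r:] + row[:4 - r] for r, row in enumerate(state)]


def add_round_key(state, round_key):
    """Xor-Round Key: XOR every state byte with the matching round-key byte."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]


state = bytes_to_state(bytes(range(16)))
round_key = bytes_to_state(bytes([0xA5] * 16))   # arbitrary example key material

shifted = shift_rows(state)
mixed_with_key = add_round_key(state, round_key)
print(state_to_bytes(shifted).hex())
```

Both operations are reversible: rotating the rows back restores the state, and applying the same round key a second time cancels the XOR.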
b. Decryption
As you might expect, decryption reverses all the
steps taken in encryption using the inverse functions:
1. InvSub-Bytes
2. InvShift-Rows
3. InvMix-Columns
Xor-Round Key is its own inverse. The operations in decryption are:
1. Perform the initial decryption round:
• Xor-Round Key
• InvShift-Rows
• InvSub-Bytes
2. Perform nine full decryption rounds:
• Xor-Round Key
• InvMix-Columns
• InvShift-Rows
• InvSub-Bytes
3. Perform the final Xor-Round Key.
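The decryption order listed above is exactly the encryption order run backwards with each operation replaced by its inverse. The short Python sketch below checks this by building the step sequence symbolically; it performs no actual encryption. The operation names use the FIPS 197 spelling (AddRoundKey corresponds to Xor-Round Key), and the function names are assumptions for illustration.

```python
INVERSE = {
    "SubBytes": "InvSubBytes",
    "ShiftRows": "InvShiftRows",
    "MixColumns": "InvMixColumns",
    "AddRoundKey": "AddRoundKey",   # XOR with the round key is its own inverse
}


def encryption_schedule(rounds: int = 10):
    """Return the (operation, round_key_index) steps for AES with `rounds` rounds."""
    steps = [("AddRoundKey", 0)]                      # initial round key
    for rnd in range(1, rounds):                      # nine full rounds for AES-128
        steps += [("SubBytes", rnd), ("ShiftRows", rnd),
                  ("MixColumns", rnd), ("AddRoundKey", rnd)]
    # the final round omits MixColumns
    steps += [("SubBytes", rounds), ("ShiftRows", rounds), ("AddRoundKey", rounds)]
    return steps


def decryption_schedule(rounds: int = 10):
    """Undo the encryption steps in reverse order, using the inverse of each one."""
    return [(INVERSE[op], rnd) for op, rnd in reversed(encryption_schedule(rounds))]


dec = decryption_schedule()
initial_round = [op for op, _ in dec[:3]]
first_full_round = [op for op, _ in dec[3:7]]
final_step = dec[-1]
print(initial_round, first_full_round, final_step)
```

The round keys are therefore consumed in the opposite order during decryption, starting with the last round key and ending with the initial one.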
2. RSA Algorithm
The keys for the RSA algorithm are generated in the
following way:
1. Choose two distinct prime numbers p and q.
• For security purposes, the integers p and q should be
chosen at random and should be similar in
magnitude but differ in length by a few digits to make
factoring harder. Prime integers can be found efficiently
using a primality test.
2. Compute n = pq.
• n is used as the modulus for both the public and
private keys. Its length, usually expressed in
bits, is the key length.
3. Compute φ(n) = φ(p)φ(q) = (p − 1)(q − 1) = n −
(p + q − 1), where φ is Euler's totient function. This
value is kept private.
4. Choose an integer e such that 1 < e < φ(n) and gcd(e,
φ(n)) = 1; i.e., e and φ(n) are coprime.
5. Determine d as d ≡ e⁻¹ (mod φ(n)); i.e., d is
the modular multiplicative inverse of e (modulo
φ(n)).
• This is more clearly stated as: solve
for d given d⋅e ≡ 1 (mod φ(n)).
• e having a short bit-length and small Hamming
weight results in more efficient encryption;
the most common choice is 2¹⁶ + 1 = 65,537. On the other
hand, much smaller values
of e (such as 3) have been shown to be less
secure in some settings.
• e is released as the public key exponent.
• d is kept as the private key exponent.
The public key consists of the modulus n and the public (or
encryption) exponent e. The private key consists of the
modulus n and the private (or decryption) exponent d, which
must be kept secret. p, q, and φ(n) must also be kept secret
because they can be used to calculate d.
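The key-generation steps can be traced with a small worked example in Python. The primes p = 61 and q = 53 and the exponent e = 17 are toy values chosen here for readability and are far too small to be secure; the sketch is also "textbook" RSA without padding, which should not be used in practice. The function names are illustrative, and `pow(e, -1, phi)` requires Python 3.8 or later.

```python
from math import gcd


def generate_keys(p: int, q: int, e: int):
    """Follow steps 2 to 5: compute n, phi(n), check e, and derive d."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < e < phi) or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi(n) and gcd(e, phi(n)) = 1")
    d = pow(e, -1, phi)          # modular multiplicative inverse of e modulo phi(n)
    return (n, e), (n, d)


def encrypt(message: int, public_key) -> int:
    n, e = public_key
    return pow(message, e, n)


def decrypt(ciphertext: int, private_key) -> int:
    n, d = private_key
    return pow(ciphertext, d, n)


public_key, private_key = generate_keys(p=61, q=53, e=17)
ciphertext = encrypt(65, public_key)
recovered = decrypt(ciphertext, private_key)
print(public_key, private_key, ciphertext, recovered)
```

Here n = 3233 and φ(n) = 3120, and the derived private exponent satisfies d⋅e ≡ 1 (mod φ(n)), so decrypting the ciphertext returns the original message.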
3. SHA-512 Algorithm
a. Append Padding Bits and Length Value:
This step makes the input message an exact multiple of 1024
bits.
b. Initialize Hash Buffer with Initialization Vector:
Before the first message block can be processed, the
hash buffer is initialized with the IV, the Initialization Vector.
c. Process Each 1024-bit (sixteen 64-bit words) Message Block Mi:
Each message block is taken through 80 rounds of
processing.
d. Finally:
After all N message blocks have been processed, the
content of the hash buffer is the message digest.
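Step (a) can be made concrete with a short Python sketch of the SHA-512 padding rule: a single 1 bit (the byte 0x80), then zero bytes, then the message length in bits as a 128-bit big-endian integer. The helper names `sha512_pad` and `split_blocks` are illustrative; the digest itself is computed with the standard library's `hashlib.sha512` rather than a hand-written compression function.

```python
import hashlib

BLOCK_BYTES = 128  # 1024 bits per message block


def sha512_pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 128-bit big-endian message length in bits."""
    bit_length = len(message) * 8
    padded = message + b"\x80"
    # add zeros until exactly 16 bytes remain in the last block for the length field
    padded += b"\x00" * ((112 - len(padded)) % BLOCK_BYTES)
    return padded + bit_length.to_bytes(16, "big")


def split_blocks(padded: bytes):
    """Split the padded message into 1024-bit blocks M1 ... MN."""
    return [padded[i:i + BLOCK_BYTES] for i in range(0, len(padded), BLOCK_BYTES)]


padded = sha512_pad(b"abc")
blocks = split_blocks(padded)
# each block is read as sixteen 64-bit big-endian words
words = [int.from_bytes(blocks[0][i:i + 8], "big") for i in range(0, BLOCK_BYTES, 8)]
digest = hashlib.sha512(b"abc").hexdigest()
print(len(padded), len(blocks), len(words), digest[:16])
```

A 112-byte message no longer leaves room for the length field in its first block, so padding spills into a second block; the digest is always 512 bits regardless of input length.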
5. CONCLUSION
Data deduplication is one of the most important data
compression techniques, used for removing identical copies of
repetitive data. The proposed system, Block-Level Message-Locked
Encryption for Secure Large File De-duplication, uses an
authorized deduplication scheme to reduce duplicate data.
REFERENCES
[1] Rongmao Chen, Yi Mu, Guomin Yang, and Fuchun Guo,
"BL-MLE: Block-Level Message-Locked Encryption for Secure
Large File Deduplication," IEEE Transactions on Information
Forensics and Security, 2016.
[2] Pasquale Puzio, Refik Molva, Melek Önen, and Sergio
Loureiro, "ClouDedup: Secure Deduplication with
Encrypted Data for Cloud Storage," IEEE, 2012.
[3] Vinod B. Jadhav and Vinod S. Wadne, "Secured
Authorized De-duplication Based Hybrid Cloud
Approach," International Journal of Advanced Research in
Computer Science and Software Engineering, 2014.
[4] Aparna Ajit Patil and Dhanashree Kulkarni, "Block
Level Data Duplication on Hybrid Cloud Storage System,"
International Journal of Advanced Research in Computer
Science and Software Engineering, 2015.
[5] Chunlu Wang, Jun Ni, Tao Xu, and Dapeng Ju, "TH_Cloudkey:
Fast, Secure and lowcost backup system for using public
cloud storage," International Conference on Cloud and
Service Computing, 2013.

More Related Content

PDF
IRJET-Block-Level Message Encryption for Secure Large File to Avoid De-Duplic...
PDF
A Hybrid Cloud Approach for Secure Authorized De-Duplication
DOCX
Hybrid Cloud Approach for Secure Authorized Deduplication
DOCX
A Hybrid Cloud Approach for Secure Authorized Deduplication
PPTX
A hybrid cloud approach for secure authorized deduplication.
PDF
An Efficient PDP Scheme for Distributed Cloud Storage
PDF
IRJET - A Secure Access Policies based on Data Deduplication System
PDF
An4201262267
IRJET-Block-Level Message Encryption for Secure Large File to Avoid De-Duplic...
A Hybrid Cloud Approach for Secure Authorized De-Duplication
Hybrid Cloud Approach for Secure Authorized Deduplication
A Hybrid Cloud Approach for Secure Authorized Deduplication
A hybrid cloud approach for secure authorized deduplication.
An Efficient PDP Scheme for Distributed Cloud Storage
IRJET - A Secure Access Policies based on Data Deduplication System
An4201262267

What's hot (20)

DOCX
Secure distributed deduplication systems with improved reliability
PDF
A hybrid cloud approach for secure authorized deduplication
PDF
Review and Analysis of Self Destruction of Data in Cloud Computing
PDF
A Hybrid Cloud Approach for Secure Authorized Deduplication
DOCX
SECURE AUDITING AND DEDUPLICATING DATA IN CLOUD
PDF
IRJET- A Secure Erasure Code-Based Cloud Storage Framework with Secure Inform...
PDF
Cooperative Demonstrable Data Retention for Integrity Verification in Multi-C...
PPTX
Secure deduplicaton with efficient and reliable convergent
PDF
Secure distributed deduplication systems with improved reliability 2
PDF
An Optimal Cooperative Provable Data Possession Scheme for Distributed Cloud ...
PDF
Improving Data Storage Security in Cloud using Hadoop
PDF
Jj3616251628
PDF
IRJET- Improving Data Spillage in Multi-Cloud Capacity Administration
PDF
Improving Efficiency of Security in Multi-Cloud
PDF
A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...
PDF
IRJET- Privacy Preserving Cloud Storage based on a Three Layer Security M...
PDF
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
PDF
A Comparative Analysis of Additional Overhead Imposed by Internet Protocol Se...
PDF
Microservices Architecture with Vortex — Part II
DOC
126689454 jv6
Secure distributed deduplication systems with improved reliability
A hybrid cloud approach for secure authorized deduplication
Review and Analysis of Self Destruction of Data in Cloud Computing
A Hybrid Cloud Approach for Secure Authorized Deduplication
SECURE AUDITING AND DEDUPLICATING DATA IN CLOUD
IRJET- A Secure Erasure Code-Based Cloud Storage Framework with Secure Inform...
Cooperative Demonstrable Data Retention for Integrity Verification in Multi-C...
Secure deduplicaton with efficient and reliable convergent
Secure distributed deduplication systems with improved reliability 2
An Optimal Cooperative Provable Data Possession Scheme for Distributed Cloud ...
Improving Data Storage Security in Cloud using Hadoop
Jj3616251628
IRJET- Improving Data Spillage in Multi-Cloud Capacity Administration
Improving Efficiency of Security in Multi-Cloud
A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...
IRJET- Privacy Preserving Cloud Storage based on a Three Layer Security M...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
A Comparative Analysis of Additional Overhead Imposed by Internet Protocol Se...
Microservices Architecture with Vortex — Part II
126689454 jv6
Ad

Similar to Block-Level Message-Locked Encryption for Secure Large File De-duplication (20)

PDF
IRJET - Multi Authority based Integrity Auditing and Proof of Storage wit...
PDF
IJSRED-V2I2P10
PDF
IRJET- Data Security in Cloud Computing through AES under Drivehq
PDF
EXPLORING WOMEN SECURITY BY DEDUPLICATION OF DATA
PDF
IRJET- Secure Data Deduplication and Auditing for Cloud Data Storage
PDF
IRJET-2 Proxy-Oriented Data Uploading in Multi Cloud Storage
PDF
IRJET- Storage Security in Cloud Computing
PDF
International Journal of Computational Engineering Research(IJCER)
PDF
An Approach towards Shuffling of Data to Avoid Tampering in Cloud
PDF
Survey on Lightweight Secured Data Sharing Scheme for Cloud Computing
PDF
An efficient, secure deduplication data storing in cloud storage environment
PDF
IRJET - A Secure AMR Stganography Scheme based on Pulse Distribution Mode...
PDF
Implementation of De-Duplication Algorithm
PDF
IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...
PPTX
PUBLIC AUDITING FOR SECURE CLOUD STORAGE ...
PDF
Secure Data Sharing in Cloud Computing using Revocable Storage Identity- Base...
PDF
A Secure and Dynamic Multi Keyword Ranked Search over Encrypted Cloud Data
PDF
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
PDF
Efficient and Empiric Keyword Search Using Cloud
PDF
IRJET- Deduplication of Encrypted Bigdata on Cloud
IRJET - Multi Authority based Integrity Auditing and Proof of Storage wit...
IJSRED-V2I2P10
IRJET- Data Security in Cloud Computing through AES under Drivehq
EXPLORING WOMEN SECURITY BY DEDUPLICATION OF DATA
IRJET- Secure Data Deduplication and Auditing for Cloud Data Storage
IRJET-2 Proxy-Oriented Data Uploading in Multi Cloud Storage
IRJET- Storage Security in Cloud Computing
International Journal of Computational Engineering Research(IJCER)
An Approach towards Shuffling of Data to Avoid Tampering in Cloud
Survey on Lightweight Secured Data Sharing Scheme for Cloud Computing
An efficient, secure deduplication data storing in cloud storage environment
IRJET - A Secure AMR Stganography Scheme based on Pulse Distribution Mode...
Implementation of De-Duplication Algorithm
IRJET - A Novel Approach Implementing Deduplication using Message Locked Encr...
PUBLIC AUDITING FOR SECURE CLOUD STORAGE ...
Secure Data Sharing in Cloud Computing using Revocable Storage Identity- Base...
A Secure and Dynamic Multi Keyword Ranked Search over Encrypted Cloud Data
IRJET- A Survey on Remote Data Possession Verification Protocol in Cloud Storage
Efficient and Empiric Keyword Search Using Cloud
IRJET- Deduplication of Encrypted Bigdata on Cloud
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...

Recently uploaded (20)

PPTX
Lecture Notes Electrical Wiring System Components
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PPTX
web development for engineering and engineering
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
PPTX
Internet of Things (IOT) - A guide to understanding
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPTX
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
PPTX
OOP with Java - Java Introduction (Basics)
PPTX
Construction Project Organization Group 2.pptx
PDF
Well-logging-methods_new................
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PDF
Model Code of Practice - Construction Work - 21102022 .pdf
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PPTX
CH1 Production IntroductoryConcepts.pptx
PPTX
Geodesy 1.pptx...............................................
PPTX
Sustainable Sites - Green Building Construction
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
Lecture Notes Electrical Wiring System Components
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Foundation to blockchain - A guide to Blockchain Tech
web development for engineering and engineering
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
Embodied AI: Ushering in the Next Era of Intelligent Systems
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
Internet of Things (IOT) - A guide to understanding
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
OOP with Java - Java Introduction (Basics)
Construction Project Organization Group 2.pptx
Well-logging-methods_new................
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
Model Code of Practice - Construction Work - 21102022 .pdf
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
CH1 Production IntroductoryConcepts.pptx
Geodesy 1.pptx...............................................
Sustainable Sites - Green Building Construction
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT

Block-Level Message-Locked Encryption for Secure Large File De-duplication

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1187 Block-Level Message-Locked Encryption for Secure Large File De- duplication Bhagyashree Bhoyane1, Snehal Kalbhor2, Sneha Chamle3, Sandhya Itkapalle4 ,P. M. Gore5 1234 Student, Computer Department, Padmabhushan Vasantdada Patil Institute of Technology, Pune, Maharashtra 5 Professor, Computer Department, Padmabhushan Vasantdada Patil Institute of Technology, Pune, Maharashtra ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract: In order to reduce the burden of maintaining big data, more and more enterprises and organizations have chosen to outsource data storage to cloud storage providers. This makes data management a critical challenge for the cloud storage providers. Cloud computing is the long dreamed vision of computing as a utility. Besides all the benefits of the cloud computing security of the stored data need to be considered while storing sensitive data on cloud. Cloud users cannot rely only on cloud service provider for security of theirsensitivedata stored on cloud.To achieve optimal usage of storage resources, many cloud storage providers perform de-duplication, which exploits data redundancy and avoids storing duplicated data from multiple users.MLE scheme can be extended to obtain secure de-duplication for large files, it requires a lot of metadata maintained by the end user and the cloud server. The technique of Third Party Auditor (TPA) checks integrity ofdata stored on cloud for data owner. KEYWORDS: NLP (Natural language processing), Sentiment Analysis, synsets, Word Net 1. 
INTRODUCTION Reducing the burden of maintaining big data, more and more enterprises and organizations have chosen to outsource data storage to cloud storage providers. This makes data management a critical challenge for the cloud storage providers. To achieve optimal usage of storage resources, many cloud storage providers perform deduplication, which exploits data redundancy and avoids storing duplicated data from multiple users. In terms of deduplication granularity, there are two main deduplication strategies. File-level deduplication: the data redundancy is exploited on the file level and thus only a single copy of each file is stored on the server. Block-level deduplication: each file is divided into blocks, and the sever exploits data redundancy at the block level and hence performs a more fine-grained deduplication. In the traditional encryption providing dataconfidentiality,is contradictory deduplication occurs file level and block level. The duplicate copy of corresponding file eliminate by file level deduplication .For the block level duplication which eliminates duplicates blocks of data that occur in non- identical files. 2. RELATED WORK [1] Deduplication is a popular technique widely used to save storage spaces in the cloud. To achieve secure deduplication of encrypted files, Bellare et al. formal a new cryptographic primitive named Message-Locked Encryption (MLE) in Eurocrypt 2013. Although an MLE scheme can be extendedto obtain secure deduplication for large files, it requires a lot of metadata maintained by the end user and the cloud server. In this paper, we propose a new approach to achieve more efficient deduplication for (encrypted) large files. Our approach, named Block-Level Message-Locked Encryption (BL-MLE), can achieve file-levelandblock-leveldeduplication, block key management, and proof of ownership simultaneously using a small set of metadata. 
We also show that our BL-MLE scheme can be easily extended to support proof of storage, which makes it multi-purpose for secure cloud storage. [2] With the continuous and exponential increase of the number of usersand the size of their data, data deduplication becomes more and more a necessity for cloud storage providers. By storing a only one of its kind copy of duplicate data, cloud providers greatly shrink their storage and data transfer costs. The advantages of deduplicationunfortunately come with a high cost in terms of new security and privacy challenges. We advise ClouDedup, a safe and well-organized storage space check which assures block-level deduplication and data confidentiality at the same time. Although based on convergent encryption, ClouDedup remains secure thanks to the definition of a component that implements an additional encryption operation and an access control mechanism. [3] In this paper, we describe a COM between the interior enterprise application and public cloud storage platform which is closer to the client we called-- -Cloudkey, which is designed to take responsible for enterprise data backup business. Cloudkey stores data persistently in a cloud storage provider such as Amazon S3 or Windows Azure , allowing users to take advantages of the reliability and large storage capacity of cloud providers, also avoiding the need for dedicated server hardware. Clients access to the storage through Cloudkey running on-site, which provide lower- latency responses and additional opportunities for optimization through caches data. [4] Data deduplication is one of the most important data compression techniques, used forremovingidenticalcopiesof
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1188 repetitive data. For reduce duplication of data authorized duplication system is used. When a user uploads a file on the cloud, the file is split into a number of blocks, each block having a size of 4KB. Block is encrypted using a convergent key and subsequently a token is generated for it by using token generation algorithm. After encrypting the data using convergent key, users retain the key before sending the ciphertext to the cloud. Due to the deterministic nature of encryption, if identical data copies are uploaded the same convergent keys and the same cipher text will be produced thus preventing the deduplication of data. Each block is then compared with the database of cloud. After comparing, if a match is found in the cloud database then only metadata of the block is stored in DB profiler. This paper also prevents unauthorized access by using a secure proof of ownership protocol .The protocol uses authorize deduplicate check for hybrid cloud architecture. 3. PROPOSED SYSTEM Data Owner uploads document, metadata, checksumoncloud after encryption using keys from Data Owner and Cloud Service Provider. Also, a copy of metadata and checksum is sent to Auditor. Registered users send access request and receive encrypted file if authorized. User calculates checksum to compare with original and reports to Data Owner if checksum mismatch occurs. Avoid De-duplication Fig-1: Archtecture of Block-Level Message-Locked Encryption for Secure Large File De-duplication a. File Level b. Block Level Maintains the checksum of file data and block of file data and compare at the time of file upload to avoid De-duplication. Auditor Receivesmetadata after upload. 
Performsperiodicor on-Demand integrity checks by sending challenges to Cloud Service Provider. On response from Cloud Service Provider, Auditor confirmsresponse and reports statusto Data Owner. 4. PROPOSED METHOD Algorithm Used: 1. AES Algorithm a. Encryption You acquire the subsequent AES stepsofencryptionfora128- bit block: 1. Derive the set of round keys from the cipher key. 2. Initialize the state array with the block data (plaintext). 3. Add the initial round key to the starting state array. 4. Perform nine rounds of state manipulation. 5. Perform the tenth and final round of state manipulation. 6. Copy the final state array out as the encrypted data (cipher text). Each round of the encryption process requires a series of steps to alter the state array. These steps involve four types of operations called: 1. Sub-Bytes 2. Shift-Rows 3. Mix-Columns 4. Xor-Round Key b. Decryption As you might expect, decryption involves reversing all the steps taken in encryption using inverse functions: 1. InvSub-Bytes 2. InvShift-Rows 3. InvMix-Columns Operation in decryption is: 1. Perform initial decryption round:  Xor-Round Key  InvShift-Rows  InvSub-Bytes
   2. Perform nine full decryption rounds:
      • Xor-Round Key
      • InvMix-Columns
      • InvShift-Rows
      • InvSub-Bytes
   3. Perform the final Xor-Round Key.

2. RSA Algorithm
The keys for the RSA algorithm are generated in the following way:
   1. Choose two distinct prime numbers p and q.
      • For security purposes, the integers p and q should be chosen at random, and should be similar in magnitude but differ in length by a few digits to make factoring harder. Prime integers can be found efficiently using a primality test.
   2. Compute n = pq.
      • n is used as the modulus for both the public and private keys. Its length, usually expressed in bits, is the key length.
   3. Compute φ(n) = φ(p)φ(q) = (p − 1)(q − 1) = n − (p + q − 1), where φ is Euler's totient function. This value is kept private.
   4. Choose an integer e such that 1 < e < φ(n) and gcd(e, φ(n)) = 1; i.e., e and φ(n) are coprime.
   5. Determine d as d ≡ e^(−1) (mod φ(n)); i.e., d is the modular multiplicative inverse of e (modulo φ(n)).
      • This is more clearly stated as: solve for d given d⋅e ≡ 1 (mod φ(n)).
      • An e with a short bit-length and small Hamming weight results in more efficient encryption; the most common choice is 2^16 + 1 = 65,537. However, much smaller values of e (such as 3) have been shown to be less secure in some settings.
      • e is released as the public key exponent.
      • d is kept as the private key exponent.
The public key consists of the modulus n and the public (or encryption) exponent e. The private key consists of the modulus n and the private (or decryption) exponent d, which must be kept secret. p, q, and φ(n) must also be kept secret because they can be used to calculate d.

3. SHA-512 Algorithm
a.
Append Padding Bits and Length Value: This step makes the input message an exact multiple of 1024 bits.
b. Initialize Hash Buffer with Initialization Vector: Before the first message block can be processed, the hash buffer is initialized with the IV, the Initialization Vector.
c. Process Each 1024-bit (128-byte, i.e. sixteen 64-bit words) Message Block Mi: Each message block is taken through 80 rounds of processing.
d. Finally: After all N message blocks have been processed, the content of the hash buffer is the message digest.

5. CONCLUSION
Data de-duplication, the focus of the Block-Level Message-Locked Encryption for Secure Large File De-duplication system, is one of the most important data compression techniques, used for removing identical copies of repetitive data. To reduce data duplication, an authorized de-duplication system is used.

REFERENCES
[1] Rongmao Chen, Yi Mu, Guomin Yang, and Fuchun Guo, "BL-MLE: Block-Level Message-Locked Encryption for Secure Large File Deduplication," IEEE Transactions on Information Forensics and Security, 2016.
[2] Pasquale Puzio, Refik Molva, Melek Önen, and Sergio Loureiro, "ClouDedup: Secure Deduplication with Encrypted Data for Cloud Storage," IEEE, 2012.
[3] Vinod B. Jadhav and Vinod S. Wadne, "Secured Authorized De-duplication Based Hybrid Cloud Approach," International Journal of Advanced Research in Computer Science and Software Engineering, 2014.
[4] Aparna Ajit Patil and Dhanashree Kulkarni, "Block Level Data Duplication on Hybrid Cloud Storage System," International Journal of Advanced Research in Computer Science and Software Engineering, 2015.
[5] Chunlu Wang, Jun Ni, Tao Xu, and Dapeng Ju, "TH_Cloudkey: Fast, Secure and lowcost backup system for using public cloud storage," International Conference on Cloud and Service Computing, 2013.
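
For illustration, the block-level duplicate check described in the related work (4 KB blocks, a convergent key per block, and a token compared against the cloud database) can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the in-memory dictionary standing in for the cloud database, and the choice of deriving the token by hashing the key are all hypothetical. The AES encryption step is omitted because the Python standard library does not provide AES; only key derivation with SHA-512 and the duplicate lookup are shown.

```python
import hashlib

BLOCK_SIZE = 4 * 1024  # 4 KB blocks, as in the scheme described in the text


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a file's bytes into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def convergent_key(block: bytes) -> bytes:
    """Derive the key from the block itself, so equal blocks yield equal keys."""
    return hashlib.sha512(block).digest()


def block_token(key: bytes) -> str:
    """Hypothetical token: a hash of the key, used only for duplicate lookup."""
    return hashlib.sha512(b"token:" + key).hexdigest()


def upload(data: bytes, store: dict[str, int]) -> tuple[int, int]:
    """Register each block's token; return (new_blocks, duplicate_blocks)."""
    new = dup = 0
    for block in split_blocks(data):
        token = block_token(convergent_key(block))
        if token in store:
            store[token] += 1  # match found: only a reference count (metadata) changes
            dup += 1
        else:
            store[token] = 1   # no match: a real system would also store the ciphertext
            new += 1
    return new, dup


# Assumed stand-in for the cloud database: token -> number of references.
store: dict[str, int] = {}
file_a = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
file_b = b"A" * BLOCK_SIZE + b"C" * 100  # shares its first block with file_a
first = upload(file_a, store)
second = upload(file_b, store)
print(first, second)
```

Because the key depends only on the block contents, the shared first block of the two files maps to the same token, so the second upload registers one duplicate and one new block.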