International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 3155
A SECURE ACCESS POLICIES BASED ON DATA DEDUPLICATION SYSTEM
Gayathri.S1, Ragavi.P2, Srilekha.R3
1Asst. Professor, Dept. of Computer Science and Engineering, Jeppiaar SRR Engineering College, Padur.
2,3Final year student, Dept. of Computer Science and Engineering, Jeppiaar SRR Engineering College, Padur.
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Deduplication techniques are used to back up data on disk and to minimize storage overhead by detecting and eliminating redundancy among data. Eliminating duplicate copies of identical data is crucial for saving storage space and network bandwidth. This paper presents an attribute-based storage system with secure, access-policy-based data deduplication in a hybrid cloud setting that uses both a public and a private cloud: the private cloud detects duplicates while the public cloud maintains the storage. Instead of keeping multiple copies with the same content, the system removes surplus data by keeping only one physical copy and pointing other redundant copies to it. Each such copy is governed by user-defined access policies: a user uploads a file together with its access policy, and a later upload of the same file under a separate access policy is attached to the stored file by reference. The user's private key is associated with an attribute set, and a message is encrypted under an access policy over a set of attributes; a user can decrypt a ciphertext with a private key only if the attribute set satisfies the access policy associated with that ciphertext. Our system has two advantages: files are checked at two levels, file-level deduplication and signature-match checking, and the time, cost, and storage space of uploading and downloading in a Hadoop system are reduced.
Key Words: Hadoop Software, Ciphertext, Deduplication, Tomcat.
1. INTRODUCTION
The Hadoop software library permits the distributed processing of huge data sets across clusters of computers using simple programming models. It is designed to scale out from a single server to thousands of machines, each offering local computation and storage. Because it makes use of commodity hardware, Hadoop is highly scalable and fault tolerant. Hadoop YARN provides resource management and scheduling for user applications, and Hadoop MapReduce provides the programming model used to tackle large distributed data processing: mapping the data and reducing it to a result. In most companies, big data is processed by Hadoop by submitting jobs to the Master; this type of usage is best suited to highly scalable public cloud services. The Master distributes each job to its cluster and processes the map and reduce tasks sequentially. Nowadays, however, growing data volumes and competition between service providers lead to increased submission of jobs to the Master. This concurrent job submission on Hadoop forces us to schedule the Hadoop cluster so that the response time remains acceptable for each job.
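The map-and-reduce flow described above can be sketched in plain Java without any Hadoop dependency, using word count, the canonical MapReduce example (this is a minimal single-machine illustration; Hadoop runs the same two phases distributed across a cluster):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the MapReduce idea: the "map" phase emits (word, 1)
// pairs for every word in a line, and the "reduce" phase sums the counts
// per word. Here both phases are merged into one in-memory pass.
public class WordCountSketch {
    public static Map<String, Integer> wordCount(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum); // reduce step: sum per key
                }
            }
        }
        return counts;
    }
}
```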
2. PROPOSED SYSTEM
In this paper, we present an attribute-based storage system which uses ciphertext-policy attribute-based encryption (CP-ABE) to support secure deduplication and to enable deduplication and distributed storage of the data across HDFS. The storage system is built under a two-part hybrid cloud architecture, where a private cloud handles the computation for the user and a public cloud manages the storage. The private cloud is given a trapdoor key related to the corresponding ciphertext, with which it can transform a ciphertext under one access policy into ciphertexts of the same plaintext under other access policies without learning the underlying plaintext. After receiving a storage request from the user, the private cloud first checks the validity of the uploaded item through the attached proof. If the proof is valid, the private cloud runs a tag-matching algorithm to ascertain whether the same data underlying the ciphertext has already been stored.
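The tag-matching step can be sketched as follows (a minimal illustration; the class and method names, and the use of SHA-256 as the tag function, are our choices, not from the paper): the private cloud derives a deterministic tag from the data and checks it against the tags of items already stored.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Illustrative tag index for duplicate detection: the tag is a deterministic
// digest of the data, so two uploads of the same data map to the same
// stored object.
public class TagIndex {
    private final Map<String, String> tagToStorageId = new HashMap<>();

    // Deterministic tag: hex-encoded SHA-256 of the data.
    public static String tagOf(byte[] data) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : d) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the existing storage id on a duplicate, or records and
    // returns the new id on a first upload.
    public String putOrGet(byte[] data, String newStorageId) {
        String tag = tagOf(data);
        String existing = tagToStorageId.get(tag);
        if (existing != null) return existing;  // duplicate: reuse the old copy
        tagToStorageId.put(tag, newStorageId);  // first copy: store it
        return newStorageId;
    }
}
```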
3. HARDWARE AND SOFTWARE SPECIFICATION
HARDWARE REQUIREMENTS:
Hard disk : 500 GB and above
Processor : Intel i3 and above
RAM : 4 GB and above
SOFTWARE REQUIREMENTS:
Operating System : Windows 7 and above (64-bit)
Java Version : JDK 1.7
Web Server : Tomcat 7.0.11
Storage : Hadoop 2.7
4. TECHNOLOGY USED
• Java
• Cloud Computing
• Java Platform
5. APACHE TOMCAT SERVER
Apache Tomcat (formerly under the Apache Jakarta Project; Tomcat is now a top-level project) is a web container developed at the Apache Software Foundation. Tomcat implements the servlet and JSP specifications from Sun Microsystems, providing an environment for Java code to run in cooperation with a web server. It adds tools for configuration and management, but it can also be configured by editing configuration files that are normally XML-formatted. Because Tomcat includes its own HTTP server internally, it is also considered a standalone web server.
5.1 Purpose
The main aim of this project is to realize a new distributed deduplication system: we present an attribute-based storage system with secure deduplication in a hybrid cloud setting with higher reliability.
5.2 Project Scope
Deduplication techniques are employed to back up data and to minimize network and storage overhead by detecting and eliminating redundancy among the data in the storage area. This is crucial for eliminating duplicate copies of identical data in order to save storage space and network bandwidth. We present an attribute-based storage system with secure access-policy-based data deduplication in a hybrid cloud, where a private cloud is responsible for duplicate detection and a public cloud manages the storage. Instead of keeping multiple data copies with the same content, the technique eliminates redundant data by keeping only one physical copy and pointing other redundant copies to it.
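The "one physical copy plus references" idea above can be sketched in a few lines (an illustrative in-memory model; the class name and the use of `Arrays.hashCode` as a stand-in content key are ours, not from the paper):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Sketch of single-copy storage: identical content is stored once, and each
// logical file only keeps a reference to that physical copy. A real system
// would use a cryptographic digest as the content key; Arrays.hashCode is
// used here purely for brevity and can collide.
public class SingleCopyStore {
    private final Map<Integer, byte[]> physical = new HashMap<>();   // content key -> bytes
    private final Map<String, Integer> references = new HashMap<>(); // file name -> content key

    public void upload(String fileName, byte[] content) {
        int key = Arrays.hashCode(content);
        physical.putIfAbsent(key, content); // second upload of same bytes stores nothing new
        references.put(fileName, key);      // the file is just a reference
    }

    public int physicalCopies() { return physical.size(); }
    public int logicalFiles()   { return references.size(); }
}
```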
6. ALGORITHMS USED
• MD5 algorithm
• Base64 encoding
• RSA algorithm
7. SYSTEM DESIGN
7.1 Uploading a File
In this module, cloud user first register the user details and
then login the user credential details. Once user name and
password is valid open the user profile screen are going to
be displayed. A user is an entity whowantstooutsourcedata
storage to the HDFS Storage and access the info later. A user
register to the HDFS storage with necessaryinformationand
login the page for uploading the file. User chooses the file of
data and uploads to Storage in the given space where the
HDFS store the file in rapid storage system and file level
deduplication is checked.
7.2 Mastering File to HDFS Storage
The file is tagged using the MD5 message-digest algorithm, a cryptographic hash function producing a 128-bit hash value, typically expressed in text format as a 32-digit hexadecimal value, so that identical files are deduplicated. The chosen file is chunked into fixed-size blocks, and a tag is generated for each chunked block. Convergent keys are then generated for every split block to verify block-level deduplication. A filename and password are provided for future file authorization. The blocks are encrypted with the Triple Data Encryption Standard (3DES) algorithm: the plaintext is encoded three times with the convergent key, and decoding the original content likewise requires the same key applied three times. Finally, the original content is encrypted as ciphertext and stored in the slave system. Blocks are stored with distributed HDFS storage providers.
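The MD5 tagging and fixed-size chunking steps above can be sketched as follows (the class name, method names, and block size are our choices for illustration):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// MD5 file tag (128 bits rendered as 32 hex digits) and fixed-size chunking.
public class FileChunker {
    public static String md5Tag(byte[] data) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : d) sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Split data into fixed-size blocks; only the last block may be shorter.
    public static List<byte[]> chunk(byte[] data, int blockSize) {
        List<byte[]> blocks = new ArrayList<>();
        for (int off = 0; off < data.length; off += blockSize) {
            int len = Math.min(blockSize, data.length - off);
            byte[] block = new byte[len];
            System.arraycopy(data, off, block, 0, len);
            blocks.add(block);
        }
        return blocks;
    }
}
```

For example, `md5Tag("abc".getBytes())` yields the well-known RFC 1321 test vector `900150983cd24fb0d6963f7d28e17f72`.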
7.3 Enabling Deduplication Method
After encryption, the convergent keys are securely shared from the slave machine providers to the key management machines. The key management slave checks duplicate copies of the data against the convergent keys in the KMCSP. The key management slave maintains a comma-separated-values (CSV) file to record proof of verification of the data and to store the keys securely. The various users who share common keys are referred to by their own ownership. A user requesting deletion must first prove ownership in order to delete their own contents.
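The convergent-key property that makes this sharing work can be sketched as follows: the key is derived from the content itself, so two users holding the same block derive the same key and hence the same ciphertext, which is what makes deduplication of encrypted data possible. (The XOR "cipher" below is a toy stand-in for the 3DES step in the paper and is not secure; the class and method names are ours.)

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Convergent-key sketch: key = hash(content), so identical blocks always
// produce identical keys and identical ciphertexts across users.
public class ConvergentKey {
    public static byte[] keyFor(byte[] block) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(block);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Toy keystream cipher for illustration only. XOR is its own inverse,
    // so applying it twice with the same key recovers the plaintext.
    public static byte[] xorWithKey(byte[] block, byte[] key) {
        byte[] out = new byte[block.length];
        for (int i = 0; i < block.length; i++) {
            out[i] = (byte) (block[i] ^ key[i % key.length]);
        }
        return out;
    }
}
```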
7.4 Hash Value Based Decryption
In the final module, the user requests to download the document they have uploaded to HDFS storage. This download request requires proper verification of ownership of the data: the ownership proof is created from the unique tag generated by the MD5 algorithm and verified against the user's existing tag. After verification, the original content is decrypted by querying the distributed HDFS storage, where the HDFS storage requests the keys from the key management slave to decrypt the given data, and finally the original content is delivered to the user. A delete request deletes only the requesting user's reference to content shared by common users, not the whole content.
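The reference-only deletion described above can be sketched with a reference count per content tag (an illustrative model; the class and method names are ours, not from the paper): a delete request drops one user's reference, and the shared content itself becomes deletable only when no references remain.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of reference-counted deletion for deduplicated content.
public class RefCountedContent {
    private final Map<String, Integer> refs = new HashMap<>(); // tag -> reference count

    public void addReference(String tag) {
        refs.merge(tag, 1, Integer::sum);
    }

    // Returns true only when the last reference is gone, meaning the
    // physical content itself may now be deleted.
    public boolean deleteReference(String tag) {
        Integer n = refs.get(tag);
        if (n == null) return false;                       // nothing to delete
        if (n > 1) { refs.put(tag, n - 1); return false; } // others still share it
        refs.remove(tag);
        return true;                                       // no owners left
    }
}
```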
Fig -1: Block diagram
8. CONCLUSIONS
In this project, new distributed deduplication systems with file-level and fine-grained block-level data deduplication were achieved, with higher reliability (the data chunks are distributed across HDFS storage), reliable key management in secure deduplication, and security of tag consistency and integrity. The purpose of the software requirements specification is to provide a detailed overview of the software project, its purpose, its parameters, and its goals. It describes the project's target audience, its user interface, and the hardware and software requirements of the given data, and it defines how the client, team, and audience see the project and its functionality.