International Journal of Trend in Scientific Research and Development (IJTSRD)
Volume 6 Issue 2, January-February 2022 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
@ IJTSRD | Unique Paper ID – IJTSRD49416 | Volume – 6 | Issue – 2 | Jan-Feb 2022 Page 1331
File Sharing and Data Duplication Removal in
Cloud Using File Checksum
Gopi B, Murugan R
School of Computer Science and Information Technology,
Jain (deemed to be University), Bangalore, Karnataka, India
ABSTRACT
Data duplication removal uses a file checksum technique to identify
duplicate or redundant data rapidly and accurately. The chance of an
inaccurate result can be reduced by comparing the checksum of an
already existing file with that of the newly uploaded file. A file
can be stored with multiple attributes such as file name, date and
time, checksum, user id, and so on. When a user uploads a new file,
the system generates the checksum of the file and compares it with
the checksums of files that have already been stored. If a match is
found, the system updates the old entry; otherwise, a new entry is
created in the database.
KEYWORDS: Database, Duplication, Entity, Data, Checksum,
Redundant, User id
How to cite this paper: Gopi B | Murugan R "File Sharing and Data
Duplication Removal in Cloud Using File Checksum", Published in
International Journal of Trend in Scientific Research and Development
(ijtsrd), ISSN: 2456-6470, Volume-6 | Issue-2, February 2022,
pp.1331-1333, URL:
www.ijtsrd.com/papers/ijtsrd49416.pdf
Copyright © 2022 by author(s) and International Journal of Trend in
Scientific Research and Development Journal. This is an Open Access
article distributed under the terms of the Creative Commons
Attribution License (CC BY 4.0)
(http://creativecommons.org/licenses/by/4.0)
1. INTRODUCTION
A collection of information is known as data, and the amount of data
in the digital universe is increasing constantly. One study suggested
that by the end of 2020 each person would create about 1.7 megabytes
of data every second, and that roughly 2.5 quintillion bytes of data
are produced each day. The main reasons behind this growth of
duplicate data are:
Multiple backups of the same data or file by a single person.
Misuse of social media.
The attacks on organizational systems around 9/11 and the loss of
data caused by illegal activity proved that data loss is a major
problem for organizations. These events forced organizations to
implement data backup systems in order to preserve their important
data. Organizations started keeping regular backups of data such as
email, video, and audio, which increased their storage requirements.
While backing up data regularly, they end up storing the same
duplicate data multiple times, which wastes storage.
As data increases constantly, storing and managing it becomes more
difficult. More data requires more storage, and more storage means
more cost, since hardware or storage units must be added. Simply
adding storage units is not a solution, because it is unclear how
much storage will eventually be needed, and adding more units makes
the system bulky and more costly.
The solution to this problem is the proper implementation of a data
duplication removal system. A data duplication removal method stores
a data item or file only if it has not been stored previously; if a
match is found, it updates the old entry instead. Such a system
removes duplicate data quickly and saves precious storage.
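The store-or-update flow described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the in-memory dictionary standing in for the database, and the `upload` helper and its field names, are assumptions made for the example.

```python
import hashlib

# Illustrative in-memory "database": checksum -> file metadata.
db = {}

def upload(filename, data, user_id):
    """Store a file only if its checksum is new; otherwise update the old entry."""
    checksum = hashlib.md5(data).hexdigest()
    if checksum in db:
        # Duplicate found: refresh the existing entry instead of storing the bytes again.
        db[checksum]["filenames"].add(filename)
        db[checksum]["users"].add(user_id)
        return "updated"
    # No match: create a new entry keyed by the checksum.
    db[checksum] = {"filenames": {filename}, "users": {user_id}, "data": data}
    return "created"
```

Uploading the same bytes twice, even under different file names or users, touches only one stored copy; only the metadata entry is updated.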
2. SURVEY MOTIVATION
"Di Pietro, Roberto, and Alessandro Sorniotti"
discussed the security concern raised by de-
duplication and to address this security concern the
author utilizes the idea of Proof of Ownership
(POW). POW are intended to permit server to verify
whether a client possesses a file or not.
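The idea behind such a scheme can be sketched with a simple challenge over random blocks of the file. This is only an illustration of the concept, not Di Pietro and Sorniotti's actual protocol; the block size, number of challenged blocks, and function names are assumptions.

```python
import hashlib
import hmac
import os

BLOCK = 4096  # illustrative block size in bytes

def respond(file_bytes, offsets):
    """Client side: hash the requested blocks to prove possession of the file."""
    h = hashlib.sha256()
    for off in offsets:
        h.update(file_bytes[off:off + BLOCK])
    return h.hexdigest()

def challenge(file_bytes, num_blocks=3):
    """Server side: pick random block offsets and precompute the expected answer."""
    limit = max(1, len(file_bytes) - BLOCK)
    offsets = [int.from_bytes(os.urandom(4), "big") % limit for _ in range(num_blocks)]
    return offsets, respond(file_bytes, offsets)

def verify(expected, answer):
    # Constant-time comparison of the client's answer with the expected digest.
    return hmac.compare_digest(expected, answer)
```

A client that knows only the file's checksum, but not its contents, cannot answer a fresh challenge, which is exactly the gap PoW closes in checksum-based de-duplication.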
According to Atish Kathpal, Matthew John and Gaurav Makkar, data
duplication removal is the method of eliminating duplicate data from
storage devices in order to minimize memory consumption. Although the
concepts were sound, their system could not work as intended due to
poor management of the hardware devices, and it was not easy to use,
which resulted in under-performance of the system.
2.1. GOAL
Much work has been done in the past to address the storage problem
caused by data duplication. Data duplication has been a major
problem, and the technologies developed earlier were unable to solve
it due to improper management of the technology.
2.2. LIMITATION
More processing time.
Chance of false results.
Not user friendly.
Difficult system maintenance.
2.3. KEYWORDS
Cloud computing, data storage, file checksum
algorithms, computational infrastructure, duplication.
3. SURVEY OUTCOMES
Data duplication increases the amount of unwanted data in a storage
unit by storing multiple copies of the same file. The data
duplication removal technique uses a file checksum to find duplicate
or redundant data quickly. The technique calculates the checksum of a
file when it is uploaded and compares the newly calculated checksum
with the checksums of files already stored in the database. If the
file is already present, the system modifies the existing entry;
otherwise, it creates a new entry for the file. This system uses the
MD-5 hash algorithm to detect duplicate files. MD-5 (Message Digest
algorithm 5) is a 128-bit hash algorithm.
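Computing the MD-5 checksum of an uploaded file can be sketched as follows; reading in fixed-size chunks (the 8 KB chunk size is an assumption for the example) avoids loading large files into memory at once.

```python
import hashlib

def file_checksum(path, chunk_size=8192):
    """Compute the 128-bit MD-5 digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()  # 32 hex characters = 128 bits
```

The returned hex string is what would be compared against the checksums already stored in the database to decide between updating an entry and creating a new one.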
Advantages:
Faster file searching.
Reduced storage space by eliminating data redundancy.
Easy file download and upload.
4. CONCLUSION
This technique focuses on developing a web-based application that can
find redundant data quickly and easily using a file checksum. The
Message Digest (MD-5) algorithm is used to calculate the checksums of
already existing files and of newly uploaded files. MD-5 serves both
to calculate the checksum and to provide integrity protection for the
valuable files of users. Hence, this system removes duplicate files
easily and quickly while providing better security.
5. REFERENCES
[1] Di Pietro, Roberto and Alessandro Sorniotti, "Proof of ownership
for de-duplication systems: A secure, scalable, and efficient
solution", Computer Communications, 15 May 2016.
[2] M. Bellare, S. Keelveedhi, and T. Ristenpart, "DupLESS:
Server-aided encryption for deduplicated storage", USENIX Security
Symposium, 2013.
[3] Harnik, Danny, Alexandra Shulman-Peleg and Benny Pinkas, "Side
channels in cloud services: the case of deduplication in cloud
storage", IEEE Security & Privacy 8, 2014.
[4] Atish Kathpal, Matthew John and Gaurav Makkar, "Distributed
Duplicate Detection in Post-Process Data De-duplication", HiPC, 2011.
[5] X. Zhao, Y. Zhang, Y. Wu, K. Chen, J. Jiang, K. Li, "Liquid: A
Scalable Deduplication File System for Virtual Machine Images", IEEE
Transactions on Parallel and Distributed Systems, January 2013.
[6] Stephen J. Bigelow, "Data Deduplication Explained",
http://searchgate.org; accessed February 2018.
[7] http://www.computerweekly.com/report/Data-duplication-technology-review
[8] https://nevonprojects.com
[9] Morris Dworkin, "NIST Policy on Hash Functions", Cryptographic
Technology Group,
https://csrc.nist.gov/projects/hash-functions/nist-policy-on-hash-functions,
August 5, 2015.
[10] Nimala Bhadrappa, Mamatha G. S., "Implementation of
De-Duplication Algorithm", International Research Journal of
Engineering and Technology (IRJET), Volume 04, Issue 09, 2017.
[11] O'Brien, J. A. & Marakas, G. M. (2011). Computer Software.
Management Information Systems, 10th ed., p. 145. McGraw-Hill/Irwin.
[12] Peter Mell, "The NIST Definition of Cloud Computing", National
Institute of Standards and Technology, NIST Special Publication
800-145.
[13] PHP 5 tutorials, W3Schools,
https://www.w3schools.com/pHP/default.asp; accessed June 2018.
[14] Rivest R., "The MD5 Message-Digest Algorithm", RFC 1321, 1992,
http://www.ietf.org/rfc/rfc321.txt
[15] Sandeep Sharma, "15 Best PHP Libraries Every Developer Should
Know", 2015,
https://www.programmableweb.com/news/15-best-php-libraries-every-developer-should-know/analysis/2015/11/18;
accessed June 12, 2018.
[16] "Single Instance Storage in Microsoft Windows Storage Server
2003 R2", Technical White Paper, May 2006; archived 2007-01-04 at the
Wayback Machine, https://archive.org/web; accessed September
2018.
[17] Stephen J. Bigelow, "Data Deduplication Explained", 2007,
http://searchgate.org; accessed February 2018.
[18] Wenying Zeng, Yuelong K. O, Wei S., "Research on Cloud Storage
Architecture and Key Technologies", Proceedings of the 2nd
International Conference on Interaction Sciences: Information
Technology, Culture and Human (ICIS 2009), pp. 1044-1048, 2009.
[19] "What is PHP?", PHP Manual,
http://php.net/manual/en/intro-whatis.php; accessed June 6,
2018.
[20] X. Zhao, Y. Zhang, Y. Wu, K. Chen, J. Jiang, K. Li, "Liquid: A
scalable deduplication file system for virtual machine images", IEEE
Transactions on Parallel and Distributed Systems, vol. 25, no. 5, pp.
1257-1266, May 2014.
