A SECURE AND DYNAMIC MULTI-KEYWORD RANKED SEARCH
SCHEME OVER ENCRYPTED CLOUD DATA
Abstract: Due to the increasing popularity of cloud computing, more and more
data owners are motivated to outsource their data to cloud servers for greater
convenience and reduced cost in data management. However, sensitive data should
be encrypted before outsourcing to meet privacy requirements, which renders
conventional data utilization such as keyword-based document retrieval unusable.
In this paper, we present a secure multi-keyword ranked search scheme over
encrypted cloud data, which simultaneously supports dynamic update operations
such as deletion and insertion of documents. Specifically, the vector space
model and the widely used TF×IDF model are combined in the index construction
and query generation. We construct a special tree-based index structure and
propose a “Greedy Depth-first Search” algorithm to provide efficient
multi-keyword ranked search. The secure kNN algorithm is used to encrypt the
index and query vectors while still ensuring accurate relevance score
calculation between encrypted index and query vectors. To resist statistical
attacks, phantom terms are added to the index vector to blind the search
results. Thanks to the special tree-based index structure, the proposed scheme
achieves sub-linear search time and handles the deletion and insertion of
documents flexibly. Extensive experiments are conducted to demonstrate the
efficiency of the proposed scheme.
EXISTING SYSTEM:
Searchable encryption schemes enable clients to store encrypted data in the
cloud and execute keyword search over the ciphertext domain. Depending on the
cryptographic primitives used, searchable encryption schemes can be constructed
with public-key or symmetric-key cryptography. Song et al. proposed the first
symmetric searchable encryption (SSE) scheme, whose search time is linear in the
size of the data collection. Goh proposed formal security definitions for SSE
and designed a scheme based on Bloom filters. The search time of Goh’s scheme is
O(n), where n is the cardinality of the document collection. Curtmola et al.
proposed two schemes (SSE-1 and SSE-2) which achieve the optimal search time.
Their SSE-1 scheme is secure against chosen-keyword attacks (CKA1) and SSE-2 is
secure against adaptive chosen-keyword attacks (CKA2). These schemes are very
simple in terms of functionality. Afterward, many works have been proposed under
different threat models to achieve various search functionality, such as single
keyword search, similarity search, multi-keyword boolean search, ranked search
and multi-keyword ranked search. Multi-keyword boolean search allows users to
input multiple query keywords to request suitable documents. Among these works,
conjunctive keyword search schemes return only the documents that contain all of
the query keywords. Disjunctive keyword search schemes return all of the
documents that contain a subset of the query keywords. Predicate search schemes
are proposed to support both conjunctive and disjunctive search. All these
multi-keyword search schemes retrieve search results based on the existence of
keywords, and therefore cannot provide acceptable result ranking functionality.
Ranked search can enable quick retrieval of the most relevant data. Sending back
only the top-k most relevant documents can effectively decrease network traffic.
Some early works have realized ranked search using order-preserving techniques,
but they are designed only for single keyword search. Cao et al. realized the
first privacy-preserving multi-keyword ranked search scheme, in which documents
and queries are represented as vectors of dictionary size. With “coordinate
matching”, the documents are ranked according to the number of matched query
keywords. However, Cao et al.’s scheme does not consider the importance of the
different keywords, and thus is not accurate enough. In addition, the search
time of the scheme is linear in the cardinality of the document collection. Sun
et al. presented a secure multi-keyword search scheme that supports
similarity-based ranking. The authors constructed a searchable index tree based
on the vector space model and adopted the cosine measure together with TF×IDF to
provide ranking results.
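To make the contrast between “coordinate matching” and TF×IDF ranking concrete, the following is a minimal plaintext sketch. The function names, the toy collection, and the particular weighting (1 + ln tf for term frequency, ln(1 + N/df) for inverse document frequency) are assumptions chosen for illustration; the text above does not fix a specific formula.

```python
import math

def coordinate_matching(doc_terms, query):
    """Score = number of query keywords that occur in the document."""
    return sum(1 for w in query if w in doc_terms)

def tf_idf_score(doc_terms, query, collection):
    """Score = sum over query keywords of TF weight times IDF weight.

    The weighting used here is one common variant, assumed for illustration.
    """
    n = len(collection)
    score = 0.0
    for w in query:
        tf = doc_terms.count(w)                        # occurrences in this document
        df = sum(1 for d in collection if w in d)      # documents containing w
        if tf and df:
            score += (1 + math.log(tf)) * math.log(1 + n / df)
    return score

# Hypothetical toy collection: each document is a list of tokens.
collection = [
    ["cloud", "search", "search", "search"],
    ["cloud", "search"],
    ["cloud", "index"],
]
query = ["cloud", "search"]

for doc in collection:
    print(coordinate_matching(doc, query), round(tf_idf_score(doc, query, collection), 3))
```

In this toy collection the first two documents tie under coordinate matching because both contain both keywords, while the TF×IDF score separates them because “search” occurs more often in the first one.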
PROPOSED SYSTEM:
Thanks to the special structure of our tree-based index, the proposed search
scheme achieves sub-linear search time and handles the deletion and insertion of
documents flexibly. The secure kNN algorithm is used to encrypt the index and
query vectors while still ensuring accurate relevance score calculation between
encrypted index and query vectors. To resist different attacks in different
threat models, we construct two secure search schemes: the basic dynamic
multi-keyword ranked search (BDMRS) scheme in the known ciphertext model, and
the enhanced dynamic multi-keyword ranked search (EDMRS) scheme in the known
background model. Our contributions are summarized as follows:
1) We design a searchable encryption scheme that supports both accurate
multi-keyword ranked search and flexible dynamic operations on the document
collection.
2) Due to the special structure of our tree-based index, the search complexity
of the proposed scheme is fundamentally kept logarithmic. In practice, the
proposed scheme can achieve higher search efficiency by executing our “Greedy
Depth-first Search” algorithm. Moreover, parallel search can be flexibly
performed to further reduce the time cost of the search process.
Module 1
The System and Threat Models
The system model in this paper involves three different entities: data owner,
data user and cloud server.
Data owner has a collection of documents F = {f1, f2, ..., fn} that he wants to
outsource to the cloud server in encrypted form while still keeping the
capability to search on them for effective utilization. In our scheme, the data
owner first builds a secure searchable tree index I from the document collection
F, and then generates an encrypted document collection C for F. Afterwards, the
data owner outsources the encrypted collection C and the secure index I to the
cloud server, and securely distributes the key information for trapdoor
generation (including keyword IDF values) and document decryption to the
authorized data users. In addition, the data owner is responsible for updating
his documents stored on the cloud server. When updating, the data owner
generates the update information locally and sends it to the server.
Data users are parties authorized to access the documents of the data owner.
With t query keywords, an authorized user can generate a trapdoor TD according
to the search control mechanisms to fetch k encrypted documents from the cloud
server. Then, the data user can decrypt the documents with the shared secret
key.
Cloud server stores the encrypted document collection C and the encrypted
searchable tree index I for the data owner. Upon receiving the trapdoor TD from
the data user, the cloud server executes the search over the index tree I, and
finally returns the corresponding collection of top-k ranked encrypted
documents. In addition, upon receiving the update information from the data
owner, the server needs to update the index I and the document collection C
according to the received information. The cloud server in the proposed scheme
is considered “honest-but-curious”, an assumption adopted by many works on
secure cloud data search. Specifically, the cloud server honestly and correctly
executes instructions in the designated protocol. Meanwhile, it is curious to
infer and analyze the received data, which helps it acquire additional
information. Depending on what information the cloud server knows, we adopt the
two threat models proposed by Cao et al.
Known Ciphertext Model. In this model, the cloud server only knows the
encrypted document collection C, the searchable index tree I, and the search
trapdoor TD submitted by the authorized user. That is to say, the cloud server
can conduct only a ciphertext-only attack (COA) in this model.
Known Background Model. Compared with the known ciphertext model, the cloud
server in this stronger model is equipped with more knowledge, such as the term
frequency (TF) statistics of the document collection. This statistical
information records how many documents there are for each term frequency of a
specific keyword in the whole document collection, as shown in Fig. 2, and could
be used as the identity of that keyword. Equipped with such statistical
information, the cloud server can conduct a TF statistical attack to deduce or
even identify certain keywords by analyzing the histogram and value range of the
corresponding frequency distributions.
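As a rough illustration of the statistic that the known background model assumes, the sketch below counts, for one keyword, how many documents exhibit each term frequency. The function name and the toy documents are hypothetical, and raw occurrence counts are used as an assumption; a deployed scheme may work with normalized TF values instead.

```python
from collections import Counter

def tf_distribution(docs, keyword):
    """Map each term frequency of `keyword` to the number of documents having it.

    Documents that do not contain the keyword are left out of the histogram.
    """
    return Counter(doc.count(keyword) for doc in docs if keyword in doc)

# Hypothetical toy collection: each document is a list of tokens.
docs = [
    ["cloud", "cloud", "data"],
    ["cloud", "key"],
    ["data"],
    ["cloud", "cloud"],
]

hist = tf_distribution(docs, "cloud")
for tf, num_docs in sorted(hist.items()):
    print(f"TF={tf}: {num_docs} document(s)")
```

Different keywords tend to produce differently shaped histograms, which is why such a distribution can act as a fingerprint once relevance scores leak exact TF values.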
Module 2
Design Goals
To enable secure, efficient, accurate and dynamic multi-keyword ranked search
over outsourced encrypted cloud data under the above models, our system has the
following design goals.
Dynamic: The proposed scheme is designed to provide not only multi-keyword
queries and accurate result ranking, but also dynamic updates on document
collections.
Search Efficiency: The scheme aims to achieve sub-linear search efficiency by
exploiting a special tree-based index and an efficient search algorithm.
Privacy-preserving: The scheme is designed to prevent the cloud server from
learning additional information about the document collection, the index tree,
and the query. The specific privacy requirements are summarized as follows:
1) Index Confidentiality and Query Confidentiality: The underlying plaintext
information, including keywords in the index and query, TF values of keywords
stored in the index, and IDF values of query keywords, should be protected from
the cloud server;
2) Trapdoor Unlinkability: The cloud server should not be able to determine
whether two encrypted queries (trapdoors) are generated from the same search
request;
3) Keyword Privacy: The cloud server should not be able to identify a specific
keyword in the query, index or document collection by analyzing statistical
information such as term frequency. Note that our proposed scheme is not
designed to protect the access pattern, i.e., the sequence of returned
documents.
Module 3
Search Process of UDMRS Scheme
The search process of the unencrypted dynamic multi-keyword ranked search
(UDMRS) scheme is a recursive procedure on the tree, named the “Greedy
Depth-first Search (GDFS)” algorithm. We construct a result list denoted as
RList, whose elements are defined as ⟨RScore, FID⟩. Here, RScore is the
relevance score of the document f_FID to the query, which is calculated
according to Formula (1). The RList stores the k accessed documents with the
largest relevance scores to the query. The elements of the list are ranked in
descending order of RScore and are updated as the search proceeds. The
following notation is used:
• RScore(Du, Q) – The function to calculate the relevance score for query
vector Q and index vector Du stored in node u.
• kthscore – The smallest relevance score in the current RList, which is
initialized as 0.
• hchild – The child node of a tree node with the higher relevance score.
• lchild – The child node of a tree node with the lower relevance score.
Since the largest possible relevance score of the documents in the subtree
rooted at node u can be predicted, only a part of the nodes in the tree are
accessed during the search process.
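The search procedure described above can be sketched in plaintext as follows. This is an illustrative sketch under stated assumptions, not the authors' exact algorithm: it assumes a binary tree whose leaves hold document vectors and whose internal nodes hold the element-wise maximum of their children's vectors, that all vector and query entries are non-negative (so an internal node's score bounds every score below it), and that RScore is an inner product. The names `Node`, `build_tree`, `kth_score` and `gdfs` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Node:
    vec: List[float]
    fid: Optional[int] = None          # document id, set only on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build_tree(docs: List[Tuple[int, List[float]]]) -> Node:
    """Pair nodes bottom-up; each parent stores the element-wise max of its children."""
    nodes = [Node(vec=list(v), fid=fid) for fid, v in docs]
    while len(nodes) > 1:
        parents = []
        for i in range(0, len(nodes) - 1, 2):
            l, r = nodes[i], nodes[i + 1]
            parents.append(Node(vec=[max(a, b) for a, b in zip(l.vec, r.vec)], left=l, right=r))
        if len(nodes) % 2:
            parents.append(nodes[-1])  # an unpaired node moves up unchanged
        nodes = parents
    return nodes[0]

def rscore(vec: List[float], q: List[float]) -> float:
    return sum(a * b for a, b in zip(vec, q))

def kth_score(rlist: List[Tuple[float, int]], k: int) -> float:
    """Smallest score in a full RList, or 0 while fewer than k results are held."""
    return rlist[-1][0] if len(rlist) == k else 0.0

def gdfs(node: Node, q: List[float], k: int, rlist: List[Tuple[float, int]]) -> None:
    score = rscore(node.vec, q)
    if score <= kth_score(rlist, k):
        return                          # nothing in this subtree can enter the top k
    if node.fid is not None:            # leaf: replace the current k-th result
        if len(rlist) == k:
            rlist.pop()
        rlist.append((score, node.fid))
        rlist.sort(key=lambda e: e[0], reverse=True)
        return
    hchild, lchild = sorted((node.left, node.right),
                            key=lambda c: rscore(c.vec, q), reverse=True)
    gdfs(hchild, q, k, rlist)           # visit the more promising child first
    gdfs(lchild, q, k, rlist)

docs = [(1, [0.5, 0.0, 0.1]), (2, [0.0, 0.9, 0.0]), (3, [0.3, 0.3, 0.3]),
        (4, [0.0, 0.0, 0.8]), (5, [0.7, 0.1, 0.0])]
query = [1.0, 0.0, 0.5]
rlist: List[Tuple[float, int]] = []
gdfs(build_tree(docs), query, 2, rlist)
print([fid for _, fid in rlist])
```

Visiting the higher-scoring child first tends to raise kthscore early, which lets more of the remaining subtrees be skipped.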
Module 4
BDMRS Scheme
Based on the UDMRS scheme, we construct the basic dynamic multi-keyword ranked
search (BDMRS) scheme by using the secure kNN algorithm. The BDMRS scheme is
designed to achieve the goal of privacy preservation in the known ciphertext
model, and its four algorithms are described as follows:
• SK ← Setup() Initially, the data owner generates the secret key set SK,
including 1) a randomly generated m-bit vector S, where m is equal to the
cardinality of the dictionary, and 2) two (m×m) invertible matrices M1 and M2.
Namely, SK = {S, M1, M2}.
• I ← GenIndex(F, SK) First, the unencrypted index tree T is built on F by using
T ← BuildIndexTree(F). Second, the data owner generates two random vectors
{Du′, Du′′} for the index vector Du in each node u, according to the secret
vector S. Specifically, if S[i] = 0, Du′[i] and Du′′[i] are both set equal to
Du[i]; if S[i] = 1, Du′[i] and Du′′[i] are set to two random values whose sum
equals Du[i]. Finally, the encrypted index tree I is built, where node u stores
the two encrypted index vectors Iu = {M1^T Du′, M2^T Du′′}.
• TD ← GenTrapdoor(Wq, SK) With keyword set Wq, the unencrypted query vector Q
of length m is generated. If wi ∈ Wq, Q[i] stores the normalized IDF value of
wi; otherwise Q[i] is set to 0. Similarly, the query vector Q is split into two
random vectors Q′ and Q′′. The difference is that if S[i] = 0, Q′[i] and Q′′[i]
are set to two random values whose sum equals Q[i]; otherwise Q′[i] and Q′′[i]
are both set equal to Q[i]. Finally, the algorithm returns the trapdoor
TD = {M1^(-1) Q′, M2^(-1) Q′′}.
• RelevanceScore ← SRScore(Iu, TD) With the trapdoor TD, the cloud server
computes the relevance score of node u in the index tree I to the query. Note
that the relevance score calculated from the encrypted vectors is equal to that
calculated from the unencrypted vectors.
Module 5
EDMRS Scheme
The BDMRS scheme can protect Index Confidentiality and Query Confidentiality in
the known ciphertext model. However, the cloud server is able to link identical
search requests by tracking the path of visited nodes. In addition, in the known
background model, it is possible for the cloud server to identify a keyword,
because the normalized TF distribution of the keyword can be obtained exactly
from the final calculated relevance scores. The primary cause is that the
relevance score calculated from Iu and TD is exactly equal to that calculated
from Du and Q. A heuristic method to further improve security is to break this
exact equality. Thus, we introduce some tunable randomness to disturb the
relevance score calculation. In addition, to suit different users’ preferences
for more accurate ranked results or better protected keyword privacy, the
amount of randomness is adjustable.
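One way to picture such tunable randomness is the phantom-term idea mentioned in the abstract: extra dimensions holding random values are appended to each index vector, and each query randomly switches some of them on. The sketch below works on plaintext vectors only. The uniform distribution, the `spread` parameter and all function names are assumptions made for illustration; the text above does not specify how the random values are drawn.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def extend_index_vector(d, num_phantom, spread, rng):
    """Append phantom entries drawn uniformly from [-spread, spread] (assumed distribution)."""
    return list(d) + [rng.uniform(-spread, spread) for _ in range(num_phantom)]

def extend_query_vector(q, num_phantom, rng):
    """Append 0/1 flags that randomly switch each phantom term on or off."""
    return list(q) + [rng.randint(0, 1) for _ in range(num_phantom)]

rng = random.Random(7)
d = [0.4, 0.0, 0.6]
q = [1.0, 0.0, 0.5]

exact = dot(d, q)
blinded = dot(extend_index_vector(d, 4, 0.05, rng), extend_query_vector(q, 4, rng))
print(exact, blinded)
```

A larger `spread` hides the exact score better but can reorder documents with close scores, which is the accuracy versus privacy trade-off described above; setting it to zero recovers the exact score.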
CONCLUSION AND FUTURE WORK
In this paper, a secure, efficient and dynamic search scheme is proposed, which
supports not only accurate multi-keyword ranked search but also the dynamic
deletion and insertion of documents. We construct a special keyword balanced
binary tree as the index, and propose a “Greedy Depth-first Search” algorithm to
obtain better efficiency than linear search. In addition, the search process can
be parallelized to further reduce the time cost. The security of the scheme is
protected against two threat models by using the secure kNN algorithm.
Experimental results demonstrate the efficiency of our proposed scheme. There
are still many challenging problems in symmetric SE schemes. In the proposed
scheme, the data owner is responsible for generating the update information and
sending it to the cloud server. Thus, the data owner needs to store the
unencrypted index tree and the information necessary to recalculate the IDF
values. Such an active data owner may not be very suitable for the cloud
computing model. It could be a meaningful but difficult future work to design a
dynamic searchable encryption scheme whose update operation can be completed by
the cloud server alone, while retaining the ability to support multi-keyword
ranked search. In addition, like most works on searchable encryption, our scheme
mainly considers the threat from the cloud server. In fact, there are many
security challenges in a multi-user scheme. First, all the users usually hold
the same secret key for trapdoor generation in a symmetric SE scheme. In this
case, user revocation is a big challenge. If a user needs to be revoked in this
scheme, we must rebuild the index and distribute new secret keys to all the
authorized users. Second, symmetric SE schemes usually assume that all the data
users are trustworthy. This is not practical, and a dishonest data user can
cause many security problems. For example, a dishonest data user may search the
documents and distribute the decrypted documents to unauthorized parties.
Moreover, a dishonest data user may distribute his/her secret keys to
unauthorized parties. In future work, we will try to improve the SE scheme to
handle these challenging problems.
REFERENCES
[1] K. Ren, C. Wang, Q. Wang et al., “Security challenges for the public cloud,”
IEEE Internet Computing, vol. 16, no. 1, pp. 69–73, 2012.
[2] S. Kamara and K. Lauter, “Cryptographic cloud storage,” in Financial
Cryptography and Data Security. Springer, 2010, pp. 136–149.
[3] C. Gentry, “A fully homomorphic encryption scheme,” Ph.D. dissertation,
Stanford University, 2009.
[4] O. Goldreich and R. Ostrovsky, “Software protection and simulation on
oblivious RAMs,” Journal of the ACM (JACM), vol. 43, no. 3, pp. 431–473, 1996.
[5] D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, “Public key
encryption with keyword search,” in Advances in Cryptology-Eurocrypt 2004.
Springer, 2004, pp. 506–522.
[6] D. Boneh, E. Kushilevitz, R. Ostrovsky, and W. E. Skeith III, “Public key
encryption that allows PIR queries,” in Advances in Cryptology-CRYPTO 2007.
Springer, 2007, pp. 50–67.
[7] D. X. Song, D. Wagner, and A. Perrig, “Practical techniques for searches on
encrypted data,” in Security and Privacy, 2000. S&P 2000. Proceedings. 2000
IEEE Symposium on. IEEE, 2000, pp. 44–55.
[8] E.-J. Goh et al., “Secure indexes,” IACR Cryptology ePrint Archive, vol. 2003,
p. 216, 2003.
[9] Y.-C. Chang and M. Mitzenmacher, “Privacy preserving keyword searches on
remote encrypted data,” in Proceedings of the Third International Conference on
Applied Cryptography and Network Security. Springer-Verlag, 2005, pp. 442–455.
[10] R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric
encryption: improved definitions and efficient constructions,” in Proceedings of
the 13th ACM Conference on Computer and Communications Security. ACM, 2006,
pp. 79–88.
[11] J. Li, Q. Wang, C. Wang, N. Cao, K. Ren, and W. Lou, “Fuzzy keyword
search over encrypted data in cloud computing,” in INFOCOM, 2010 Proceedings
IEEE. IEEE, 2010, pp. 1–5.

More Related Content

DOCX
SECURE AUDITING AND DEDUPLICATING DATA IN CLOUD
DOCX
A SECURE AND DYNAMIC MULTI-KEYWORD RANKED SEARCH SCHEME OVER ENCRYPTED CLOUD...
PDF
Ijarcet vol-2-issue-3-881-883
PPTX
Final 1st
PPTX
Privacy preserving multi-keyword ranked search over encrypted cloud data
PDF
Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...
PDF
Retrieving Secure Data from Cloud Using OTP
DOC
A secure and dynamic multi keyword ranked search scheme over encrypted cloud ...
SECURE AUDITING AND DEDUPLICATING DATA IN CLOUD
A SECURE AND DYNAMIC MULTI-KEYWORD RANKED SEARCH SCHEME OVER ENCRYPTED CLOUD...
Ijarcet vol-2-issue-3-881-883
Final 1st
Privacy preserving multi-keyword ranked search over encrypted cloud data
Survey on Privacy- Preserving Multi keyword Ranked Search over Encrypted Clou...
Retrieving Secure Data from Cloud Using OTP
A secure and dynamic multi keyword ranked search scheme over encrypted cloud ...

What's hot (20)

DOCX
Enabling secure and efficient ranked keyword
PDF
Enabling Secure and Efficient Multi-Keyword Ranked Search Scheme
PDF
Enabling efficient multi keyword ranked
DOCX
A secure and dynamic multi keyword ranked
PPTX
Privacy preserving multi-keyword ranked search over encrypted cloud data
PDF
A Survey on Efficient Privacy-Preserving Ranked Keyword Search Method
PDF
A secure and dynamic multi keyword ranked search scheme over encrypted cloud ...
PDF
Enabling efficient multi keyword ranked search over encrypted mobile cloud da...
DOCX
privacy preserving multi keyword ranked search over encrypted cloud data
DOCX
Privacy preserving multi-keyword ranked search over encrypted cloud data
PDF
Paper id 28201425
PDF
IRJET- A Survey on Searching of Keyword on Encrypted Data in Cloud using ...
PPT
Privacy preserving multi-keyword ranked search over encrypted cloud data
DOC
Privacy preserving multi-keyword ranked search over encrypted cloud data
PDF
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
PDF
A secure and dynamic multi keyword ranked search scheme over encrypted cloud ...
PPTX
Privacy preservingmulti-keywordrankedsearchoverencryptedclouddata-14090213203...
PDF
Enabling efficient multi keyword ranked search over encrypted mobile cloud da...
PDF
Efficient Privacy Preserving Clustering Based Multi Keyword Search
PDF
IRJET- Proficient Recovery Over Records using Encryption in Cloud Computing
Enabling secure and efficient ranked keyword
Enabling Secure and Efficient Multi-Keyword Ranked Search Scheme
Enabling efficient multi keyword ranked
A secure and dynamic multi keyword ranked
Privacy preserving multi-keyword ranked search over encrypted cloud data
A Survey on Efficient Privacy-Preserving Ranked Keyword Search Method
A secure and dynamic multi keyword ranked search scheme over encrypted cloud ...
Enabling efficient multi keyword ranked search over encrypted mobile cloud da...
privacy preserving multi keyword ranked search over encrypted cloud data
Privacy preserving multi-keyword ranked search over encrypted cloud data
Paper id 28201425
IRJET- A Survey on Searching of Keyword on Encrypted Data in Cloud using ...
Privacy preserving multi-keyword ranked search over encrypted cloud data
Privacy preserving multi-keyword ranked search over encrypted cloud data
IRJET- Empower Syntactic Exploration Based on Conceptual Graph using Searchab...
A secure and dynamic multi keyword ranked search scheme over encrypted cloud ...
Privacy preservingmulti-keywordrankedsearchoverencryptedclouddata-14090213203...
Enabling efficient multi keyword ranked search over encrypted mobile cloud da...
Efficient Privacy Preserving Clustering Based Multi Keyword Search
IRJET- Proficient Recovery Over Records using Encryption in Cloud Computing
Ad

Viewers also liked (18)

PDF
Kult_InternationalIntro
DOCX
Cách Giảm Vòng Bụng đảm Bảo
PPTX
Leviatan hobbes
PPTX
08. Заходняя Беларусь пад уладай Польшчы
PPTX
01. Западная Европа в 1918 – 1923 гг.
PDF
4 ambiente vbe
PPTX
Neoclasico
PPT
реєстрація зно 2017
PDF
Orlando2
PPTX
Open Resources for Open Presentations
PDF
Assist avia solutions
PDF
Pdf proyecto
PDF
Environment_Francesco_Amendola_2015_combined_CV_PH_0433626007 (3)
DOCX
CAHR: A Contextually Adaptive Home-Based Rehabilitation Framework
PPTX
Top 8 health administrator resume samples
PPTX
rumah sehat
PDF
Leeway PWP Catalog Excerpt
PDF
LA COGNOTECNOLOGIA
Kult_InternationalIntro
Cách Giảm Vòng Bụng đảm Bảo
Leviatan hobbes
08. Заходняя Беларусь пад уладай Польшчы
01. Западная Европа в 1918 – 1923 гг.
4 ambiente vbe
Neoclasico
реєстрація зно 2017
Orlando2
Open Resources for Open Presentations
Assist avia solutions
Pdf proyecto
Environment_Francesco_Amendola_2015_combined_CV_PH_0433626007 (3)
CAHR: A Contextually Adaptive Home-Based Rehabilitation Framework
Top 8 health administrator resume samples
rumah sehat
Leeway PWP Catalog Excerpt
LA COGNOTECNOLOGIA
Ad

Similar to A SECURE AND DYNAMIC MULTI-KEYWORD RANKED SEARCH SCHEME OVER ENCRYPTED CLOUD DATA (20)

PDF
A Secure and Dynamic Multi-keyword Ranked Search Scheme over Encrypted Cloud ...
PDF
A Secure and Dynamic Multi-keyword Ranked Search Scheme over Encrypted Cloud ...
PDF
A secure and dynamic multi
PDF
Secure Multi-Keyword Top-K Retrieval Over Encrypted Cloud Data Using Homomorp...
PDF
O01761103112
PDF
Enabling Fine-grained Multi-keyword Search Supporting Classified Sub-dictiona...
PDF
Enabling Fine-grained Multi-keyword Search Supporting Classified Sub-dictiona...
PDF
Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...
PDF
2016 BE Final year Projects in chennai - 1 Crore Projects
DOCX
Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data
PDF
IRJET- An Efficient Ranked Multi-Keyword Search for Multiple Data Owners Over...
PDF
A Secure and Dynamic Multi Keyword Ranked Search over Encrypted Cloud Data
PDF
IRJET- Data Mining - Secure Keyword Manager
PDF
An proficient and Confidentiality-Preserving Multi- Keyword Ranked Search ove...
PDF
An efficeient privacy preserving ranked keyword search
DOCX
Towards secure multi keyword top-k retrieval over encrypted cloud data
PDF
E FFICIENT D ATA R ETRIEVAL F ROM C LOUD S TORAGE U SING D ATA M ININ...
PDF
J017547478
DOC
A secure and dynamic multi keyword ranked search scheme over encrypted cloud ...
A Secure and Dynamic Multi-keyword Ranked Search Scheme over Encrypted Cloud ...
A Secure and Dynamic Multi-keyword Ranked Search Scheme over Encrypted Cloud ...
A secure and dynamic multi
Secure Multi-Keyword Top-K Retrieval Over Encrypted Cloud Data Using Homomorp...
O01761103112
Enabling Fine-grained Multi-keyword Search Supporting Classified Sub-dictiona...
Enabling Fine-grained Multi-keyword Search Supporting Classified Sub-dictiona...
Implementation and Review Paper of Secure and Dynamic Multi Keyword Search in...
2016 BE Final year Projects in chennai - 1 Crore Projects
Privacy-Preserving Multi-keyword Top-k Similarity Search Over Encrypted Data
IRJET- An Efficient Ranked Multi-Keyword Search for Multiple Data Owners Over...
A Secure and Dynamic Multi Keyword Ranked Search over Encrypted Cloud Data
IRJET- Data Mining - Secure Keyword Manager
An proficient and Confidentiality-Preserving Multi- Keyword Ranked Search ove...
An efficeient privacy preserving ranked keyword search
Towards secure multi keyword top-k retrieval over encrypted cloud data
E FFICIENT D ATA R ETRIEVAL F ROM C LOUD S TORAGE U SING D ATA M ININ...
J017547478
A secure and dynamic multi keyword ranked search scheme over encrypted cloud ...

Recently uploaded (20)

PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PDF
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
PDF
Complications of Minimal Access Surgery at WLH
PDF
RMMM.pdf make it easy to upload and study
PDF
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH 9 GLOBAL SUCCESS - CẢ NĂM - BÁM SÁT FORM Đ...
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
PPTX
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PPTX
PPH.pptx obstetrics and gynecology in nursing
PPTX
human mycosis Human fungal infections are called human mycosis..pptx
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PDF
Sports Quiz easy sports quiz sports quiz
PDF
TR - Agricultural Crops Production NC III.pdf
PPTX
master seminar digital applications in india
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PPTX
Final Presentation General Medicine 03-08-2024.pptx
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
Complications of Minimal Access Surgery at WLH
RMMM.pdf make it easy to upload and study
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH 9 GLOBAL SUCCESS - CẢ NĂM - BÁM SÁT FORM Đ...
Pharmacology of Heart Failure /Pharmacotherapy of CHF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
Renaissance Architecture: A Journey from Faith to Humanism
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
Module 4: Burden of Disease Tutorial Slides S2 2025
PPH.pptx obstetrics and gynecology in nursing
human mycosis Human fungal infections are called human mycosis..pptx
FourierSeries-QuestionsWithAnswers(Part-A).pdf
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
Sports Quiz easy sports quiz sports quiz
TR - Agricultural Crops Production NC III.pdf
master seminar digital applications in india
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
Final Presentation General Medicine 03-08-2024.pptx

A SECURE AND DYNAMIC MULTI-KEYWORD RANKED SEARCH SCHEME OVER ENCRYPTED CLOUD DATA

  • 1. A SECURE AND DYNAMIC MULTI-KEYWORD RANKED SEARCH SCHEME OVER ENCRYPTED CLOUD DATA Abstract—Due to the increasing popularity of cloud computing, more and more data owners are motivated to outsource their data to cloud servers for great convenience and reduced cost in data management. However, sensitive data should be encrypted before outsourcing for privacy requirements, which obsoletes data utilization like keyword-based document retrieval. In this paper, we present a secure multi-keyword ranked search scheme over encrypted cloud data, which simultaneously supports dynamic update operations like deletion and insertion of documents. Specifically, the vector space model and the widely-used TF_IDF model are combined in the index construction and query generation. We construct a special tree-based index structure and propose a “Greedy Depth-first Search” algorithm to provide efficient multi-keyword ranked search. The secure kNN algorithm is utilized to encrypt the index and query vectors, and meanwhile ensure accurate relevance score calculation between encrypted index and query vectors. In order to resist statistical attacks, phantom terms are added to the index vector for blinding search results . Due to the use of our special tree-based index structure, the proposed scheme can achieve sub-linear search time and deal with the deletion and insertion of documents flexibly. Extensive experiments are conducted to demonstrate the efficiency of the proposed scheme.
  • 2. EXISTING SYSTEM: Searchable encryption schemes enable the clients to store the encrypted data to the cloud and execute keyword search over ciphertext domain. Due to different cryptography primitives, searchable encryption schemes can be constructed using public key based cryptography or symmetric key based cryptography. Song et al. proposed the first symmetric searchable encryption (SSE) scheme, and the search time of their scheme is linear to the size of the data collection. Goh proposed formal security definitions for SSE and designed a scheme based on Bloom filter. The search time of Goh’s scheme is O (n), where n is the cardinality of the document collection. Curtmola et al. proposed two schemes (SSE-1 and SSE-2) which achieve the optimal search time. Their SSE-1 scheme is secure against chosen-keyword attacks (CKA1) and SSE-2 is secure against adaptive chosen- keyword attacks (CKA2) schemes, which are very simple in terms of functionality. Afterward, abundant works have been proposed under different threat models to achieve various search functionality, such as single keyword search, similarity search , multi-keyword boolean search , ranked search and multi-keyword ranked search, etc. Multi-keyword boolean search allows the users to input multiple query keywords to request suitable documents. Among these works, conjunctive keyword search schemes only return the documents that contain all of the query keywords. Disjunctive keyword search schemes return all of the documents that contain a subset of the query keywords. Predicate search schemes are proposed to support
  • 3. both conjunctive and disjunctive search. All these multikeyword search schemes retrieve search results based on the existence of keywords, which cannot provide acceptable result ranking functionality. Ranked search an enable quick search of the most relevant data. Sending back only the top-k most relevant documents can effectively decrease network traffic. Some early works have realized the ranked search using order-preserving techniques, but they are designed only for single keyword search. Cao et al. realized the first privacy-preserving multi-keyword ranked search scheme, in which documents and queries are represented as vectors of dictionary size. With the “coordinate matching”, the documents are ranked according to the number of matched query keywords. However, Cao et al.’s scheme does not consider the importance of the different keywords, and thus is not accurate enough. In addition, the search efficiency of the scheme is linear with the cardinality of document collection. Sun et al. presented a secure multi-keyword search scheme that supports similarity-based ranking. The authors constructed a searchable index tree based on vector space model and adopted cosine measure together with TF×IDF to provide ranking results PROPOSED SYSTEM: Due to the special structure of our tree-based index, the proposed search scheme can flexibly achieve sub-linear search time and deal with the deletion and insertion of documents. The secure kNN algorithm is utilized to encrypt the index and query vectors, and meanwhile ensure accurate relevance score calculation between
  • 4. encrypted index and query vectors. To resist different attacks in different threat models, we construct two secure search schemes: the basic dynamic multi-keyword ranked search (BDMRS) scheme in the known ciphertext model, and the enhanced dynamic multi-keyword ranked search (EDMRS) scheme in the known background model. Our contributions are summarized as follows: 1) We design a searchable encryption scheme that supports both accurate multi-keyword ranked search and flexible dynamic operations on the document collection. 2) Due to the special structure of our tree-based index, the search complexity of the proposed scheme is fundamentally kept logarithmic. In practice, the proposed scheme can achieve higher search efficiency by executing our “Greedy Depth-first Search” algorithm. Moreover, parallel search can be flexibly performed to further reduce the time cost of the search process. Module 1: The System and Threat Models. The system model in this paper involves three different entities: data owner, data user and cloud server. The data owner has a collection of documents F = {f1, f2, ..., fn} that he wants to outsource to the cloud server in encrypted form while still keeping the capability to
  • 5. search on them for effective utilization. In our scheme, the data owner first builds a secure searchable tree index I from the document collection F, and then generates an encrypted document collection C for F. Afterwards, the data owner outsources the encrypted collection C and the secure index I to the cloud server, and securely distributes the key information for trapdoor generation (including keyword IDF values) and document decryption to the authorized data users. Besides, the data owner is responsible for updating his documents stored in the cloud server. When updating, the data owner generates the update information locally and sends it to the server. Data users are those authorized to access the documents of the data owner. With t query keywords, an authorized user can generate a trapdoor TD according to the search control mechanisms to fetch k encrypted documents from the cloud server. Then, the data user can decrypt the documents with the shared secret key. The cloud server stores the encrypted document collection C and the encrypted searchable tree index I for the data owner. Upon receiving the trapdoor TD from the data user, the cloud server executes the search over the index tree I, and finally returns the corresponding collection of top-k ranked encrypted documents. Besides, upon receiving the update information from the data owner, the server needs to update the index I and document collection C according to the received information. The cloud server in the proposed scheme is considered “honest-but-curious”, an assumption employed by many works on secure cloud data search. Specifically, the cloud server honestly and correctly executes instructions in the designated protocol.
  • 6. Meanwhile, it is curious to infer and analyze the received data, which helps it acquire additional information. Depending on what information the cloud server knows, we adopt the two threat models proposed by Cao et al. Known Ciphertext Model. In this model, the cloud server only knows the encrypted document collection C, the searchable index tree I, and the search trapdoor TD submitted by the authorized user. That is to say, the cloud server can conduct a ciphertext-only attack (COA) in this model. Known Background Model. Compared with the known ciphertext model, the cloud server in this stronger model is equipped with more knowledge, such as the term frequency (TF) statistics of the document collection. This statistical information records how many documents there are for each term frequency of a specific keyword in the whole document collection, as shown in Fig. 2, and could be used as the keyword identity. Equipped with such statistical information, the cloud server can conduct a TF statistical attack to deduce or even identify certain keywords by analyzing the histogram and value range of the corresponding frequency distributions. Module 2: Design Goals.
  • 7. To enable secure, efficient, accurate and dynamic multi-keyword ranked search over outsourced encrypted cloud data under the above models, our system has the following design goals. Dynamic: The proposed scheme is designed to provide not only multi-keyword query and accurate result ranking, but also dynamic update on document collections. Search Efficiency: The scheme aims to achieve sub-linear search efficiency by exploiting a special tree-based index and an efficient search algorithm. Privacy-preserving: The scheme is designed to prevent the cloud server from learning additional information about the document collection, the index tree, and the query. The specific privacy requirements are summarized as follows: 1) Index Confidentiality and Query Confidentiality: The underlying plaintext information, including keywords in the index and query, TF values of keywords stored in the index, and IDF values of query keywords, should be protected from the cloud server; 2) Trapdoor Unlinkability: The cloud server should not be able to determine whether two encrypted queries (trapdoors) are generated from the same search request; 3) Keyword Privacy: The cloud server should not be able to identify a specific keyword in the query, index or document collection by analyzing statistical information such as term frequency. Note that our proposed scheme is not designed to protect the access pattern, i.e., the sequence of returned documents.
  • 8. Module 3: Search Process of UDMRS Scheme. The search process of the UDMRS scheme is a recursive procedure on the tree, named the “Greedy Depth-first Search (GDFS)” algorithm. We construct a result list denoted as RList, whose elements are defined as ⟨RScore, FID⟩. Here, RScore is the relevance score of the document fFID to the query, which is calculated according to Formula (1). The RList stores the k accessed documents with the largest relevance scores to the query. The elements of the list are ranked in descending order of RScore, and are updated as the search proceeds. The algorithm uses the following notation. RScore(Du, Q) – The function to calculate the relevance score for query vector Q and index vector Du stored in node u. kthscore – The smallest relevance score in the current RList, which is initialized as 0. hchild – The child node of a tree node with the higher relevance score. lchild – The child node of a tree node with the lower relevance score. Since the largest possible relevance score of the documents rooted at node u can be predicted, only a part of the nodes in the tree are accessed during the search process.
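The GDFS procedure described above can be illustrated with a small unencrypted sketch. It is not the authors' implementation. It assumes that RScore is the inner product of the node vector and the query vector, that each internal node stores the element-wise maximum of its children's vectors (so its score bounds every document below it when query weights are non-negative), and that the tree is built by simple pairwise merging. The names `Node`, `build_tree`, `rscore` and `gdfs` are hypothetical.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    vec: List[float]                 # index vector Du stored in node u
    fid: Optional[int] = None        # document id; None for internal nodes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(docs: List[Tuple[int, List[float]]]) -> Node:
    """Pairwise-merge leaves bottom-up (assumed construction, not from the source)."""
    level = [Node(vec=list(v), fid=fid) for fid, v in docs]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                a, b = level[i], level[i + 1]
                # Assumption: the parent keeps the element-wise max of its children.
                vec = [max(x, y) for x, y in zip(a.vec, b.vec)]
                nxt.append(Node(vec=vec, left=a, right=b))
            else:
                nxt.append(level[i])  # odd node is carried up unchanged
        level = nxt
    return level[0]


def rscore(vec: List[float], query: List[float]) -> float:
    """Relevance score, assumed here to be the inner product Du . Q."""
    return sum(d * q for d, q in zip(vec, query))


def gdfs(root: Node, query: List[float], k: int) -> List[Tuple[float, int]]:
    """Return up to k (RScore, FID) pairs in descending order of RScore."""
    if k <= 0:
        return []
    rlist: List[Tuple[float, int]] = []  # min-heap standing in for RList

    def kthscore() -> float:
        # Smallest score in a full RList; 0 until the list holds k entries.
        return rlist[0][0] if len(rlist) == k else 0.0

    def visit(u: Node) -> None:
        score = rscore(u.vec, query)
        if score <= kthscore():
            return  # nothing under u can enter RList, so prune this subtree
        if u.fid is not None:  # leaf: insert, evicting the current smallest if full
            if len(rlist) == k:
                heapq.heapreplace(rlist, (score, u.fid))
            else:
                heapq.heappush(rlist, (score, u.fid))
            return
        # Visit hchild (higher score) before lchild (lower score).
        children = sorted((u.left, u.right),
                          key=lambda c: rscore(c.vec, query), reverse=True)
        for child in children:
            visit(child)

    visit(root)
    return sorted(rlist, reverse=True)


docs = [(1, [0.5, 0.0, 0.5]), (2, [0.0, 1.0, 0.0]),
        (3, [0.7, 0.7, 0.0]), (4, [0.0, 0.0, 1.0])]
top2 = gdfs(build_tree(docs), query=[1.0, 0.0, 0.5], k=2)
print(top2)
```

A min-heap is used in place of a sorted list so that kthscore is always available at the heap root. Visiting the higher-scoring child first tends to raise kthscore early, which lets more subtrees be skipped.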
  • 9. Module 4: BDMRS Scheme. Based on the UDMRS scheme, we construct the basic dynamic multi-keyword ranked search (BDMRS) scheme by using the secure kNN algorithm. The BDMRS scheme is designed to achieve the goal of privacy preservation in the known ciphertext model, and its four algorithms are described as follows: • SK ← Setup() Initially, the data owner generates the secret key set SK, including 1) a randomly generated m-bit vector S, where m is equal to the cardinality of the dictionary, and 2) two (m×m) invertible matrices M1 and M2. Namely, SK = {S, M1, M2}. • I ← GenIndex(F, SK) First, the unencrypted index tree T is built on F by using T ← BuildIndexTree(F). Second, the data owner generates two random vectors {Du′, Du′′} for the index vector Du in each node u, according to the secret vector S. Specifically, if S[i] = 0, Du′[i] and Du′′[i] are both set equal to Du[i]; if S[i] = 1, Du′[i] and Du′′[i] are set to two random values whose sum equals Du[i]. Finally, the encrypted index tree I is built, where the node u stores two encrypted index vectors Iu = {M1^T Du′, M2^T Du′′}. • TD ← GenTrapdoor(Wq, SK) With keyword set Wq, the unencrypted query vector Q of length m is generated. If wi ∈ Wq, Q[i] stores the normalized IDF
  • 10. value of wi; otherwise Q[i] is set to 0. Similarly, the query vector Q is split into two random vectors Q′ and Q′′. The difference is that if S[i] = 0, Q′[i] and Q′′[i] are set to two random values whose sum equals Q[i]; otherwise Q′[i] and Q′′[i] are both set equal to Q[i]. Finally, the algorithm returns the trapdoor TD = {M1^(-1) Q′, M2^(-1) Q′′}. • RelevanceScore ← SRScore(Iu, TD) With the trapdoor TD, the cloud server computes the relevance score of node u in the index tree I to the query. Note that the relevance score calculated from encrypted vectors is equal to that from unencrypted vectors, since (M1^T Du′)·(M1^(-1) Q′) + (M2^T Du′′)·(M2^(-1) Q′′) = Du′·Q′ + Du′′·Q′′ = Du·Q. Module 5: EDMRS Scheme. The BDMRS scheme can protect Index Confidentiality and Query Confidentiality in the known ciphertext model. However, the cloud server is able to link identical search requests by tracking the path of visited nodes. In addition, in the known background model, it is possible for the cloud server to identify a keyword, because the normalized TF distribution of the keyword can be exactly obtained from the final calculated relevance scores. The primary cause is that the relevance score calculated from Iu and TD is exactly equal to that from Du and Q. A heuristic method to further improve the security is to break such exact equality. Thus, we
  • 11. can introduce some tunable randomness to disturb the relevance score calculation. In addition, to suit different users’ preferences for more accurate ranked results or better protected keyword privacy, the randomness is made adjustable. CONCLUSION AND FUTURE WORK: In this paper, a secure, efficient and dynamic search scheme is proposed, which supports not only accurate multi-keyword ranked search but also the dynamic deletion and insertion of documents. We construct a special keyword balanced binary tree as the index, and propose a “Greedy Depth-first Search” algorithm to obtain better efficiency than linear search. In addition, the search process can be parallelized to further reduce the time cost. The security of the scheme is protected against two threat models by using the secure kNN algorithm. Experimental results demonstrate the efficiency of our proposed scheme. There are still many challenging problems in symmetric SE schemes. In the proposed scheme, the data owner is responsible for generating the update information and sending it to the cloud server. Thus, the data owner needs to store the unencrypted index tree and the information necessary to recalculate the IDF values. Such an active data owner may not be very suitable for the cloud computing model. It could be a meaningful but difficult future work to design a dynamic searchable encryption scheme whose update operation can be completed by the cloud server alone, while preserving the ability to support multi-keyword ranked search. In addition, like most works on searchable encryption, our
  • 12. scheme mainly considers the challenge from the cloud server. Actually, there are many security challenges in a multi-user scheme. Firstly, all the users usually keep the same secret key for trapdoor generation in a symmetric SE scheme. In this case, the revocation of a user is a big challenge. If a user needs to be revoked in this scheme, we need to rebuild the index and distribute new secret keys to all the authorized users. Secondly, symmetric SE schemes usually assume that all the data users are trustworthy. This is not practical, and a dishonest data user will lead to many security problems. For example, a dishonest data user may search the documents and distribute the decrypted documents to unauthorized parties. Furthermore, a dishonest data user may distribute his/her secret keys to unauthorized parties. In future work, we will try to improve the SE scheme to handle these challenging problems. REFERENCES [1] K. Ren, C. Wang, Q. Wang et al., “Security challenges for the public cloud,” IEEE Internet Computing, vol. 16, no. 1, pp. 69–73, 2012. [2] S. Kamara and K. Lauter, “Cryptographic cloud storage,” in Financial Cryptography and Data Security. Springer, 2010, pp. 136–149. [3] C. Gentry, “A fully homomorphic encryption scheme,” Ph.D. dissertation, Stanford University, 2009. [4] O. Goldreich and R. Ostrovsky, “Software protection and simulation on oblivious rams,” Journal of the ACM (JACM), vol. 43, no. 3, pp. 431–473, 1996.
  • 13. [5] D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano, “Public key encryption with keyword search,” in Advances in Cryptology-Eurocrypt 2004. Springer, 2004, pp. 506–522. [6] D. Boneh, E. Kushilevitz, R. Ostrovsky, and W. E. Skeith III, “Public key encryption that allows pir queries,” in Advances in Cryptology-CRYPTO 2007. Springer, 2007, pp. 50–67. [7] D. X. Song, D. Wagner, and A. Perrig, “Practical techniques for searches on encrypted data,” in Security and Privacy, 2000. S&P 2000. Proceedings. 2000 IEEE Symposium on. IEEE, 2000, pp. 44–55. [8] E.-J. Goh et al., “Secure indexes,” IACR Cryptology ePrint Archive, vol. 2003, p. 216, 2003. [9] Y.-C. Chang and M. Mitzenmacher, “Privacy preserving keyword searches on remote encrypted data,” in Proceedings of the Third international conference on Applied Cryptography and Network Security. Springer-Verlag, 2005, pp. 442–455. [10] R. Curtmola, J. Garay, S. Kamara, and R. Ostrovsky, “Searchable symmetric encryption: improved definitions and efficient constructions,” in Proceedings of the 13th ACM conference on Computer and communications security. ACM, 2006, pp. 79–88. [11] J. Li, Q. Wang, C. Wang, N. Cao, K. Ren, and W. Lou, “Fuzzy keyword search over encrypted data in cloud computing,” in INFOCOM, 2010 Proceedings IEEE. IEEE, 2010, pp. 1–5.