Secure Genomic Computation through Site-Wise Encryption.

Zhao Y, Wang X, Tang H - AMIA Jt Summits Transl Sci Proc (2015)

Bottom Line: To address this issue, here we present a site-wise encryption approach to encrypt whole human genome sequences, which can be subject to secure searching of genomic signatures on public clouds. We implemented this method within the Hadoop framework, and tested it on the case of searching disease markers retrieved from the ClinVar database against patients' genomic sequences. The secure search runs only one order of magnitude slower than the simple search without encryption, indicating our method is ready to be used for secure genomic computation on public clouds.


Affiliation: School of Informatics and Computing, Indiana University, Bloomington, IN 47405.

ABSTRACT
Commercial clouds provide on-demand IT services for big-data analysis, which have become an attractive option for users who have no access to comparable infrastructure. However, utilizing these services for human genome analysis is highly risky, as human genomic data contains identifiable information of human individuals and their disease susceptibility. Therefore, currently, no computation on personal human genomic data is conducted on public clouds. To address this issue, here we present a site-wise encryption approach to encrypt whole human genome sequences, which can be subject to secure searching of genomic signatures on public clouds. We implemented this method within the Hadoop framework, and tested it on the case of searching disease markers retrieved from the ClinVar database against patients' genomic sequences. The secure search runs only one order of magnitude slower than the simple search without encryption, indicating our method is ready to be used for secure genomic computation on public clouds.



Figure 2: The randomize-verification scheme has three steps: (a) reference records encryption; (b) query records encryption; and (c) search on public clouds.

A problem with the basic scheme is that an attacker who monitors queries can observe how often each query occurs, which is a potential information leak. Figure 2 illustrates an enhanced approach that withstands this threat, called the randomize-verify scheme. The idea is to make the ciphertexts of different instances of the same query look different, so that the attacker cannot accumulate the frequency of any specific query. Specifically, to encrypt a reference record g (Fig 2a), the data owner generates an n-bit random string (rsR) using RD, in addition to the key KAES. The owner first applies E with the key KAES to g to obtain the encrypted record, and then computes the ciphertext by XORing the encrypted record with the n-bit random string concatenated with the first m bits of its hash value, H0:m(rsR), computed by the hash function H. That is, the ciphertext is E(KAES, g) XOR (rsR ▪ H0:m(rsR)), where ▪ denotes the concatenation operation on strings.
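
The reference-record step (Fig 2a) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: E is replaced by HMAC-SHA256 as a deterministic keyed stand-in (it is not AES and cannot be decrypted), H is assumed to be SHA-256, RD is assumed to be the operating system's random source, and n = m = 128 bits is chosen so that the mask matches the 256-bit stand-in output. The function names, the sizes, and the example record format are all hypothetical. The final check follows directly from the XOR formula above; how step (c) of the scheme uses it is not described in this excerpt.

```python
import hashlib
import hmac
import secrets

N_BYTES = 16  # n = 128 bits of randomness (assumed size)
M_BYTES = 16  # m = 128 bits of hash (assumed size)


def E(key: bytes, record: bytes) -> bytes:
    """Deterministic keyed stand-in for E with K_AES (assumption: HMAC, not AES)."""
    return hmac.new(key, record, hashlib.sha256).digest()  # 32 bytes = n + m bits


def hash_prefix(data: bytes) -> bytes:
    """H0:m(data): the first m bits of H(data), with SHA-256 assumed for H."""
    return hashlib.sha256(data).digest()[:M_BYTES]


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_reference(key: bytes, record: bytes) -> bytes:
    """Compute E(K, g) XOR (rsR || H0:m(rsR)) with a fresh random rsR."""
    rs = secrets.token_bytes(N_BYTES)   # rsR, drawn from the random source RD
    mask = rs + hash_prefix(rs)         # rsR concatenated with H0:m(rsR)
    return xor(E(key, record), mask)


def mask_is_valid(key: bytes, record: bytes, ciphertext: bytes) -> bool:
    """Strip E(K, g) from the ciphertext and check that the rest is rs || H0:m(rs)."""
    mask = xor(ciphertext, E(key, record))
    rs, tag = mask[:N_BYTES], mask[N_BYTES:]
    return hmac.compare_digest(tag, hash_prefix(rs))


if __name__ == "__main__":
    key = secrets.token_bytes(32)
    g = b"chr1:12345:A>G"               # hypothetical record encoding
    c1 = encrypt_reference(key, g)
    c2 = encrypt_reference(key, g)
    print(c1 != c2)                     # same record, different ciphertexts
    print(mask_is_valid(key, g, c1))    # mask verifies against the right record
```

Because rsR is drawn fresh for every call, two encryptions of the same record differ except with negligible probability, which is the property the scheme relies on to hide query frequencies. The embedded hash prefix lets a party that can recompute the encrypted record confirm that the remaining mask is well formed.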

