Abstract

Secure storage of genomic data is of great and increasing importance. The scientific community's improving ability to interpret individuals' genetic material and the growing size of genetic databases are aggravating the potential consequences of data breaches. The prevalent use of passwords to generate encryption keys thus poses an especially serious problem when applied to genetic data. Weak passwords can jeopardize genetic data in the short term; given the multi-decade lifespan of genetic data, even strong passwords combined with conventional encryption can eventually lead to compromise. We present a tool called GenoGuard that provides strong protection for genomic data both today and in the long term. GenoGuard incorporates a new theoretical framework for encryption called honey encryption (HE), which can provide information-theoretic confidentiality guarantees for encrypted data. Previously proposed HE schemes, however, can be applied only to messages drawn from a very restricted set of probability distributions. GenoGuard addresses the open problem of applying HE techniques to the highly non-uniform probability distributions that characterize sequences of genetic data. An adversary attacking GenoGuard can exhaustively guess keys or passwords and attempt decryption by brute force. We prove, however, that decryption under any key yields a plausible genome sequence; GenoGuard thus offers an information-theoretic security guarantee against message-recovery attacks. We also explore attacks that use side information. Finally, we present an efficient, parallelized software implementation of GenoGuard.
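
The property that decryption under any key yields a plausible sequence can be illustrated with a toy sketch. The abstract does not describe GenoGuard's actual encoder, so everything below is an assumption made for illustration only: the genotype frequency table `FREQS` is hypothetical, positions are treated as independent, each position is encoded as one seed byte, and the helper names (`encode`, `decode`, `keystream`, `encrypt`, `decrypt`) are invented here. The idea shown is the general honey-encryption pattern: map the message to a uniformly distributed seed, then encrypt the seed with a password-derived pad.

```python
import hashlib
import secrets

# Hypothetical per-position genotype frequencies (0, 1 or 2 copies of the minor allele).
# Each row sums to 256, so a single seed byte covers its intervals exactly.
FREQS = [(200, 50, 6), (128, 100, 28), (230, 24, 2), (90, 120, 46)]

def encode(genotypes):
    """Map each genotype to a uniformly random byte inside that genotype's interval."""
    seed = bytearray()
    for g, row in zip(genotypes, FREQS):
        low = sum(row[:g])  # start of the interval assigned to genotype g
        seed.append(low + secrets.randbelow(row[g]))
    return bytes(seed)

def decode(seed):
    """Map any byte back to the genotype whose interval contains it."""
    out = []
    for b, row in zip(seed, FREQS):
        g, high = 0, row[0]
        while b >= high:
            g += 1
            high += row[g]
        out.append(g)
    return out

def keystream(password, salt, n):
    """Derive an n-byte pad from the password (PBKDF2 chosen only for illustration)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000, dklen=n)

def encrypt(genotypes, password):
    salt = secrets.token_bytes(16)
    seed = encode(genotypes)
    pad = keystream(password, salt, len(seed))
    return salt, bytes(s ^ p for s, p in zip(seed, pad))

def decrypt(salt, ciphertext, password):
    pad = keystream(password, salt, len(ciphertext))
    return decode(bytes(c ^ p for c, p in zip(ciphertext, pad)))

genome = [0, 1, 0, 2]
salt, ct = encrypt(genome, "correct horse")
right = decrypt(salt, ct, "correct horse")   # recovers the original genotypes
decoy = decrypt(salt, ct, "wrong password")  # still a well-formed genotype sequence
```

In this sketch a wrong password produces a different pad, hence an effectively random seed, and every seed decodes to some valid genotype sequence whose likelihood follows the assumed frequencies. An attacker who tries many passwords therefore sees only plausible-looking outputs and has no decryption error to filter on.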
