Privacy-Preserving Processing of Raw Genomic Data

Ayday, Erman; Raisaro, Jean Louis; Hengartner, Urs; Molyneaux, Adam; Hubaux, Jean-Pierre

report

2013

Geneticists prefer to store patients' aligned, raw genomic data in addition to their variant calls (a compact, summarized form of the raw data), mainly because bioinformatic algorithms and sequencing platforms are still immature. We therefore propose a system that protects the privacy of aligned, raw genomic data. A patient's raw genomic data includes millions of short reads, each consisting of between 100 and 400 nucleotides (genomic letters). We propose storing these short reads at a biobank in encrypted form. The proposed scheme enables a medical unit (e.g., a pharmaceutical company or a hospital) to privately retrieve a subset of a patient's short reads (those that include a specific range of nucleotides, determined by the type of genetic test) without revealing the nature of the genetic test to the biobank. Furthermore, the scheme lets the biobank mask particular parts of the retrieved short reads if (i) those parts fall outside the requested range, or (ii) the patient does not consent to their release (e.g., parts revealing sensitive diseases). We evaluate the proposed scheme to quantify the unauthorized genomic data leakage it prevents. Finally, we implement the proposed scheme and assess its practicality.
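The two masking conditions in the abstract can be illustrated with a minimal plaintext sketch. This is not the scheme from the report, which works on encrypted short reads and hides the requested range from the biobank; it only shows which nucleotides would end up hidden. The names `ShortRead`, `overlaps`, `mask_read`, and `retrieve`, the half-open coordinate convention, and the use of `N` as the mask symbol are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShortRead:
    start: int   # reference position of the first nucleotide (assumed 0-based)
    bases: str   # nucleotide letters of the read


def overlaps(read, lo, hi):
    """True if the read covers at least one position in the half-open range [lo, hi)."""
    return read.start < hi and read.start + len(read.bases) > lo


def mask_read(read, lo, hi, withheld=()):
    """Replace with 'N' every nucleotide that is (i) outside [lo, hi) or
    (ii) inside a region the patient has not consented to release."""
    out = []
    for offset, base in enumerate(read.bases):
        pos = read.start + offset
        in_range = lo <= pos < hi
        consented = not any(a <= pos < b for a, b in withheld)
        out.append(base if in_range and consented else "N")
    return "".join(out)


def retrieve(reads, lo, hi, withheld=()):
    """Return the masked form of every read that overlaps the requested range."""
    return [mask_read(r, lo, hi, withheld) for r in reads if overlaps(r, lo, hi)]


# Toy data: three reads, a requested range of [4, 10), and one withheld position.
reads = [ShortRead(0, "ACGTACGT"), ShortRead(6, "TTGGCCAA"), ShortRead(20, "AAAA")]
result = retrieve(reads, 4, 10, withheld=[(8, 9)])
```

In this toy example the third read is not returned because it does not overlap the requested range, and the two returned reads expose only the positions that are both requested and consented to.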

Name: DPM_13_tech_report.pdf

Type: Postprint

Version: Accepted version

Access type: openaccess

Size: 685.41 KB

Format: Adobe PDF

Checksum (MD5): a463dbfd14189a83200851b096ffcb76