Genomics and Privacy: Implications of the New Reality of Closed Data for the Field

Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the...

Bibliographic Details
Main Authors: Greenbaum, Dov; Sboner, Andrea; Mu, Xinmeng Jasmine; Gerstein, Mark
Format: Online Article Text
Language: English
Published: Public Library of Science 2011
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3228779/
https://www.ncbi.nlm.nih.gov/pubmed/22144881
http://dx.doi.org/10.1371/journal.pcbi.1002278
author Greenbaum, Dov
Sboner, Andrea
Mu, Xinmeng Jasmine
Gerstein, Mark
collection PubMed
description Open source and open data have been driving forces in bioinformatics in the past. However, privacy concerns may soon change the landscape, limiting future access to important data sets, including personal genomics data. Here we survey this situation in some detail, describing, in particular, how the large scale of the data from personal genomic sequencing makes it especially hard to share data, exacerbating the privacy problem. We also go over various aspects of genomic privacy: first, there is basic identifiability of subjects having their genome sequenced. However, even for individuals who have consented to be identified, there is the prospect of very detailed future characterization of their genotype, which, unanticipated at the time of their consent, may be more personal and invasive than the release of their medical records. We go over various computational strategies for dealing with the issue of genomic privacy. One can “slice” and reformat datasets to allow them to be partially shared while securing the most private variants. This is particularly applicable to functional genomics information, which can be largely processed without variant information. For handling the most private data there are a number of legal and technological approaches—for example, modifying the informed consent procedure to acknowledge that privacy cannot be guaranteed, and/or employing a secure cloud computing environment. Cloud computing in particular may allow access to the data in a more controlled fashion than the current practice of downloading and computing on large datasets. Furthermore, it may be particularly advantageous for small labs, given that the burden of many privacy issues falls disproportionately on them in comparison to large corporations and genome centers. Finally, we discuss how education of future genetics researchers will be important, with curriculums emphasizing privacy and data security. 
However, teaching personal genomics with identifiable subjects in the university setting will, in turn, create additional privacy issues and social conundrums.
format Online
Article
Text
id pubmed-3228779
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3228779
2011-12-05
PLoS Comput Biol
Review
Public Library of Science
2011-12-01
Text
en
Greenbaum et al.
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Genomics and Privacy: Implications of the New Reality of Closed Data for the Field
topic Review