Crypt4GH: a file format standard enabling native access to encrypted data

Bibliographic Details
Main authors: Senf, Alexander; Davies, Robert; Haziza, Frédéric; Marshall, John; Troncoso-Pastoriza, Juan; Hofmann, Oliver; Keane, Thomas M.
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Applications Notes
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8522443/
https://www.ncbi.nlm.nih.gov/pubmed/33543751
http://dx.doi.org/10.1093/bioinformatics/btab087
_version_ 1784585086857904128
author Senf, Alexander
Davies, Robert
Haziza, Frédéric
Marshall, John
Troncoso-Pastoriza, Juan
Hofmann, Oliver
Keane, Thomas M.
author_facet Senf, Alexander
Davies, Robert
Haziza, Frédéric
Marshall, John
Troncoso-Pastoriza, Juan
Hofmann, Oliver
Keane, Thomas M.
author_sort Senf, Alexander
collection PubMed
description MOTIVATION: The majority of genome analysis tools and pipelines require data to be decrypted for access. This potentially leaves sensitive genetic data exposed, either because the unencrypted data is not removed after analysis, or because the data leaves traces on the permanent storage medium. RESULTS: We defined a file container specification enabling direct byte-level compatible random access to encrypted genetic data stored in community standards such as SAM/BAM/CRAM/VCF/BCF. By standardizing this format, we show how it can be added as a native file format to genomic libraries, enabling direct analysis of encrypted data without the need to create a decrypted copy. AVAILABILITY AND IMPLEMENTATION: The Crypt4GH specification can be found at: http://samtools.github.io/hts-specs/crypt4gh.pdf. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8522443
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8522443 2021-10-19 Crypt4GH: a file format standard enabling native access to encrypted data Senf, Alexander Davies, Robert Haziza, Frédéric Marshall, John Troncoso-Pastoriza, Juan Hofmann, Oliver Keane, Thomas M. Bioinformatics Applications Notes MOTIVATION: The majority of genome analysis tools and pipelines require data to be decrypted for access. This potentially leaves sensitive genetic data exposed, either because the unencrypted data is not removed after analysis, or because the data leaves traces on the permanent storage medium. RESULTS: We defined a file container specification enabling direct byte-level compatible random access to encrypted genetic data stored in community standards such as SAM/BAM/CRAM/VCF/BCF. By standardizing this format, we show how it can be added as a native file format to genomic libraries, enabling direct analysis of encrypted data without the need to create a decrypted copy. AVAILABILITY AND IMPLEMENTATION: The Crypt4GH specification can be found at: http://samtools.github.io/hts-specs/crypt4gh.pdf. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-02-05 /pmc/articles/PMC8522443/ /pubmed/33543751 http://dx.doi.org/10.1093/bioinformatics/btab087 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Applications Notes
Senf, Alexander
Davies, Robert
Haziza, Frédéric
Marshall, John
Troncoso-Pastoriza, Juan
Hofmann, Oliver
Keane, Thomas M.
Crypt4GH: a file format standard enabling native access to encrypted data
title Crypt4GH: a file format standard enabling native access to encrypted data
title_full Crypt4GH: a file format standard enabling native access to encrypted data
title_fullStr Crypt4GH: a file format standard enabling native access to encrypted data
title_full_unstemmed Crypt4GH: a file format standard enabling native access to encrypted data
title_short Crypt4GH: a file format standard enabling native access to encrypted data
title_sort crypt4gh: a file format standard enabling native access to encrypted data
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8522443/
https://www.ncbi.nlm.nih.gov/pubmed/33543751
http://dx.doi.org/10.1093/bioinformatics/btab087