Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects
Hundreds of thousands of human whole genome sequencing (WGS) datasets will be generated over the next few years. These data are more valuable in aggregate: joint analysis of genomes from many sources increases sample size and statistical power. A central challenge for joint analysis is that differen...
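Main authors: | Regier, Allison A.; Farjoun, Yossi; Larson, David E.; Krasheninina, Olga; Kang, Hyun Min; Howrigan, Daniel P.; Chen, Bo-Juen; Kher, Manisha; Banks, Eric; Ames, Darren C.; English, Adam C.; Li, Heng; Xing, Jinchuan; Zhang, Yeting; Matise, Tara; Abecasis, Goncalo R.; Salerno, Will; Zody, Michael C.; Neale, Benjamin M.; Hall, Ira M. |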
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6168605/ https://www.ncbi.nlm.nih.gov/pubmed/30279509 http://dx.doi.org/10.1038/s41467-018-06159-4 |
_version_ | 1783360385357709312 |
---|---|
author | Regier, Allison A.; Farjoun, Yossi; Larson, David E.; Krasheninina, Olga; Kang, Hyun Min; Howrigan, Daniel P.; Chen, Bo-Juen; Kher, Manisha; Banks, Eric; Ames, Darren C.; English, Adam C.; Li, Heng; Xing, Jinchuan; Zhang, Yeting; Matise, Tara; Abecasis, Goncalo R.; Salerno, Will; Zody, Michael C.; Neale, Benjamin M.; Hall, Ira M. |
author_facet | Regier, Allison A.; Farjoun, Yossi; Larson, David E.; Krasheninina, Olga; Kang, Hyun Min; Howrigan, Daniel P.; Chen, Bo-Juen; Kher, Manisha; Banks, Eric; Ames, Darren C.; English, Adam C.; Li, Heng; Xing, Jinchuan; Zhang, Yeting; Matise, Tara; Abecasis, Goncalo R.; Salerno, Will; Zody, Michael C.; Neale, Benjamin M.; Hall, Ira M. |
author_sort | Regier, Allison A. |
collection | PubMed |
description | Hundreds of thousands of human whole genome sequencing (WGS) datasets will be generated over the next few years. These data are more valuable in aggregate: joint analysis of genomes from many sources increases sample size and statistical power. A central challenge for joint analysis is that different WGS data processing pipelines cause substantial differences in variant calling in combined datasets, necessitating computationally expensive reprocessing. This approach is no longer tenable given the scale of current studies and data volumes. Here, we define WGS data processing standards that allow different groups to produce functionally equivalent (FE) results, yet still innovate on data processing pipelines. We present initial FE pipelines developed at five genome centers and show that they yield similar variant calling results and produce significantly less variability than sequencing replicates. This work alleviates a key technical bottleneck for genome aggregation and helps lay the foundation for community-wide human genetics studies. |
format | Online Article Text |
id | pubmed-6168605 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-61686052018-10-04 Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects Regier, Allison A. Farjoun, Yossi Larson, David E. Krasheninina, Olga Kang, Hyun Min Howrigan, Daniel P. Chen, Bo-Juen Kher, Manisha Banks, Eric Ames, Darren C. English, Adam C. Li, Heng Xing, Jinchuan Zhang, Yeting Matise, Tara Abecasis, Goncalo R. Salerno, Will Zody, Michael C. Neale, Benjamin M. Hall, Ira M. Nat Commun Article Hundreds of thousands of human whole genome sequencing (WGS) datasets will be generated over the next few years. These data are more valuable in aggregate: joint analysis of genomes from many sources increases sample size and statistical power. A central challenge for joint analysis is that different WGS data processing pipelines cause substantial differences in variant calling in combined datasets, necessitating computationally expensive reprocessing. This approach is no longer tenable given the scale of current studies and data volumes. Here, we define WGS data processing standards that allow different groups to produce functionally equivalent (FE) results, yet still innovate on data processing pipelines. We present initial FE pipelines developed at five genome centers and show that they yield similar variant calling results and produce significantly less variability than sequencing replicates. This work alleviates a key technical bottleneck for genome aggregation and helps lay the foundation for community-wide human genetics studies. 
Nature Publishing Group UK 2018-10-02 /pmc/articles/PMC6168605/ /pubmed/30279509 http://dx.doi.org/10.1038/s41467-018-06159-4 Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Regier, Allison A. Farjoun, Yossi Larson, David E. Krasheninina, Olga Kang, Hyun Min Howrigan, Daniel P. Chen, Bo-Juen Kher, Manisha Banks, Eric Ames, Darren C. English, Adam C. Li, Heng Xing, Jinchuan Zhang, Yeting Matise, Tara Abecasis, Goncalo R. Salerno, Will Zody, Michael C. Neale, Benjamin M. Hall, Ira M. Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects |
title | Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects |
title_full | Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects |
title_fullStr | Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects |
title_full_unstemmed | Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects |
title_short | Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects |
title_sort | functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6168605/ https://www.ncbi.nlm.nih.gov/pubmed/30279509 http://dx.doi.org/10.1038/s41467-018-06159-4 |