
A framework for the detection of de novo mutations in family-based sequencing data

Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants a...


Bibliographic Details
Main authors: Francioli, Laurent C, Cretu-Stancu, Mircea, Garimella, Kiran V, Fromer, Menachem, Kloosterman, Wigard P, Samocha, Kaitlin E, Neale, Benjamin M, Daly, Mark J, Banks, Eric, DePristo, Mark A, de Bakker, Paul IW
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5255947/
https://www.ncbi.nlm.nih.gov/pubmed/27876817
http://dx.doi.org/10.1038/ejhg.2016.147
_version_ 1782498619881947136
author Francioli, Laurent C
Cretu-Stancu, Mircea
Garimella, Kiran V
Fromer, Menachem
Kloosterman, Wigard P
Samocha, Kaitlin E
Neale, Benjamin M
Daly, Mark J
Banks, Eric
DePristo, Mark A
de Bakker, Paul IW
author_sort Francioli, Laurent C
collection PubMed
description Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate. Candidate de novo mutations (DNMs) are reported along with their posterior probability, providing a systematic way to prioritize them for validation. Our tool is integrated in the Genome Analysis Toolkit and can be used together with the ReadBackedPhasing module to infer the parental origin of DNMs based on phase-informative reads. Using simulated data, we show that PBT outperforms existing tools, especially in low coverage data and on the X chromosome. We further show that PBT displays high validation rates on empirical parent-offspring sequencing data for whole-exome data from 104 trios and X-chromosome data from 249 parent-offspring families. Finally, we demonstrate an association between father's age at conception and the number of DNMs in female offspring's X chromosome, consistent with previous literature reports.
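The joint-probability idea in the description can be illustrated with a minimal sketch. This is not the PBT implementation: it assumes a biallelic autosomal site, a uniform prior over parental genotypes (so that prior cancels out), and a simplified transmission model in which the mutation rate `mu` is spread evenly over the child genotypes that Mendelian inheritance rules out. The function names `mendel`, `transmission` and `denovo_posterior` are hypothetical and chosen only for illustration.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of alternate alleles at a biallelic site


def mendel(m, f, c):
    """P(child genotype c | parental genotypes m, f) with no mutation."""
    total = 0.0
    for a_m, a_f in product((0, 1), repeat=2):  # allele passed on by each parent
        if a_m + a_f == c:
            p_m = m / 2 if a_m else 1 - m / 2
            p_f = f / 2 if a_f else 1 - f / 2
            total += p_m * p_f
    return total


def transmission(m, f, c, mu):
    """Mendelian transmission with an assumed mutation prior mu.

    Assumption: mu is split evenly over the child genotypes that are
    impossible under Mendelian inheritance for this parental pair.
    """
    impossible = [g for g in GENOTYPES if mendel(m, f, g) == 0.0]
    if not impossible:
        return mendel(m, f, c)
    if c in impossible:
        return mu / len(impossible)
    return (1 - mu) * mendel(m, f, c)


def denovo_posterior(lik_mother, lik_father, lik_child, mu=1e-8):
    """Posterior probability that the trio configuration is non-Mendelian.

    Each lik_* argument is a length-3 sequence of P(reads | genotype).
    Parental genotype priors are assumed uniform and therefore omitted.
    """
    denovo = total = 0.0
    for m, f, c in product(GENOTYPES, repeat=3):
        joint = (lik_mother[m] * lik_father[f] * lik_child[c]
                 * transmission(m, f, c, mu))
        total += joint
        if mendel(m, f, c) == 0.0:
            denovo += joint
    return denovo / total


if __name__ == "__main__":
    hom_ref = [1.0, 1e-12, 1e-12]
    het = [1e-12, 1.0, 1e-12]
    # Both parents confidently homozygous reference, child confidently heterozygous.
    print(denovo_posterior(hom_ref, hom_ref, het))
    # Mother heterozygous: the child's genotype is explained by inheritance.
    print(denovo_posterior(het, hom_ref, het))
```

Ranking candidate sites by the returned value mirrors the prioritization by posterior probability mentioned in the description; a realistic model would also need population genotype priors and separate handling of the X chromosome.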
format Online
Article
Text
id pubmed-5255947
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5255947 2017-02-03 A framework for the detection of de novo mutations in family-based sequencing data. Eur J Hum Genet. Article.
Nature Publishing Group 2017-02 2016-11-23 /pmc/articles/PMC5255947/ /pubmed/27876817 http://dx.doi.org/10.1038/ejhg.2016.147 Text en Copyright © 2017 The Author(s). This work is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/). The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material.
title A framework for the detection of de novo mutations in family-based sequencing data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5255947/
https://www.ncbi.nlm.nih.gov/pubmed/27876817
http://dx.doi.org/10.1038/ejhg.2016.147