
Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data

Sequencing family DNA samples provides an attractive alternative to population-based designs to identify rare variants associated with human disease due to the enrichment of causal variants in pedigrees. Previous studies showed that genotype calling accuracy can be improved by modeling family relate...


Bibliographic Details
Main Authors: Li, Bingshan, Wei, Qiang, Zhan, Xiaowei, Zhong, Xue, Chen, Wei, Li, Chun, Haines, Jonathan
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4456389/
https://www.ncbi.nlm.nih.gov/pubmed/26043085
http://dx.doi.org/10.1371/journal.pgen.1005271
author Li, Bingshan
Wei, Qiang
Zhan, Xiaowei
Zhong, Xue
Chen, Wei
Li, Chun
Haines, Jonathan
collection PubMed
description Sequencing family DNA samples provides an attractive alternative to population-based designs for identifying rare variants associated with human disease, owing to the enrichment of causal variants in pedigrees. Previous studies showed that modeling family relatedness improves genotype calling accuracy over standard calling algorithms. Current family-based variant calling methods use sequencing data on single variants and ignore the identity-by-descent (IBD) sharing along the genome. In this study we describe a new computational framework to accurately estimate IBD sharing from the sequencing data and to use the inferred IBD among family members to jointly call genotypes in pedigrees. Through simulations and application to real data, we showed that IBD can be reliably estimated across the genome, even at very low coverage (e.g., 2X), and that genotype accuracy can be dramatically improved. Moreover, the improvement is more pronounced for low-frequency variants, especially at low to intermediate coverage (e.g., 10X to 20X), making our approach effective for studying rare variants in cost-effective whole-genome sequencing of pedigrees. We hope that our tool will be useful to the research community for identifying rare variants for human disease through family-based sequencing.
format Online
Article
Text
id pubmed-4456389
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4456389 2015-06-09 PLoS Genet Research Article Public Library of Science 2015-06-04 /pmc/articles/PMC4456389/ /pubmed/26043085 http://dx.doi.org/10.1371/journal.pgen.1005271 Text en © 2015 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4456389/
https://www.ncbi.nlm.nih.gov/pubmed/26043085
http://dx.doi.org/10.1371/journal.pgen.1005271
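
The description above says that inferred IBD sharing among family members is used to call genotypes jointly, and that the gain is largest at low coverage. The record gives no algorithmic detail, so the following is only a minimal illustrative sketch of that general idea, not the authors' framework. It assumes a single biallelic site, a sibling pair whose IBD state (0, 1, or 2) is already known, Hardy-Weinberg proportions, and a fixed per-read error rate. All function names and parameters are hypothetical.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of alternate alleles carried


def allele_prob(allele, alt_freq):
    """Probability of drawing a ref (0) or alt (1) allele from the population."""
    return alt_freq if allele == 1 else 1.0 - alt_freq


def hwe_prior(g, alt_freq):
    """Hardy-Weinberg genotype prior for one unrelated individual."""
    ref_freq = 1.0 - alt_freq
    return (ref_freq ** 2, 2 * alt_freq * ref_freq, alt_freq ** 2)[g]


def sib_pair_prior(g1, g2, ibd, alt_freq):
    """Joint genotype prior for two siblings given their IBD state at the site."""
    if ibd == 0:  # no shared alleles: genotypes are independent
        return hwe_prior(g1, alt_freq) * hwe_prior(g2, alt_freq)
    if ibd == 2:  # both alleles shared: genotypes must be identical
        return hwe_prior(g1, alt_freq) if g1 == g2 else 0.0
    if ibd != 1:
        raise ValueError("ibd must be 0, 1, or 2")
    # One shared allele plus one independent allele per sibling.
    total = 0.0
    for shared in (0, 1):
        other1, other2 = g1 - shared, g2 - shared
        if other1 in (0, 1) and other2 in (0, 1):
            total += (allele_prob(shared, alt_freq)
                      * allele_prob(other1, alt_freq)
                      * allele_prob(other2, alt_freq))
    return total


def read_likelihood(ref_reads, alt_reads, g, error=0.01):
    """Binomial read likelihood up to a constant factor (assumed error model)."""
    p_alt = (error, 0.5, 1.0 - error)[g]
    return p_alt ** alt_reads * (1.0 - p_alt) ** ref_reads


def call_sib_pair(reads1, reads2, ibd, alt_freq, error=0.01):
    """Return the most probable (g1, g2) pair; reads are (ref, alt) counts."""
    def score(pair):
        g1, g2 = pair
        return (sib_pair_prior(g1, g2, ibd, alt_freq)
                * read_likelihood(*reads1, g1, error)
                * read_likelihood(*reads2, g2, error))
    return max(product(GENOTYPES, repeat=2), key=score)


well_covered = (10, 10)  # sibling 1: 10 ref reads, 10 alt reads
low_coverage = (2, 0)    # sibling 2: only 2 reads, both ref
print(call_sib_pair(well_covered, low_coverage, ibd=0, alt_freq=0.01))
print(call_sib_pair(well_covered, low_coverage, ibd=2, alt_freq=0.01))
```

In this toy input the second sibling has only two reads, both reference. Treated as unrelated (IBD 0), that sibling is called homozygous reference. When the pair is known to share both alleles (IBD 2), the well-covered sibling's heterozygous evidence carries over and the joint call becomes heterozygous for both. A real pipeline would also have to estimate the IBD state from the data and handle general pedigrees, which this sketch does not attempt.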