Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data
Sequencing family DNA samples provides an attractive alternative to population-based designs to identify rare variants associated with human disease due to the enrichment of causal variants in pedigrees. Previous studies showed that genotype calling accuracy can be improved by modeling family relate...
Main Authors: | Li, Bingshan; Wei, Qiang; Zhan, Xiaowei; Zhong, Xue; Chen, Wei; Li, Chun; Haines, Jonathan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2015 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4456389/ https://www.ncbi.nlm.nih.gov/pubmed/26043085 http://dx.doi.org/10.1371/journal.pgen.1005271 |
_version_ | 1782374832678109184 |
---|---|
author | Li, Bingshan; Wei, Qiang; Zhan, Xiaowei; Zhong, Xue; Chen, Wei; Li, Chun; Haines, Jonathan |
author_sort | Li, Bingshan |
collection | PubMed |
description | Sequencing family DNA samples provides an attractive alternative to population-based designs to identify rare variants associated with human disease due to the enrichment of causal variants in pedigrees. Previous studies showed that genotype calling accuracy can be improved by modeling family relatedness compared to standard calling algorithms. Current family-based variant calling methods use sequencing data on single variants and ignore the identity-by-descent (IBD) sharing along the genome. In this study we describe a new computational framework to accurately estimate the IBD sharing from the sequencing data, and to utilize the inferred IBD among family members to jointly call genotypes in pedigrees. Through simulations and application to real data, we showed that IBD can be reliably estimated across the genome, even at very low coverage (e.g. 2X), and genotype accuracy can be dramatically improved. Moreover, the improvement is more pronounced for variants with low frequencies, especially at low to intermediate coverage (e.g. 10X to 20X), making our approach effective in studying rare variants in cost-effective whole genome sequencing in pedigrees. We hope that our tool is useful to the research community for identifying rare variants for human disease through family-based sequencing. |
format | Online Article Text |
id | pubmed-4456389 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4456389 2015-06-09 PLoS Genet Research Article Public Library of Science 2015-06-04 /pmc/articles/PMC4456389/ /pubmed/26043085 http://dx.doi.org/10.1371/journal.pgen.1005271 Text en © 2015 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data |
topic | Research Article |