Linkage Disequilibrium Estimation in Low Coverage High-Throughput Sequencing Data

High-throughput sequencing methods that multiplex a large number of individuals have provided a cost-effective approach for discovering genome-wide genetic variation in large populations. These sequencing methods are increasingly being utilized in population genetic studies across a diverse range of...

Bibliographic Details
Main authors: Bilton, Timothy P., McEwan, John C., Clarke, Shannon M., Brauning, Rudiger, van Stijn, Tracey C., Rowe, Suzanne J., Dodds, Ken G.
Format: Online Article Text
Language: English
Published: Genetics Society of America 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5972415/
https://www.ncbi.nlm.nih.gov/pubmed/29588288
http://dx.doi.org/10.1534/genetics.118.300831
author Bilton, Timothy P.
McEwan, John C.
Clarke, Shannon M.
Brauning, Rudiger
van Stijn, Tracey C.
Rowe, Suzanne J.
Dodds, Ken G.
collection PubMed
description High-throughput sequencing methods that multiplex a large number of individuals have provided a cost-effective approach for discovering genome-wide genetic variation in large populations. These sequencing methods are increasingly being utilized in population genetic studies across a diverse range of species. Two side-effects of these methods, however, are (1) sequencing errors and (2) heterozygous genotypes called as homozygous due to only one allele at a particular locus being sequenced, which occurs when the sequencing depth is insufficient. Both of these errors have a profound effect on the estimation of linkage disequilibrium (LD) and, if not taken into account, lead to inaccurate estimates. We developed a new likelihood method, GUS-LD, to estimate pairwise linkage disequilibrium using low coverage sequencing data that accounts for undercalled heterozygous genotypes and sequencing errors. Our findings show that accurate estimates were obtained using GUS-LD, whereas underestimation of LD results if no adjustment is made for the errors.
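The second error described above, heterozygous genotypes being called as homozygous at low depth, can be illustrated with a short calculation. The sketch below is not GUS-LD and is not taken from the article. It assumes a simplified model in which each read samples one of the two alleles independently with probability 1/2 and sequencing error is ignored. The function names `prob_het_called_hom` and `expected_observed_het` are hypothetical.

```python
def prob_het_called_hom(depth: int) -> float:
    """Probability that a true heterozygote yields reads from only one allele.

    Illustrative assumption: each read samples either allele with
    probability 1/2, independently, and there is no sequencing error.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1 (depth 0 is a missing genotype)")
    # Either all reads carry the first allele or all carry the second.
    return 2 * 0.5 ** depth


def expected_observed_het(true_het: float, depth: int) -> float:
    """Expected fraction of individuals that appear heterozygous at this depth."""
    return true_het * (1.0 - prob_het_called_hom(depth))


if __name__ == "__main__":
    # Print the undercalling probability and apparent heterozygosity by depth.
    for d in (1, 2, 4, 8):
        print(d, prob_het_called_hom(d), expected_observed_het(0.5, d))
```

Under these assumptions, a heterozygote with a single read always looks homozygous, and the chance of undercalling halves with each additional read. This is why unadjusted genotype calls from low coverage data lose apparent heterozygosity.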
format Online
Article
Text
id pubmed-5972415
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-5972415 2018-05-30 Linkage Disequilibrium Estimation in Low Coverage High-Throughput Sequencing Data Bilton, Timothy P. McEwan, John C. Clarke, Shannon M. Brauning, Rudiger van Stijn, Tracey C. Rowe, Suzanne J. Dodds, Ken G. Genetics Investigations High-throughput sequencing methods that multiplex a large number of individuals have provided a cost-effective approach for discovering genome-wide genetic variation in large populations. These sequencing methods are increasingly being utilized in population genetic studies across a diverse range of species. Two side-effects of these methods, however, are (1) sequencing errors and (2) heterozygous genotypes called as homozygous due to only one allele at a particular locus being sequenced, which occurs when the sequencing depth is insufficient. Both of these errors have a profound effect on the estimation of linkage disequilibrium (LD) and, if not taken into account, lead to inaccurate estimates. We developed a new likelihood method, GUS-LD, to estimate pairwise linkage disequilibrium using low coverage sequencing data that accounts for undercalled heterozygous genotypes and sequencing errors. Our findings show that accurate estimates were obtained using GUS-LD, whereas underestimation of LD results if no adjustment is made for the errors. Genetics Society of America 2018-06 2018-03-26 /pmc/articles/PMC5972415/ /pubmed/29588288 http://dx.doi.org/10.1534/genetics.118.300831 Text en Copyright © 2018 Bilton et al. Available freely online through the author-supported open access option. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Linkage Disequilibrium Estimation in Low Coverage High-Throughput Sequencing Data
topic Investigations
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5972415/
https://www.ncbi.nlm.nih.gov/pubmed/29588288
http://dx.doi.org/10.1534/genetics.118.300831