An Efficient Estimator of the Mutation Parameter and Analysis of Polymorphism from the 1000 Genomes Project

Bibliographic Details
Main author: Fu, Yunxin
Format: Online Article Text
Language: English
Published: MDPI 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4198917/
https://www.ncbi.nlm.nih.gov/pubmed/25055200
http://dx.doi.org/10.3390/genes5030561
author Fu, Yunxin
author_facet Fu, Yunxin
author_sort Fu, Yunxin
collection PubMed
description The mutation parameter θ is fundamental and ubiquitous in the analysis of population samples of DNA sequences. This paper presents a new, highly efficient estimator of θ that utilizes the phylogenetic information among distinct alleles in a sample of DNA sequences. The new estimator, called Allelic BLUE, is derived from a generalized linear model of the mutations in the allelic genealogy. The estimator is not only highly accurate but also computationally efficient, which makes it particularly useful for estimating θ for large samples as well as for a large number of cases, as arises when analyzing sequence data from a large genome project such as the 1000 Genomes Project. Simulation shows that Allelic BLUE is nearly unbiased, with variance nearly as small as the minimum achievable variance, and that in many situations it can be hundreds- or thousands-fold more efficient than a previous method, which was already quite efficient compared to other approaches. One useful feature of the new estimator is its applicability to collections of distinct alleles without detailed frequencies. The utility of the new estimator is demonstrated by analyzing the pattern of θ in the data from the 1000 Genomes Project.
format Online
Article
Text
id pubmed-4198917
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-4198917 2014-10-16 An Efficient Estimator of the Mutation Parameter and Analysis of Polymorphism from the 1000 Genomes Project Fu, Yunxin Genes (Basel) Article MDPI 2014-07-22 /pmc/articles/PMC4198917/ /pubmed/25055200 http://dx.doi.org/10.3390/genes5030561 Text en © 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title An Efficient Estimator of the Mutation Parameter and Analysis of Polymorphism from the 1000 Genomes Project
title_sort efficient estimator of the mutation parameter and analysis of polymorphism from the 1000 genomes project
topic Article