An Efficient Estimator of the Mutation Parameter and Analysis of Polymorphism from the 1000 Genomes Project
The mutation parameter θ is fundamental and ubiquitous in the analysis of population samples of DNA sequences. This paper presents a new highly efficient estimator of θ by utilizing the phylogenetic information among distinct alleles in a sample of DNA sequences. The new estimator, called Allelic BL...
Main author: | Fu, Yunxin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4198917/ https://www.ncbi.nlm.nih.gov/pubmed/25055200 http://dx.doi.org/10.3390/genes5030561 |
_version_ | 1782339818289627136 |
---|---|
author | Fu, Yunxin |
collection | PubMed |
description | The mutation parameter θ is fundamental and ubiquitous in the analysis of population samples of DNA sequences. This paper presents a new, highly efficient estimator of θ that uses the phylogenetic information among distinct alleles in a sample of DNA sequences. The new estimator, called Allelic BLUE, is derived from a generalized linear model of the mutations in the allelic genealogy. This estimator is not only highly accurate but also computationally efficient, which makes it particularly useful for estimating θ for large samples, as well as for a large number of cases, as arises when analyzing sequence data from a large genome project such as the 1000 Genomes Project. Simulation shows that Allelic BLUE is nearly unbiased, with variance nearly as small as the minimum achievable variance, and in many situations it can be hundreds or thousands of times more efficient than a previous method, which was already quite efficient compared to other approaches. One useful feature of the new estimator is its applicability to collections of distinct alleles without detailed frequencies. The utility of the new estimator is demonstrated by analyzing the pattern of θ in the data from the 1000 Genomes Project. |
format | Online Article Text |
id | pubmed-4198917 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-4198917 2014-10-16 Fu, Yunxin Genes (Basel) Article MDPI 2014-07-22 /pmc/articles/PMC4198917/ /pubmed/25055200 http://dx.doi.org/10.3390/genes5030561 Text en © 2014 by the authors; licensee MDPI, Basel, Switzerland. http://creativecommons.org/licenses/by/3.0/ This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/). |
title | An Efficient Estimator of the Mutation Parameter and Analysis of Polymorphism from the 1000 Genomes Project |
title_sort | efficient estimator of the mutation parameter and analysis of polymorphism from the 1000 genomes project |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4198917/ https://www.ncbi.nlm.nih.gov/pubmed/25055200 http://dx.doi.org/10.3390/genes5030561 |