Toward Reducing Phylostratigraphic Errors and Biases
Phylostratigraphy is a method for estimating gene age, usually applied to large numbers of genes in order to detect nonrandom age-distributions of gene properties that could shed light on mechanisms of gene origination and evolution. However, phylostratigraphy underestimates gene age with a nonnegli...
Main Authors: | Moyers, Bryan A; Zhang, Jianzhi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6105108/ https://www.ncbi.nlm.nih.gov/pubmed/30060201 http://dx.doi.org/10.1093/gbe/evy161 |
_version_ | 1783349602371502080 |
---|---|
author | Moyers, Bryan A; Zhang, Jianzhi |
author_facet | Moyers, Bryan A; Zhang, Jianzhi |
author_sort | Moyers, Bryan A |
collection | PubMed |
description | Phylostratigraphy is a method for estimating gene age, usually applied to large numbers of genes in order to detect nonrandom age-distributions of gene properties that could shed light on mechanisms of gene origination and evolution. However, phylostratigraphy underestimates gene age with a nonnegligible probability. The underestimation is severer for genes with certain properties, creating spurious age distributions of these properties and those correlated with these properties. Here we explore three strategies to reduce phylostratigraphic error/bias. First, we test several alternative homology detection methods (PSIBLAST, HMMER, PHMMER, OMA, and GLAM2Scan) in phylostratigraphy, but fail to find any that noticeably outperforms the commonly used BLASTP. Second, using machine learning, we look for predictors of error-prone genes to exclude from phylostratigraphy, but cannot identify reliable predictors. Finally, we remove from phylostratigraphic analysis genes exhibiting errors in simulation, which by definition minimizes error/bias if the simulation is sufficiently realistic. Using this last approach, we show that some previously reported phylostratigraphic trends (e.g., younger proteins tend to evolve more rapidly and be shorter) disappear or even reverse, reconfirming the necessity of controlling phylostratigraphic error/bias. Taken together, our analyses demonstrate that phylostratigraphic errors/biases are refractory to several potential solutions but can be controlled at least partially by the exclusion of error-prone genes identified via realistic simulations. These results are expected to stimulate the judicious use of error-aware phylostratigraphy and reevaluation of previous phylostratigraphic findings. |
format | Online Article Text |
id | pubmed-6105108 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6105108 2018-08-27 Toward Reducing Phylostratigraphic Errors and Biases Moyers, Bryan A; Zhang, Jianzhi Genome Biol Evol Research Article
Oxford University Press 2018-07-30 /pmc/articles/PMC6105108/ /pubmed/30060201 http://dx.doi.org/10.1093/gbe/evy161 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Research Article Moyers, Bryan A Zhang, Jianzhi Toward Reducing Phylostratigraphic Errors and Biases |
title | Toward Reducing Phylostratigraphic Errors and Biases |
title_sort | toward reducing phylostratigraphic errors and biases |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6105108/ https://www.ncbi.nlm.nih.gov/pubmed/30060201 http://dx.doi.org/10.1093/gbe/evy161 |
work_keys_str_mv | AT moyersbryana towardreducingphylostratigraphicerrorsandbiases AT zhangjianzhi towardreducingphylostratigraphicerrorsandbiases |
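
The abstract's third strategy (dropping genes whose age is underestimated in simulation before analysing the real data) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the stratum indexing (0 = youngest, higher = older), and the toy hit tables are all assumptions made for illustration, and homology detection itself (e.g., BLASTP) is represented only by precomputed True/False hits.

```python
from typing import Dict, List, Set


def inferred_age(hits_by_stratum: List[bool]) -> int:
    """Return the index of the oldest stratum with a detected homolog.

    Assumed convention: index 0 is the youngest stratum (the focal
    species) and larger indices are progressively older strata.
    """
    age = 0
    for index, hit in enumerate(hits_by_stratum):
        if hit:
            age = index
    return age


def error_prone_genes(
    simulated_hits: Dict[str, List[bool]],
    true_ages: Dict[str, int],
) -> Set[str]:
    """Flag genes whose simulated sequences yield an underestimated age."""
    return {
        gene
        for gene, hits in simulated_hits.items()
        if inferred_age(hits) < true_ages[gene]
    }


def exclude_flagged(real_ages: Dict[str, int], flagged: Set[str]) -> Dict[str, int]:
    """Keep only the genes that showed no error in simulation."""
    return {gene: age for gene, age in real_ages.items() if gene not in flagged}


# Hypothetical toy data: three genes simulated with a known true age of 2.
simulated_hits = {
    "geneA": [True, True, True],   # homolog found in every stratum
    "geneB": [True, True, False],  # oldest homolog missed in simulation
    "geneC": [True, False, True],  # gap, but the oldest stratum is still hit
}
true_ages = {"geneA": 2, "geneB": 2, "geneC": 2}

# Hypothetical ages inferred from the real sequences.
real_ages = {"geneA": 2, "geneB": 0, "geneC": 1}

flagged = error_prone_genes(simulated_hits, true_ages)
kept = exclude_flagged(real_ages, flagged)
print(sorted(flagged))
print(kept)
```

Downstream trend tests (for example, age versus evolutionary rate or protein length) would then be run on `kept` only. As the abstract notes, this only reduces error and bias to the extent that the simulation is sufficiently realistic.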