The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses

Statistical phylogenetic methods are a powerful tool for inferring the evolutionary history of viruses through time and space. The selection of mathematical models and analysis parameters has a major impact on the outcome, and has been relatively well-described in the literature. The preparation of...

Bibliographic Details
Main authors: Vakulenko, Yulia; Deviatkin, Andrei; Lukashev, Alexander
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6893659/
https://www.ncbi.nlm.nih.gov/pubmed/31698764
http://dx.doi.org/10.3390/v11111032
_version_ 1783476250743930880
author Vakulenko, Yulia
Deviatkin, Andrei
Lukashev, Alexander
author_facet Vakulenko, Yulia
Deviatkin, Andrei
Lukashev, Alexander
author_sort Vakulenko, Yulia
collection PubMed
description Statistical phylogenetic methods are a powerful tool for inferring the evolutionary history of viruses through time and space. The selection of mathematical models and analysis parameters has a major impact on the outcome, and has been relatively well-described in the literature. The preparation of a sequence dataset is less formalized, but its impact can be even more profound. This article used simulated datasets of enterovirus sequences to evaluate the effect of sample bias on picornavirus phylogenetic studies. Possible approaches to the reduction of large datasets and their potential for introducing additional artefacts were demonstrated. The most consistent results were obtained using “smart sampling”, which reduced sequence subsets from large studies more than those from smaller ones in order to preserve the rare sequences in a dataset. The effect of sequences with technical or annotation errors in the Bayesian framework was also analyzed. Sequences with about 0.5% sequencing errors or incorrect isolation dates altered by just 5 years could be detected by various approaches, but the efficiency of identification depended upon sequence position in a phylogenetic tree. Even a single erroneous sequence could profoundly destabilize the whole analysis by increasing the variance of the inferred evolutionary parameters.
format Online
Article
Text
id pubmed-6893659
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-68936592019-12-23 The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses Vakulenko, Yulia Deviatkin, Andrei Lukashev, Alexander Viruses Article Statistical phylogenetic methods are a powerful tool for inferring the evolutionary history of viruses through time and space. The selection of mathematical models and analysis parameters has a major impact on the outcome, and has been relatively well-described in the literature. The preparation of a sequence dataset is less formalized, but its impact can be even more profound. This article used simulated datasets of enterovirus sequences to evaluate the effect of sample bias on picornavirus phylogenetic studies. Possible approaches to the reduction of large datasets and their potential for introducing additional artefacts were demonstrated. The most consistent results were obtained using “smart sampling”, which reduced sequence subsets from large studies more than those from smaller ones in order to preserve the rare sequences in a dataset. The effect of sequences with technical or annotation errors in the Bayesian framework was also analyzed. Sequences with about 0.5% sequencing errors or incorrect isolation dates altered by just 5 years could be detected by various approaches, but the efficiency of identification depended upon sequence position in a phylogenetic tree. Even a single erroneous sequence could profoundly destabilize the whole analysis by increasing the variance of the inferred evolutionary parameters. MDPI 2019-11-06 /pmc/articles/PMC6893659/ /pubmed/31698764 http://dx.doi.org/10.3390/v11111032 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Vakulenko, Yulia
Deviatkin, Andrei
Lukashev, Alexander
The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses
title The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses
title_full The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses
title_fullStr The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses
title_full_unstemmed The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses
title_short The Effect of Sample Bias and Experimental Artefacts on the Statistical Phylogenetic Analysis of Picornaviruses
title_sort effect of sample bias and experimental artefacts on the statistical phylogenetic analysis of picornaviruses
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6893659/
https://www.ncbi.nlm.nih.gov/pubmed/31698764
http://dx.doi.org/10.3390/v11111032
work_keys_str_mv AT vakulenkoyulia theeffectofsamplebiasandexperimentalartefactsonthestatisticalphylogeneticanalysisofpicornaviruses
AT deviatkinandrei theeffectofsamplebiasandexperimentalartefactsonthestatisticalphylogeneticanalysisofpicornaviruses
AT lukashevalexander theeffectofsamplebiasandexperimentalartefactsonthestatisticalphylogeneticanalysisofpicornaviruses
AT vakulenkoyulia effectofsamplebiasandexperimentalartefactsonthestatisticalphylogeneticanalysisofpicornaviruses
AT deviatkinandrei effectofsamplebiasandexperimentalartefactsonthestatisticalphylogeneticanalysisofpicornaviruses
AT lukashevalexander effectofsamplebiasandexperimentalartefactsonthestatisticalphylogeneticanalysisofpicornaviruses