paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences

Evolution is change over time. Although neutral changes promoted by drift effects are most reliable for phylogenetic reconstructions, selection-relevant changes are of only limited use to reconstruct phylogenies. On the other hand, comparative analyses of neutral and selected changes of protein-codi...

Bibliographic Details
Main authors: Steffen, Raphael, Ogoniak, Lynn, Grundmann, Norbert, Pawluchin, Anna, Soehnlein, Oliver, Schmitz, Jürgen
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9222883/
https://www.ncbi.nlm.nih.gov/pubmed/35741852
http://dx.doi.org/10.3390/genes13061090
_version_ 1784732984567398400
author Steffen, Raphael
Ogoniak, Lynn
Grundmann, Norbert
Pawluchin, Anna
Soehnlein, Oliver
Schmitz, Jürgen
author_facet Steffen, Raphael
Ogoniak, Lynn
Grundmann, Norbert
Pawluchin, Anna
Soehnlein, Oliver
Schmitz, Jürgen
author_sort Steffen, Raphael
collection PubMed
description Evolution is change over time. Although neutral changes promoted by drift effects are most reliable for phylogenetic reconstructions, selection-relevant changes are of only limited use to reconstruct phylogenies. On the other hand, comparative analyses of neutral and selected changes of protein-coding DNA sequences (CDS) retrospectively tell us about episodic constrained, relaxed, and adaptive incidences. The ratio of sites with nonsynonymous (amino acid altering) versus synonymous (not altering) mutations directly measures selection pressure and can be analysed by using the Phylogenetic Analysis by Maximum Likelihood (PAML) software package. We developed a CDS extractor for compiling protein-coding sequences (CDS-extractor) and parallel PAML (paPAML) to simplify, amplify, and accelerate selection analyses via parallel processing, including detection of negatively selected sites. paPAML compiles results of site, branch-site, and branch models and detects site-specific negative selection with the output of a codon list labelling significance values. The tool simplifies selection analyses for casual and inexperienced users and accelerates computing speeds up to the number of allocated computer threads. We then applied paPAML to examine the evolutionary impact on a new GINS Complex Subunit 3 exon, and neutrophil-associated as well as lysin and apolipoprotein genes. Compared with codeml (PAML version 4.9j) and HyPhy (HyPhy FEL version 2.5.26), all paPAML test runs performed with 10 computing threads led to identical selection pressure results, whereas the total selection analysis via paPAML, including all model comparisons, was about 3 to 5 times faster than the longest running codeml model and about 7 to 15 times faster than the entire processing time of these codeml runs.
format Online
Article
Text
id pubmed-9222883
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9222883 2022-06-24 paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences Steffen, Raphael Ogoniak, Lynn Grundmann, Norbert Pawluchin, Anna Soehnlein, Oliver Schmitz, Jürgen Genes (Basel) Article Evolution is change over time. Although neutral changes promoted by drift effects are most reliable for phylogenetic reconstructions, selection-relevant changes are of only limited use to reconstruct phylogenies. On the other hand, comparative analyses of neutral and selected changes of protein-coding DNA sequences (CDS) retrospectively tell us about episodic constrained, relaxed, and adaptive incidences. The ratio of sites with nonsynonymous (amino acid altering) versus synonymous (not altering) mutations directly measures selection pressure and can be analysed by using the Phylogenetic Analysis by Maximum Likelihood (PAML) software package. We developed a CDS extractor for compiling protein-coding sequences (CDS-extractor) and parallel PAML (paPAML) to simplify, amplify, and accelerate selection analyses via parallel processing, including detection of negatively selected sites. paPAML compiles results of site, branch-site, and branch models and detects site-specific negative selection with the output of a codon list labelling significance values. The tool simplifies selection analyses for casual and inexperienced users and accelerates computing speeds up to the number of allocated computer threads. We then applied paPAML to examine the evolutionary impact on a new GINS Complex Subunit 3 exon, and neutrophil-associated as well as lysin and apolipoprotein genes. Compared with codeml (PAML version 4.9j) and HyPhy (HyPhy FEL version 2.5.26), all paPAML test runs performed with 10 computing threads led to identical selection pressure results, whereas the total selection analysis via paPAML, including all model comparisons, was about 3 to 5 times faster than the longest running codeml model and about 7 to 15 times faster than the entire processing time of these codeml runs. MDPI 2022-06-18 /pmc/articles/PMC9222883/ /pubmed/35741852 http://dx.doi.org/10.3390/genes13061090 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Steffen, Raphael
Ogoniak, Lynn
Grundmann, Norbert
Pawluchin, Anna
Soehnlein, Oliver
Schmitz, Jürgen
paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences
title paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences
title_full paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences
title_fullStr paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences
title_full_unstemmed paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences
title_short paPAML: An Improved Computational Tool to Explore Selection Pressure on Protein-Coding Sequences
title_sort papaml: an improved computational tool to explore selection pressure on protein-coding sequences
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9222883/
https://www.ncbi.nlm.nih.gov/pubmed/35741852
http://dx.doi.org/10.3390/genes13061090
work_keys_str_mv AT steffenraphael papamlanimprovedcomputationaltooltoexploreselectionpressureonproteincodingsequences
AT ogoniaklynn papamlanimprovedcomputationaltooltoexploreselectionpressureonproteincodingsequences
AT grundmannnorbert papamlanimprovedcomputationaltooltoexploreselectionpressureonproteincodingsequences
AT pawluchinanna papamlanimprovedcomputationaltooltoexploreselectionpressureonproteincodingsequences
AT soehnleinoliver papamlanimprovedcomputationaltooltoexploreselectionpressureonproteincodingsequences
AT schmitzjurgen papamlanimprovedcomputationaltooltoexploreselectionpressureonproteincodingsequences