
A laid-back trip through the Hennigian Forests

BACKGROUND: This paper is a comment on the idea of matrix-free Cladistics. Demonstration of this idea’s efficiency is a major goal of the study. Within the proposed framework, the ordinary (phenetic) matrix is necessary only as “source” of Hennigian trees, not as a primary subject of the analysis. S...


Bibliographic Details
Main Authors: Mavrodiev, Evgeny V., Dell, Christopher, Schroder, Laura
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5522724/
https://www.ncbi.nlm.nih.gov/pubmed/28740753
http://dx.doi.org/10.7717/peerj.3578
_version_ 1783252217484017664
author Mavrodiev, Evgeny V.
Dell, Christopher
Schroder, Laura
author_facet Mavrodiev, Evgeny V.
Dell, Christopher
Schroder, Laura
author_sort Mavrodiev, Evgeny V.
collection PubMed
description BACKGROUND: This paper is a comment on the idea of matrix-free Cladistics. Demonstrating the efficiency of this idea is a major goal of the study. Within the proposed framework, the ordinary (phenetic) matrix is necessary only as a “source” of Hennigian trees, not as the primary subject of the analysis. Switching from matrix-based thinking to the matrix-free Cladistic approach clearly reveals that optimizations of character-state changes are related not to real processes, but to the form of the data representation. METHODS: We focused our study on binary data. We wrote a simple Ruby-based script, FORESTER version 1.0, that helps represent a binary matrix as an array of rooted trees (a “Hennigian forest”). The binary representations of the genomic (DNA) data were made by the script 1001. The Average Consensus method and the standard Maximum Parsimony (MP) approach were used to analyze the data. PRINCIPAL FINDINGS: A binary matrix may easily be rewritten as a set of rooted trees (maximal relationships). The latter may be analyzed by the Average Consensus method. Paradoxically, this method, if applied to Hennigian forests, can in principle help to identify clades despite the absence of direct evidence from the primary data. Our approach can handle clock-like or non-clock-like matrices, as well as hypothetical, molecular or morphological data. DISCUSSION: Our proposal clearly differs from the numerous phenetic alignment-free techniques for constructing phylogenetic trees. Dealing with relations, not with the actual “data”, also distinguishes our approach from all optimization-based methods, if optimization is defined as a way to reconstruct the sequences of character-state changes on a tree, whether by the standard alignment-based techniques or by the “direct” alignment-free procedure. We do not view our framework as an alternative to three-taxon statement analysis (3TA), but there are two major differences between our proposal and 3TA as originally designed and implemented: (1) 3TA deals with three-taxon statements, or minimal relationships. According to the logic of 3TA, the set of minimal trees must be established as a binary matrix and used as input for the parsimony program. In this paper, we operate directly with maximal relationships written simply as trees, not as binary matrices, while also using the Average Consensus method instead of MP analysis. Groups based solely on ‘reversals’ can always be found by our method without separately scoring the putative reversals before the analysis.
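The abstract's central step, rewriting a binary matrix as a set of rooted trees, can be illustrated with a minimal Python sketch. This is not the FORESTER script itself. It assumes that state 1 marks the group-defining state and state 0 the remaining taxa, that each informative character becomes one Newick tree in which the state-1 taxa form a single clade, and that characters with fewer than two state-1 taxa (or with no state-0 taxa) are skipped as uninformative. The function names and the toy matrix are hypothetical.

```python
from typing import Dict, List, Optional


def character_to_tree(taxa: List[str], column: List[int]) -> Optional[str]:
    """Return a Newick string for one binary character, or None if uninformative."""
    grouped = [t for t, s in zip(taxa, column) if s == 1]
    rest = [t for t, s in zip(taxa, column) if s == 0]
    # Assumption: a character only states a relationship if it groups
    # at least two taxa and leaves at least one taxon outside the group.
    if len(grouped) < 2 or not rest:
        return None
    clade = "(" + ",".join(grouped) + ")"
    # State-0 taxa attach directly to the root, next to the single clade.
    return "(" + ",".join(rest + [clade]) + ");"


def matrix_to_forest(matrix: Dict[str, List[int]]) -> List[str]:
    """Turn a taxon -> binary-states mapping into a list of Newick trees."""
    if not matrix:
        return []
    taxa = list(matrix)
    n_chars = len(matrix[taxa[0]])
    forest = []
    for i in range(n_chars):
        column = [matrix[t][i] for t in taxa]
        tree = character_to_tree(taxa, column)
        if tree is not None:
            forest.append(tree)
    return forest


# Hypothetical toy matrix: rows are taxa, columns are binary characters.
example = {
    "Out": [0, 0, 0],
    "A":   [1, 0, 0],
    "B":   [1, 1, 0],
    "C":   [1, 1, 1],
}
forest = matrix_to_forest(example)
for tree in forest:
    print(tree)
```

In this toy case the third character groups only one taxon and is dropped, so the forest holds two trees. A consensus method would then take such a list of trees as input, rather than the matrix.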
format Online
Article
Text
id pubmed-5522724
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-5522724 2017-07-24 A laid-back trip through the Hennigian Forests Mavrodiev, Evgeny V. Dell, Christopher Schroder, Laura PeerJ Bioinformatics PeerJ Inc. 2017-07-21 /pmc/articles/PMC5522724/ /pubmed/28740753 http://dx.doi.org/10.7717/peerj.3578 Text en ©2017 Mavrodiev et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Mavrodiev, Evgeny V.
Dell, Christopher
Schroder, Laura
A laid-back trip through the Hennigian Forests
title A laid-back trip through the Hennigian Forests
title_full A laid-back trip through the Hennigian Forests
title_fullStr A laid-back trip through the Hennigian Forests
title_full_unstemmed A laid-back trip through the Hennigian Forests
title_short A laid-back trip through the Hennigian Forests
title_sort laid-back trip through the hennigian forests
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5522724/
https://www.ncbi.nlm.nih.gov/pubmed/28740753
http://dx.doi.org/10.7717/peerj.3578
work_keys_str_mv AT mavrodievevgenyv alaidbacktripthroughthehennigianforests
AT dellchristopher alaidbacktripthroughthehennigianforests
AT schroderlaura alaidbacktripthroughthehennigianforests
AT mavrodievevgenyv laidbacktripthroughthehennigianforests
AT dellchristopher laidbacktripthroughthehennigianforests
AT schroderlaura laidbacktripthroughthehennigianforests