Combining Distance Matrices on Identical Taxon Sets for Multi-Gene Analysis with Singular Value Decomposition
We present a simple and effective method for combining distance matrices from multiple genes on identical taxon sets to obtain a single representative distance matrix from which to derive a combined-gene phylogenetic tree. The method applies singular value decomposition (SVD) to extract the greatest...
Main authors: | Abeysundera, Melanie; Kenney, Toby; Field, Chris; Gu, Hong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3986248/ https://www.ncbi.nlm.nih.gov/pubmed/24732341 http://dx.doi.org/10.1371/journal.pone.0094279 |
_version_ | 1782311680901906432 |
---|---|
author | Abeysundera, Melanie Kenney, Toby Field, Chris Gu, Hong |
author_sort | Abeysundera, Melanie |
collection | PubMed |
description | We present a simple and effective method for combining distance matrices from multiple genes on identical taxon sets to obtain a single representative distance matrix from which to derive a combined-gene phylogenetic tree. The method applies singular value decomposition (SVD) to extract the greatest common signal present in the distances obtained from each gene. The first right eigenvector of the SVD, which corresponds to a weighted average of the distance matrices of all genes, can thus be used to derive a representative tree from multiple genes. We apply our method to three well-known data sets and estimate the uncertainty using bootstrap methods. Our results show that this method works well for these three data sets and that the uncertainty in these estimates is small. A simulation study is conducted to compare the performance of our method with several other distance-based approaches (namely SDM, SDM* and ACS97), and we find that the performances of all these approaches are comparable in the consensus setting. The computational complexity of our method is similar to that of SDM. Besides constructing a representative tree from multiple genes, we also demonstrate how the subsequent eigenvalues and eigenvectors may be used to identify whether there are conflicting signals in the data and which genes might be influential or outliers for the estimated combined-gene tree. |
format | Online Article Text |
id | pubmed-3986248 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3986248 2014-04-15 Combining Distance Matrices on Identical Taxon Sets for Multi-Gene Analysis with Singular Value Decomposition Abeysundera, Melanie Kenney, Toby Field, Chris Gu, Hong PLoS One Research Article
Public Library of Science 2014-04-14 /pmc/articles/PMC3986248/ /pubmed/24732341 http://dx.doi.org/10.1371/journal.pone.0094279 Text en © 2014 Abeysundera et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Abeysundera, Melanie Kenney, Toby Field, Chris Gu, Hong Combining Distance Matrices on Identical Taxon Sets for Multi-Gene Analysis with Singular Value Decomposition |
title | Combining Distance Matrices on Identical Taxon Sets for Multi-Gene Analysis with Singular Value Decomposition |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3986248/ https://www.ncbi.nlm.nih.gov/pubmed/24732341 http://dx.doi.org/10.1371/journal.pone.0094279 |
work_keys_str_mv | AT abeysunderamelanie combiningdistancematricesonidenticaltaxonsetsformultigeneanalysiswithsingularvaluedecomposition AT kenneytoby combiningdistancematricesonidenticaltaxonsetsformultigeneanalysiswithsingularvaluedecomposition AT fieldchris combiningdistancematricesonidenticaltaxonsetsformultigeneanalysiswithsingularvaluedecomposition AT guhong combiningdistancematricesonidenticaltaxonsetsformultigeneanalysiswithsingularvaluedecomposition |
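
The description in this record says that the first right eigenvector of the SVD corresponds to a weighted average of the per-gene distance matrices. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: each gene's matrix is flattened to its upper triangle and stored as one row (the row layout is an assumption), the leading singular vectors are found by plain power iteration instead of a library SVD routine, and the gene weights are rescaled to sum to one (also an assumption, chosen so the result reads as an average). The names `upper_triangle`, `leading_singular_triplet` and `combine_distance_matrices` are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def leading_singular_triplet(rows, iterations=200):
    """Return (sigma1, u1, v1) for the genes-by-pairs matrix `rows`.

    Power iteration is run on the small genes-by-genes Gram matrix.
    Assumes at least one distance is non-zero.
    """
    m = len(rows)
    gram = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(m)]
            for i in range(m)]
    u = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(gram[i][j] * u[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    # The first right singular vector is proportional to rows^T u1,
    # that is, a weighted combination of the per-gene distance vectors.
    combo = [sum(u[g] * rows[g][k] for g in range(m))
             for k in range(len(rows[0]))]
    sigma = math.sqrt(sum(x * x for x in combo))
    v = [x / sigma for x in combo]
    return sigma, u, v


def combine_distance_matrices(matrices):
    """Combine same-sized distance matrices into one representative matrix."""
    n = len(matrices[0])
    rows = [upper_triangle(d) for d in matrices]
    _, u, _ = leading_singular_triplet(rows)
    # Assumption: rescale the gene loadings so they sum to one.
    total = sum(u)
    weights = [x / total for x in u]
    averaged = [sum(weights[g] * rows[g][k] for g in range(len(rows)))
                for k in range(len(rows[0]))]
    # Rebuild a symmetric matrix with a zero diagonal.
    combined = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            combined[i][j] = combined[j][i] = averaged[k]
            k += 1
    return combined, weights


if __name__ == "__main__":
    # Two made-up 3-taxon distance matrices, for illustration only.
    gene_a = [[0.0, 0.10, 0.30], [0.10, 0.0, 0.25], [0.30, 0.25, 0.0]]
    gene_b = [[0.0, 0.20, 0.50], [0.20, 0.0, 0.45], [0.50, 0.45, 0.0]]
    combined, weights = combine_distance_matrices([gene_a, gene_b])
    print("gene weights:", weights)
    for row in combined:
        print(row)
```

Because distances are non-negative, the leading gene loadings keep a common sign, which is what allows them to be read as averaging weights. The combined matrix could then be passed to any distance-based tree builder, such as neighbor joining.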