The Generalized Matrix Decomposition Biplot and Its Application to Microbiome Data
Exploratory analysis of human microbiome data is often based on dimension-reduced graphical displays derived from similarities based on non-Euclidean distances, such as UniFrac or Bray-Curtis. However, a display of this type, often referred to as the principal-coordinate analysis (PCoA) plot, does n...
Main authors: | Wang, Yue; Randolph, Timothy W.; Shojaie, Ali; Ma, Jing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society for Microbiology 2019 |
Subjects: | Methods and Protocols |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6918030/ https://www.ncbi.nlm.nih.gov/pubmed/31848304 http://dx.doi.org/10.1128/mSystems.00504-19 |
_version_ | 1783480498964660224 |
---|---|
author | Wang, Yue Randolph, Timothy W. Shojaie, Ali Ma, Jing |
author_facet | Wang, Yue Randolph, Timothy W. Shojaie, Ali Ma, Jing |
author_sort | Wang, Yue |
collection | PubMed |
description | Exploratory analysis of human microbiome data is often based on dimension-reduced graphical displays derived from similarities based on non-Euclidean distances, such as UniFrac or Bray-Curtis. However, a display of this type, often referred to as the principal-coordinate analysis (PCoA) plot, does not reveal which taxa are related to the observed clustering because the configuration of samples is not based on a coordinate system in which both the samples and variables can be represented. The reason is that the PCoA plot is based on the eigen-decomposition of a similarity matrix and not the singular value decomposition (SVD) of the sample-by-abundance matrix. We propose a novel biplot that is based on an extension of the SVD, called the generalized matrix decomposition biplot (GMD-biplot), which involves an arbitrary matrix of similarities and the original matrix of variable measures, such as taxon abundances. As in a traditional biplot, points represent the samples, and arrows represent the variables. The proposed GMD-biplot is illustrated by analyzing multiple real and simulated data sets which demonstrate that the GMD-biplot provides improved clustering capability and a more meaningful relationship between the arrows and points. IMPORTANCE Biplots that simultaneously display the sample clustering and the important taxa have gained popularity in the exploratory analysis of human microbiome data. Traditional biplots, assuming Euclidean distances between samples, are not appropriate for microbiome data, when non-Euclidean distances are used to characterize dissimilarities among microbial communities. Thus, incorporating information from non-Euclidean distances into a biplot becomes useful for graphical displays of microbiome data. The proposed GMD-biplot accounts for any arbitrary non-Euclidean distances and provides a robust and computationally efficient approach for graphical visualization of microbiome data. In addition, the proposed GMD-biplot displays both the samples and taxa with respect to the same coordinate system, which further allows the configuration of future samples. |
format | Online Article Text |
id | pubmed-6918030 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | American Society for Microbiology |
record_format | MEDLINE/PubMed |
spelling | pubmed-6918030 2019-12-23 The Generalized Matrix Decomposition Biplot and Its Application to Microbiome Data Wang, Yue Randolph, Timothy W. Shojaie, Ali Ma, Jing mSystems Methods and Protocols American Society for Microbiology 2019-12-17 /pmc/articles/PMC6918030/ /pubmed/31848304 http://dx.doi.org/10.1128/mSystems.00504-19 Text en Copyright © 2019 Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Methods and Protocols Wang, Yue Randolph, Timothy W. Shojaie, Ali Ma, Jing The Generalized Matrix Decomposition Biplot and Its Application to Microbiome Data |
title | The Generalized Matrix Decomposition Biplot and Its Application to Microbiome Data |
title_sort | generalized matrix decomposition biplot and its application to microbiome data |
topic | Methods and Protocols |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6918030/ https://www.ncbi.nlm.nih.gov/pubmed/31848304 http://dx.doi.org/10.1128/mSystems.00504-19 |
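
The description above contrasts a PCoA plot (eigen-decomposition of a similarity matrix) with a biplot built from an SVD-like factorization of the sample-by-abundance matrix. The following is a minimal NumPy sketch of one standard way to compute a generalized SVD with respect to a sample similarity matrix; it is not the authors' implementation. The function name `gmd_sketch`, the toy data, the identity metric on the taxa, the requirement that `H` be positive definite, and the scaling used for the points and arrows are all assumptions made for illustration.

```python
import numpy as np


def gmd_sketch(X, H, k=2):
    """Rank-k factorization X ~ U diag(s) V^T with U^T H U = I and V^T V = I.

    Assumption: H is symmetric positive definite and the column (taxon)
    metric is the identity. This is an illustrative sketch only.
    """
    # Symmetric square root of H and its inverse via eigen-decomposition.
    w, E = np.linalg.eigh(H)
    if w.min() <= 0:
        raise ValueError("H must be positive definite for this sketch")
    H_half = (E * np.sqrt(w)) @ E.T
    H_half_inv = (E / np.sqrt(w)) @ E.T

    # Ordinary SVD of the H-weighted data, then map the left factor back.
    U_tilde, s, Vt = np.linalg.svd(H_half @ X, full_matrices=False)
    U = H_half_inv @ U_tilde[:, :k]
    return U, s[:k], Vt[:k].T


# Hypothetical toy data: 6 samples by 4 taxa, column-centered counts.
rng = np.random.default_rng(0)
X = rng.poisson(5.0, size=(6, 4)).astype(float)
X -= X.mean(axis=0)

# Stand-in positive definite similarity matrix between samples. In practice
# it would be derived from a distance such as UniFrac or Bray-Curtis.
A = rng.normal(size=(6, 6))
H = A @ A.T + 6.0 * np.eye(6)

U, s, V = gmd_sketch(X, H, k=2)
sample_points = U * s  # one point per sample (assumed scaling)
taxon_arrows = V       # one arrow per taxon (assumed scaling)
```

With `H` set to the identity matrix, the sketch reduces to the ordinary SVD of `X`, which is the setting of a traditional biplot. Similarity matrices derived from non-Euclidean distances may not be positive definite, so a real analysis would need to handle that case rather than raise an error.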