Uniform Manifold Approximation and Projection (UMAP) Reveals Composite Patterns and Resolves Visualization Artifacts in Microbiome Data

Microbiome data are sparse and high dimensional, so effective visualization of these data requires dimensionality reduction. To date, the most commonly used method for dimensionality reduction in the microbiome is calculation of between-sample microbial differences (beta diversity), followed by prin...

Bibliographic Details
Main Authors: Armstrong, George; Martino, Cameron; Rahman, Gibraan; Gonzalez, Antonio; Vázquez-Baeza, Yoshiki; Mishne, Gal; Knight, Rob
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8547469/
https://www.ncbi.nlm.nih.gov/pubmed/34609167
http://dx.doi.org/10.1128/mSystems.00691-21
author Armstrong, George
Martino, Cameron
Rahman, Gibraan
Gonzalez, Antonio
Vázquez-Baeza, Yoshiki
Mishne, Gal
Knight, Rob
collection PubMed
description Microbiome data are sparse and high dimensional, so effective visualization of these data requires dimensionality reduction. To date, the most commonly used method for dimensionality reduction in the microbiome is calculation of between-sample microbial differences (beta diversity), followed by principal-coordinate analysis (PCoA). Uniform Manifold Approximation and Projection (UMAP) is an alternative method that can reduce the dimensionality of beta diversity distance matrices. Here, we demonstrate the benefits and limitations of using UMAP for dimensionality reduction on microbiome data. Using real data, we demonstrate that UMAP can improve the representation of clusters, especially when the clusters are composed of multiple subgroups. Additionally, we show that UMAP provides improved correlation of biological variation along a gradient with a reduced number of coordinates of the resulting embedding. Finally, we provide parameter recommendations that emphasize the preservation of global geometry. We therefore conclude that UMAP should be routinely used as a complementary visualization method for microbiome beta diversity studies. IMPORTANCE UMAP provides an additional method to visualize microbiome data. The method is extensible to any beta diversity metric used with PCoA, and our results demonstrate that UMAP can indeed improve visualization quality and correspondence with biological and technical variables of interest. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/knightlab-analyses/umap-microbiome-benchmarking; additionally, we have provided a QIIME 2 plugin for UMAP at https://github.com/biocore/q2-umap.
format Online
Article
Text
id pubmed-8547469
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-8547469 2021-10-27 mSystems Observation American Society for Microbiology 2021-10-05 /pmc/articles/PMC8547469/ /pubmed/34609167 http://dx.doi.org/10.1128/mSystems.00691-21 Text en Copyright © 2021 Armstrong et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title Uniform Manifold Approximation and Projection (UMAP) Reveals Composite Patterns and Resolves Visualization Artifacts in Microbiome Data
topic Observation
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8547469/
https://www.ncbi.nlm.nih.gov/pubmed/34609167
http://dx.doi.org/10.1128/mSystems.00691-21