Identifying indicator species in ecological habitats using Deep Optimal Feature Learning

Much of the current research on supervised modelling is focused on maximizing outcome prediction accuracy. However, in engineering disciplines, an arguably more important goal is that of feature extraction, the identification of relevant features associated with the various outcomes. For instance, i...

Bibliographic Details
Main Authors: Tsai, Yiting; Baldwin, Susan A.; Gopaluni, Bhushan
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8432828/
https://www.ncbi.nlm.nih.gov/pubmed/34506523
http://dx.doi.org/10.1371/journal.pone.0256782
author Tsai, Yiting
Baldwin, Susan A.
Gopaluni, Bhushan
collection PubMed
description Much of the current research on supervised modelling is focused on maximizing outcome prediction accuracy. However, in engineering disciplines, an arguably more important goal is that of feature extraction, the identification of relevant features associated with the various outcomes. For instance, in microbial communities, the identification of keystone species can often lead to improved prediction of future behavioral shifts. This paper proposes a novel feature extractor based on Deep Learning, which is largely agnostic to underlying assumptions regarding the training data. Starting from a collection of microbial species abundance counts, the Deep Learning model first trains itself to classify the selected distinct habitats. It then identifies indicator species associated with the habitats. The results are then compared and contrasted with those obtained by traditional statistical techniques. The indicator species are similar when compared at top taxonomic levels such as Domain and Phylum, despite visible differences in lower levels such as Class and Order. More importantly, when our estimated indicators are used to predict final habitat labels using simpler models (such as Support Vector Machines and traditional Artificial Neural Networks), the prediction accuracy is improved. Overall, this study serves as a preliminary step that bridges modern, black-box Machine Learning models with traditional, domain expertise-rich techniques.
format Online
Article
Text
id pubmed-8432828
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8432828 2021-09-11 Identifying indicator species in ecological habitats using Deep Optimal Feature Learning Tsai, Yiting Baldwin, Susan A. Gopaluni, Bhushan PLoS One Research Article
Public Library of Science 2021-09-10 /pmc/articles/PMC8432828/ /pubmed/34506523 http://dx.doi.org/10.1371/journal.pone.0256782 Text en © 2021 Tsai et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Identifying indicator species in ecological habitats using Deep Optimal Feature Learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8432828/
https://www.ncbi.nlm.nih.gov/pubmed/34506523
http://dx.doi.org/10.1371/journal.pone.0256782