BowSaw: Inferring Higher-Order Trait Interactions Associated With Complex Biological Phenotypes
Machine learning is helping the interpretation of biological complexity by enabling the inference and classification of cellular, organismal and ecological phenotypes based on large datasets, e.g., from genomic, transcriptomic and metagenomic analyses. A number of available algorithms can help searc...
Main Authors: | DiMucci, Demetrius; Kon, Mark; Segrè, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Molecular Biosciences |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8245782/ https://www.ncbi.nlm.nih.gov/pubmed/34222331 http://dx.doi.org/10.3389/fmolb.2021.663532 |
_version_ | 1783716184097554432 |
---|---|
author | DiMucci, Demetrius; Kon, Mark; Segrè, Daniel |
author_facet | DiMucci, Demetrius; Kon, Mark; Segrè, Daniel |
author_sort | DiMucci, Demetrius |
collection | PubMed |
description | Machine learning is helping the interpretation of biological complexity by enabling the inference and classification of cellular, organismal and ecological phenotypes based on large datasets, e.g., from genomic, transcriptomic and metagenomic analyses. A number of available algorithms can help search these datasets to uncover patterns associated with specific traits, including disease-related attributes. While, in many instances, treating an algorithm as a black box is sufficient, it is interesting to pursue an enhanced understanding of how system variables end up contributing to a specific output, as an avenue toward new mechanistic insight. Here we address this challenge through a suite of algorithms, named BowSaw, which takes advantage of the structure of a trained random forest algorithm to identify combinations of variables (“rules”) frequently used for classification. We first apply BowSaw to a simulated dataset and show that the algorithm can accurately recover the sets of variables used to generate the phenotypes through complex Boolean rules, even under challenging noise levels. We next apply our method to data from the integrative Human Microbiome Project and find previously unreported high-order combinations of microbial taxa putatively associated with Crohn’s disease. By leveraging the structure of trees within a random forest, BowSaw provides a new way of using decision trees to generate testable biological hypotheses. |
format | Online Article Text |
id | pubmed-8245782 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8245782 2021-07-02 BowSaw: Inferring Higher-Order Trait Interactions Associated With Complex Biological Phenotypes DiMucci, Demetrius; Kon, Mark; Segrè, Daniel Front Mol Biosci Molecular Biosciences Frontiers Media S.A. 2021-06-17 /pmc/articles/PMC8245782/ /pubmed/34222331 http://dx.doi.org/10.3389/fmolb.2021.663532 Text en Copyright © 2021 DiMucci, Kon and Segrè. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Molecular Biosciences DiMucci, Demetrius Kon, Mark Segrè, Daniel BowSaw: Inferring Higher-Order Trait Interactions Associated With Complex Biological Phenotypes |
title | BowSaw: Inferring Higher-Order Trait Interactions Associated With Complex Biological Phenotypes |
topic | Molecular Biosciences |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8245782/ https://www.ncbi.nlm.nih.gov/pubmed/34222331 http://dx.doi.org/10.3389/fmolb.2021.663532 |