Using Decision Tree Aggregation with Random Forest Model to Identify Gut Microbes Associated with Colorectal Cancer

Bibliographic Details
Main authors: Ai, Dongmei; Pan, Hongfei; Han, Rongbao; Li, Xiaoxin; Liu, Gang; Xia, Li C.
Format: Online Article Text
Language: English
Published: MDPI 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6410271/
https://www.ncbi.nlm.nih.gov/pubmed/30717284
http://dx.doi.org/10.3390/genes10020112
collection PubMed
description The imbalance of human gut microbiota has been associated with colorectal cancer. In recent years, metagenomics research has provided a large amount of scientific data enabling us to study the dedicated roles of gut microbes in the onset and progression of cancer. We removed unrelated and redundant features during feature selection by mutual information. We then trained a random forest classifier on a large metagenomics dataset of colorectal cancer patients and healthy people assembled from published reports and extracted and analysed the information from the learned decision trees. We identified key microbial species associated with colorectal cancers. These microbes included Porphyromonas asaccharolytica, Peptostreptococcus stomatis, Fusobacterium, Parvimonas sp., Streptococcus vestibularis and Flavonifractor plautii. We obtained the optimal splitting abundance thresholds for these species to distinguish between healthy and colorectal cancer samples. This extracted consensus decision tree may be applied to the diagnosis of colorectal cancers.
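The pipeline described in the abstract (mutual-information feature selection, tree-based classification, and reading split thresholds back out of the learned trees) can be illustrated with a small standard-library sketch. This is not the authors' code: the function names, the toy abundance values, and the use of bootstrapped one-split trees in place of a full random forest are all assumptions made for illustration.

```python
import math
import random
from collections import Counter
from statistics import median


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Abundance cut-off that minimises the weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_t, best_score = None, float("inf")
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut-off can separate identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_t, best_score = (lo + hi) / 2, score
    return best_t


def consensus_threshold(values, labels, n_trees=50, seed=0):
    """Median cut-off over one-split trees fitted to bootstrap samples."""
    rng = random.Random(seed)
    n = len(values)
    thresholds = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # a single-class sample carries no split information
        t = best_threshold([values[i] for i in idx], sample_labels)
        if t is not None:
            thresholds.append(t)
    return median(thresholds) if thresholds else None


# Hypothetical toy data: abundance of one species and case status (1 = cancer).
abundance = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0]
is_crc = [0, 0, 0, 0, 1, 1, 1, 1]
present = [int(a > 5.0) for a in abundance]  # crude discretisation for MI

print("mutual information:", mutual_information(present, is_crc))
print("single-tree threshold:", best_threshold(abundance, is_crc))
print("consensus threshold:", consensus_threshold(abundance, is_crc))
```

In this sketch a feature with near-zero mutual information would be dropped before fitting, and the median across bootstrapped trees plays the role of the consensus splitting threshold; a real analysis would use a full random forest implementation and many features.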
id pubmed-6410271
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6410271 2019-03-26
journal Genes (Basel)
publishDate 2019-02-01
license © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
topic Article