Machine Learning-Based Ensemble Recursive Feature Selection of Circulating miRNAs for Cancer Tumor Classification

Bibliographic Details
Main authors: Lopez-Rincon, Alejandro, Mendoza-Maldonado, Lucero, Martinez-Archundia, Marlet, Schönhuth, Alexander, Kraneveld, Aletta D., Garssen, Johan, Tonda, Alberto
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7407482/
https://www.ncbi.nlm.nih.gov/pubmed/32635415
http://dx.doi.org/10.3390/cancers12071785
_version_ 1783567631090974720
author Lopez-Rincon, Alejandro
Mendoza-Maldonado, Lucero
Martinez-Archundia, Marlet
Schönhuth, Alexander
Kraneveld, Aletta D.
Garssen, Johan
Tonda, Alberto
author_sort Lopez-Rincon, Alejandro
collection PubMed
description Circulating microRNAs (miRNAs) are small noncoding RNA molecules that can be detected in bodily fluids without the need for major invasive procedures on patients. miRNAs have shown great promise as biomarkers for tumors, both to assess their presence and to predict their type and subtype. Recently, thanks to the availability of miRNA datasets, machine learning techniques have been successfully applied to tumor classification. The results, however, are difficult for medical experts to assess and interpret because the algorithms exploit information from thousands of miRNAs. In this work, we propose a novel technique that aims at reducing the necessary information to the smallest possible set of circulating miRNAs. The dimensionality reduction achieved reflects a very important first step in a potential, clinically actionable, circulating miRNA-based precision medicine pipeline. While it is currently under discussion whether this first step can be taken, we demonstrate here that it is possible to perform classification tasks by exploiting a recursive feature elimination procedure that integrates a heterogeneous ensemble of high-quality, state-of-the-art classifiers on circulating miRNAs. Heterogeneous ensembles can compensate for the inherent biases of classifiers by using different classification algorithms. Selecting features then further eliminates biases emerging from using data from different studies or batches, yielding more robust and reliable outcomes. The proposed approach is first tested on a tumor classification problem in order to separate 10 different types of cancer, with samples collected over 10 different clinical trials, and is later assessed on a cancer subtype classification task, with the aim of distinguishing triple-negative breast cancer from other subtypes of breast cancer. Overall, the presented methodology proves to be effective and compares favorably to other state-of-the-art feature selection methods.
format Online
Article
Text
id pubmed-7407482
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-74074822020-08-25 Machine Learning-Based Ensemble Recursive Feature Selection of Circulating miRNAs for Cancer Tumor Classification Lopez-Rincon, Alejandro Mendoza-Maldonado, Lucero Martinez-Archundia, Marlet Schönhuth, Alexander Kraneveld, Aletta D. Garssen, Johan Tonda, Alberto Cancers (Basel) Article Circulating microRNAs (miRNA) are small noncoding RNA molecules that can be detected in bodily fluids without the need for major invasive procedures on patients. miRNAs have shown great promise as biomarkers for tumors to both assess their presence and to predict their type and subtype. Recently, thanks to the availability of miRNAs datasets, machine learning techniques have been successfully applied to tumor classification. The results, however, are difficult to assess and interpret by medical experts because the algorithms exploit information from thousands of miRNAs. In this work, we propose a novel technique that aims at reducing the necessary information to the smallest possible set of circulating miRNAs. The dimensionality reduction achieved reflects a very important first step in a potential, clinically actionable, circulating miRNA-based precision medicine pipeline. While it is currently under discussion whether this first step can be taken, we demonstrate here that it is possible to perform classification tasks by exploiting a recursive feature elimination procedure that integrates a heterogeneous ensemble of high-quality, state-of-the-art classifiers on circulating miRNAs. Heterogeneous ensembles can compensate inherent biases of classifiers by using different classification algorithms. Selecting features then further eliminates biases emerging from using data from different studies or batches, yielding more robust and reliable outcomes. 
The proposed approach is first tested on a tumor classification problem in order to separate 10 different types of cancer, with samples collected over 10 different clinical trials, and later is assessed on a cancer subtype classification task, with the aim to distinguish triple negative breast cancer from other subtypes of breast cancer. Overall, the presented methodology proves to be effective and compares favorably to other state-of-the-art feature selection methods. MDPI 2020-07-03 /pmc/articles/PMC7407482/ /pubmed/32635415 http://dx.doi.org/10.3390/cancers12071785 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Machine Learning-Based Ensemble Recursive Feature Selection of Circulating miRNAs for Cancer Tumor Classification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7407482/
https://www.ncbi.nlm.nih.gov/pubmed/32635415
http://dx.doi.org/10.3390/cancers12071785