A comparison of machine learning techniques for survival prediction in breast cancer

BACKGROUND: The ability to accurately classify cancer patients into risk classes, i.e. to predict the outcome of the pathology on an individual basis, is a key ingredient in making therapeutic decisions. In recent years gene expression data have been successfully used to complement the clinical and...

Bibliographic Details
Main authors: Vanneschi, Leonardo; Farinaccio, Antonella; Mauri, Giancarlo; Antoniotti, Mauro; Provero, Paolo; Giacobini, Mario
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3108919/
https://www.ncbi.nlm.nih.gov/pubmed/21569330
http://dx.doi.org/10.1186/1756-0381-4-12
author Vanneschi, Leonardo
Farinaccio, Antonella
Mauri, Giancarlo
Antoniotti, Mauro
Provero, Paolo
Giacobini, Mario
collection PubMed
description BACKGROUND: The ability to accurately classify cancer patients into risk classes, i.e. to predict the outcome of the pathology on an individual basis, is a key ingredient in making therapeutic decisions. In recent years gene expression data have been successfully used to complement the clinical and histological criteria traditionally used in such prediction. Many "gene expression signatures" have been developed, i.e. sets of genes whose expression values in a tumor can be used to predict the outcome of the pathology. Here we investigate the use of several machine learning techniques to classify breast cancer patients using one of such signatures, the well established 70-gene signature. RESULTS: We show that Genetic Programming performs significantly better than Support Vector Machines, Multilayered Perceptrons and Random Forests in classifying patients from the NKI breast cancer dataset, and comparably to the scoring-based method originally proposed by the authors of the 70-gene signature. Furthermore, Genetic Programming is able to perform an automatic feature selection. CONCLUSIONS: Since the performance of Genetic Programming is likely to be improvable compared to the out-of-the-box approach used here, and given the biological insight potentially provided by the Genetic Programming solutions, we conclude that Genetic Programming methods are worth further investigation as a tool for cancer patient classification based on gene expression data.
format Online
Article
Text
id pubmed-3108919
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3108919, 2011-06-07. A comparison of machine learning techniques for survival prediction in breast cancer. Vanneschi, Leonardo; Farinaccio, Antonella; Mauri, Giancarlo; Antoniotti, Mauro; Provero, Paolo; Giacobini, Mario. BioData Min. Research. BioMed Central, 2011-05-11. /pmc/articles/PMC3108919/ /pubmed/21569330 http://dx.doi.org/10.1186/1756-0381-4-12 Text, en.
Copyright ©2011 Vanneschi et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A comparison of machine learning techniques for survival prediction in breast cancer
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3108919/
https://www.ncbi.nlm.nih.gov/pubmed/21569330
http://dx.doi.org/10.1186/1756-0381-4-12