The Discovery of Novel Biomarkers Improves Breast Cancer Intrinsic Subtype Prediction and Reconciles the Labels in the METABRIC Data Set

BACKGROUND: The prediction of breast cancer intrinsic subtypes has been introduced as a valuable strategy to determine patient diagnosis and prognosis, and therapy response. The PAM50 method, based on the expression levels of 50 genes, uses a single sample predictor model to assign subtype labels to...

Bibliographic Details
Main authors: Milioli, Heloisa Helena; Vimieiro, Renato; Riveros, Carlos; Tishchenko, Inna; Berretta, Regina; Moscato, Pablo
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4488510/
https://www.ncbi.nlm.nih.gov/pubmed/26132585
http://dx.doi.org/10.1371/journal.pone.0129711
author Milioli, Heloisa Helena
Vimieiro, Renato
Riveros, Carlos
Tishchenko, Inna
Berretta, Regina
Moscato, Pablo
collection PubMed
description BACKGROUND: The prediction of breast cancer intrinsic subtypes has been introduced as a valuable strategy to determine patient diagnosis, prognosis, and therapy response. The PAM50 method, based on the expression levels of 50 genes, uses a single-sample predictor model to assign subtype labels to samples. Intrinsic errors reported within this assay demonstrate the challenge of identifying and understanding the breast cancer groups. In this study, we aim to: a) identify novel biomarkers for subtype individuation by exploring the competence of a newly proposed method named the CM1 score, and b) apply ensemble learning, as opposed to a single classifier, for sample subtype assignment. The overarching objective is to improve class prediction. METHODS AND FINDINGS: The microarray transcriptome data sets used in this study are the METABRIC breast cancer data, recorded for over 2000 patients, and the public integrated source from the ROCK database, with 1570 samples. We first computed the CM1 score to identify the probes with highly discriminative patterns of expression across samples of each intrinsic subtype. We further assessed the ability of 42 selected probes to assign correct subtype labels using 24 different classifiers from the Weka software suite. For comparison, the same method was applied to the list of 50 genes from the PAM50 method. CONCLUSIONS: The CM1 score revealed 30 novel biomarkers for predicting breast cancer subtypes and confirmed the role of 12 well-established genes. Intrinsic subtypes assigned using the CM1 list and the ensemble of classifiers are more consistent and homogeneous than the original PAM50 labels. The new subtypes show accurate distributions of the current clinical markers ER, PR and HER2, and of survival curves, in the METABRIC and ROCK data sets. Remarkably, the paradoxical attribution of the original labels reinforces the limitations of employing single-sample classifiers to predict breast cancer intrinsic subtypes.
format Online
Article
Text
id pubmed-4488510
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4488510 2015-07-14 The Discovery of Novel Biomarkers Improves Breast Cancer Intrinsic Subtype Prediction and Reconciles the Labels in the METABRIC Data Set Milioli, Heloisa Helena Vimieiro, Renato Riveros, Carlos Tishchenko, Inna Berretta, Regina Moscato, Pablo PLoS One Research Article Public Library of Science 2015-07-01 /pmc/articles/PMC4488510/ /pubmed/26132585 http://dx.doi.org/10.1371/journal.pone.0129711 Text en © 2015 Milioli et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title The Discovery of Novel Biomarkers Improves Breast Cancer Intrinsic Subtype Prediction and Reconciles the Labels in the METABRIC Data Set
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4488510/
https://www.ncbi.nlm.nih.gov/pubmed/26132585
http://dx.doi.org/10.1371/journal.pone.0129711
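
The abstract in this record contrasts a single-sample predictor with an ensemble of classifiers for subtype assignment. One way such an ensemble could combine labels is plurality voting, sketched minimally below in Python. The voting rule, the function names, and the toy labels are assumptions for illustration only; the record does not state how the 24 Weka classifiers were combined, and this is not the authors' implementation.

```python
from collections import Counter


def plurality_vote(votes):
    """Return the label that occurs most often among one sample's votes.

    Hypothetical helper: on a tie, the label encountered first wins,
    which is how Counter.most_common orders equal counts.
    """
    return Counter(votes).most_common(1)[0][0]


def ensemble_labels(per_classifier):
    """Combine predictions from several classifiers into one label per sample.

    per_classifier: list of label lists, one list per classifier,
    all aligned so that position i refers to the same sample.
    """
    return [plurality_vote(votes) for votes in zip(*per_classifier)]


def vote_share(per_classifier):
    """Fraction of classifiers that agree with the winning label, per sample."""
    shares = []
    for votes in zip(*per_classifier):
        winner = plurality_vote(votes)
        shares.append(votes.count(winner) / len(votes))
    return shares


# Toy predictions (made up) from three classifiers on four samples.
clf_a = ["LumA", "LumB", "Basal", "Her2"]
clf_b = ["LumA", "LumA", "Basal", "Her2"]
clf_c = ["LumB", "LumA", "Basal", "Normal"]

labels = ensemble_labels([clf_a, clf_b, clf_c])
shares = vote_share([clf_a, clf_b, clf_c])
print(labels)
print(shares)
```

A low vote share flags samples on which the classifiers disagree, which is the kind of inconsistency a single classifier cannot expose on its own.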