A Machine Learned Classifier That Uses Gene Expression Data to Accurately Predict Estrogen Receptor Status

BACKGROUND: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technic...

Bibliographic Details
Main authors: Bastani, Meysam, Vos, Larissa, Asgarian, Nasimeh, Deschenes, Jean, Graham, Kathryn, Mackey, John, Greiner, Russell
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3846850/
https://www.ncbi.nlm.nih.gov/pubmed/24312637
http://dx.doi.org/10.1371/journal.pone.0082144
_version_ 1782293499515764736
author Bastani, Meysam
Vos, Larissa
Asgarian, Nasimeh
Deschenes, Jean
Graham, Kathryn
Mackey, John
Greiner, Russell
author_sort Bastani, Meysam
collection PubMed
description BACKGROUND: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER-status based on RNA expression can provide more objective, quantitative and reproducible test results. METHODS: To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER-status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin fixed tumor. RESULTS: This produced a three-gene classifier that can predict the ER-status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER-status. CONCLUSIONS: Our efficient and parsimonious classifier lends itself to high throughput, highly accurate and low-cost RNA-based assessments of ER-status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions.
format Online
Article
Text
id pubmed-3846850
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-38468502013-12-05 A Machine Learned Classifier That Uses Gene Expression Data to Accurately Predict Estrogen Receptor Status Bastani, Meysam Vos, Larissa Asgarian, Nasimeh Deschenes, Jean Graham, Kathryn Mackey, John Greiner, Russell PLoS One Research Article
Public Library of Science 2013-12-02 /pmc/articles/PMC3846850/ /pubmed/24312637 http://dx.doi.org/10.1371/journal.pone.0082144 Text en © 2013 Bastani et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Machine Learned Classifier That Uses Gene Expression Data to Accurately Predict Estrogen Receptor Status
topic Research Article