
Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data

Background. Novel prognostic markers are needed so that newly diagnosed breast cancer patients do not undergo unnecessary therapy. Various studies based on microarray gene expression datasets have generated gene signatures to predict prognosis outcomes, while ignoring the large amount of information...


Bibliographic Details
Main authors: Saini, Ashish, Hou, Jingyu, Zhou, Wanlei
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4052785/
https://www.ncbi.nlm.nih.gov/pubmed/24949450
http://dx.doi.org/10.1155/2014/459203
author Saini, Ashish
Hou, Jingyu
Zhou, Wanlei
collection PubMed
description Background. Novel prognostic markers are needed so that newly diagnosed breast cancer patients do not undergo unnecessary therapy. Various studies based on microarray gene expression datasets have generated gene signatures to predict prognosis outcomes, while ignoring the large amount of information contained in established clinical markers. Moreover, small sample sizes in individual microarray datasets remain a bottleneck in generating robust gene signatures, so the resulting signatures show limited predictive power. The aim of this study is to achieve high classification accuracy first for the good prognosis group and then for the poor prognosis group. Methods. We propose a novel algorithm called the IPRE (integrated prognosis risk estimation) algorithm. We used integrated microarray datasets from multiple studies to increase the sample size (∼2,700 samples). The IPRE algorithm consists of a virtual chromosome for extracting a prognostic gene signature of 79 genes, and a multivariate logistic regression model that incorporates clinical data along with expression data to generate a risk score formula that categorizes breast cancer patients into two prognosis groups. Results. The evaluation on two testing datasets showed that the IPRE algorithm achieved classification accuracies of 82% and 87%, far higher than those of any existing algorithm.
format Online
Article
Text
id pubmed-4052785
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4052785 2014-06-19 Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data Saini, Ashish Hou, Jingyu Zhou, Wanlei Biomed Res Int Research Article Hindawi Publishing Corporation 2014 2014-05-14 /pmc/articles/PMC4052785/ /pubmed/24949450 http://dx.doi.org/10.1155/2014/459203 Text en Copyright © 2014 Ashish Saini et al.
https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Breast Cancer Prognosis Risk Estimation Using Integrated Gene Expression and Clinical Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4052785/
https://www.ncbi.nlm.nih.gov/pubmed/24949450
http://dx.doi.org/10.1155/2014/459203