
Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees

BACKGROUND: Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms: artificial neural networks (ANN), decision trees...


Bibliographic Details
Main authors: Chou, Hsiu-Ling; Yao, Chung-Tay; Su, Sui-Lun; Lee, Chia-Yi; Hu, Kuang-Yu; Terng, Harn-Jing; Shih, Yun-Wen; Chang, Yu-Tien; Lu, Yu-Fen; Chang, Chi-Wen; Wahlqvist, Mark L; Wetter, Thomas; Chu, Chi-Ming
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3614553/
https://www.ncbi.nlm.nih.gov/pubmed/23506640
http://dx.doi.org/10.1186/1471-2105-14-100
_version_ 1782264862544494592
author Chou, Hsiu-Ling
Yao, Chung-Tay
Su, Sui-Lun
Lee, Chia-Yi
Hu, Kuang-Yu
Terng, Harn-Jing
Shih, Yun-Wen
Chang, Yu-Tien
Lu, Yu-Fen
Chang, Chi-Wen
Wahlqvist, Mark L
Wetter, Thomas
Chu, Chi-Ming
author_sort Chou, Hsiu-Ling
collection PubMed
description BACKGROUND: Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and logistic regression (LR), and two composite models, DT-ANN and DT-LR. Four breast cancer microarray datasets from the Gene Expression Omnibus were pooled to predict five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann–Whitney U test and 20-fold cross-validation were used to investigate candidate genes with the 100 most significant p-values. The predictive power of the DT, LR and ANN models was assessed using accuracy and the area under the ROC curve. The associated genes were evaluated using Cox regression. RESULTS: The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression with a 3.53-fold (95% CI: 2.24-5.58) increased risk of breast cancer five-year recurrence… CONCLUSIONS: The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists who understand the gene expression profiles associated with breast cancer recurrence can offer this genetic information to patients.
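The description above names the Mann–Whitney U test for screening genes and the area under the ROC curve for scoring models. The two are closely related: for a single variable, the AUC equals U divided by the number of (relapse, no-relapse) pairs. The following is a minimal sketch of that relationship only, not the authors' pipeline (which also used bootstrapping and 20-fold cross-validation). The gene names, expression values and function names are hypothetical toy inputs chosen for illustration.

```python
def mann_whitney_u(positives, negatives):
    """U statistic: count of (positive, negative) pairs in which the
    positive value is larger, with ties counted as half a pair."""
    u = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u


def gene_auc(values, labels):
    """AUC of one gene as a relapse predictor: U / (n_pos * n_neg)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return mann_whitney_u(pos, neg) / (len(pos) * len(neg))


def rank_genes(expression, labels):
    """Order genes by how far their AUC lies from 0.5 (no discrimination)."""
    aucs = {gene: gene_auc(values, labels) for gene, values in expression.items()}
    order = sorted(aucs, key=lambda g: abs(aucs[g] - 0.5), reverse=True)
    return order, aucs


# Toy data (hypothetical): 1 = relapse within five years, 0 = no relapse.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_a": [5, 6, 7, 1, 2, 3],  # perfectly separates the two groups
    "gene_b": [1, 5, 2, 4, 3, 6],  # partly lower in the relapse group
    "gene_c": [3, 3, 3, 3, 3, 3],  # carries no information
}

order, aucs = rank_genes(expression, labels)
for gene in order:
    print(gene, round(aucs[gene], 3))
```

The pairwise loop is quadratic in the number of subjects, which is fine for a toy example; a rank-based formulation or a statistics library would be the usual choice for 757 subjects and 13,452 genes.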
format Online
Article
Text
id pubmed-3614553
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3614553 2013-04-05 BMC Bioinformatics Research Article BioMed Central 2013-03-19 /pmc/articles/PMC3614553/ /pubmed/23506640 http://dx.doi.org/10.1186/1471-2105-14-100 Text en Copyright © 2013 Chou et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3614553/
https://www.ncbi.nlm.nih.gov/pubmed/23506640
http://dx.doi.org/10.1186/1471-2105-14-100