Comparative evaluation of network features for the prediction of breast cancer metastasis
BACKGROUND: Discovering a highly accurate and robust gene signature for the prediction of breast cancer metastasis from gene expression profiling of primary tumors is one of the most challenging tasks to reduce the number of deaths in women. Due to the limited success of gene-based features in achie...
Main authors: | Adnan, Nahim; Liu, Zhijie; Huang, Tim H.M.; Ruan, Jianhua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
BioMed Central
2020
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7119280/ https://www.ncbi.nlm.nih.gov/pubmed/32241278 http://dx.doi.org/10.1186/s12920-020-0676-3 |
_version_ | 1783514740292583424 |
---|---|
author | Adnan, Nahim Liu, Zhijie Huang, Tim H.M. Ruan, Jianhua |
author_facet | Adnan, Nahim Liu, Zhijie Huang, Tim H.M. Ruan, Jianhua |
author_sort | Adnan, Nahim |
collection | PubMed |
description | BACKGROUND: Discovering a highly accurate and robust gene signature for the prediction of breast cancer metastasis from gene expression profiling of primary tumors is one of the most challenging tasks to reduce the number of deaths in women. Due to the limited success of gene-based features in achieving satisfactory prediction accuracy, many methodologies have been proposed in recent years to develop network-based features by integrating network information with gene expression. However, evaluation results are too inconsistent to confirm the effectiveness of network-based features, because of the many confounding factors involved in the classification model learning process, such as data normalization, dimension reduction, and feature selection. An unbiased comparative evaluation is essential for uncovering the strength of network-based features. METHODS: In this study, we compared several types of network-based features obtained using different mathematical operators (Mean, Maximum, Minimum, Median, Variance) on a geneset (i.e., a gene and its neighbors in the network) in a protein-protein interaction network and a gene co-expression network for their ability to predict breast cancer metastasis using gene expression data from more than 10 patient cohorts. RESULTS: While network-based features are usually statistically more significant than gene-based features, a consistent improvement of prediction performance using network-based features requires a substantial number of patients in the dataset. Contrary to many previous reports, no evidence was found to support the robustness of network-based features, and we argue some of the robustness may be due to the inherent bias associated with node degree in the network. In addition, different types of network features seem to cover different pathways and are complementary to each other. Consequently, an ensemble classifier combining different network features was proposed and was found to significantly outperform classifiers based on gene-based features or any single type of network-based feature. CONCLUSIONS: Network-based features and their combination show promise for improving the prediction of breast cancer metastasis but may require a large amount of training data. Robustness claims for network-based features need to be re-examined with network node degree and other confounding factors taken into consideration. |
format | Online Article Text |
id | pubmed-7119280 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-71192802020-04-07 Comparative evaluation of network features for the prediction of breast cancer metastasis Adnan, Nahim Liu, Zhijie Huang, Tim H.M. Ruan, Jianhua BMC Med Genomics Research BACKGROUND: Discovering a highly accurate and robust gene signature for the prediction of breast cancer metastasis from gene expression profiling of primary tumors is one of the most challenging tasks to reduce the number of deaths in women. Due to the limited success of gene-based features in achieving satisfactory prediction accuracy, many methodologies have been proposed in recent years to develop network-based features by integrating network information with gene expression. However, evaluation results are inconsistent to confirm the effectiveness of network-based features, because of many confounding factors involved in classification model learning process, such as data normalization, dimension reduction, and feature selection. An unbiased comparative evaluation is essential for uncovering the strength of network-based features. METHODS: In this study, we compared several types of network-based features obtained using different mathematical operators (Mean, Maximum, Minimum, Median, Variance) on geneset (i.e., a gene and its’ neighbors in the network) in protein-protein interaction network and gene co-expression network for their ability in predicting breast cancer metastasis using gene expression data from more than 10 patient cohorts. RESULTS: While network-based features are usually statistically more significant than gene-based feature, a consistent improvement of prediction performance using network-based features requires a substantial number of patients in the dataset. In contrary to many previous reports, no evidence was found to support the robustness of network-based features and we argue some of the robustness may be due to the inherent bias associated with node degree in the network. 
In addition, different types of network features seem to cover different pathways and are complementary to each other. Consequently, an ensemble classifier combining different network features was proposed and was found to significantly outperform classifiers based on gene-based feature or any single type of network-based features. CONCLUSIONS: Network-based features and their combination show promise for improving the prediction of breast cancer metastasis but may require a large amount of training data. Robustness claim of network-based features needs to be re-examined with network node degree and other confounding factors in consideration. BioMed Central 2020-04-03 /pmc/articles/PMC7119280/ /pubmed/32241278 http://dx.doi.org/10.1186/s12920-020-0676-3 Text en © The Author(s) 2020 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Adnan, Nahim Liu, Zhijie Huang, Tim H.M. Ruan, Jianhua Comparative evaluation of network features for the prediction of breast cancer metastasis |
title | Comparative evaluation of network features for the prediction of breast cancer metastasis |
title_full | Comparative evaluation of network features for the prediction of breast cancer metastasis |
title_fullStr | Comparative evaluation of network features for the prediction of breast cancer metastasis |
title_full_unstemmed | Comparative evaluation of network features for the prediction of breast cancer metastasis |
title_short | Comparative evaluation of network features for the prediction of breast cancer metastasis |
title_sort | comparative evaluation of network features for the prediction of breast cancer metastasis |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7119280/ https://www.ncbi.nlm.nih.gov/pubmed/32241278 http://dx.doi.org/10.1186/s12920-020-0676-3 |
work_keys_str_mv | AT adnannahim comparativeevaluationofnetworkfeaturesforthepredictionofbreastcancermetastasis AT liuzhijie comparativeevaluationofnetworkfeaturesforthepredictionofbreastcancermetastasis AT huangtimhm comparativeevaluationofnetworkfeaturesforthepredictionofbreastcancermetastasis AT ruanjianhua comparativeevaluationofnetworkfeaturesforthepredictionofbreastcancermetastasis |
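
The METHODS summary in the description field describes network-based features obtained by applying an operator (Mean, Maximum, Minimum, Median, Variance) to a geneset, that is, a gene together with its neighbors in the network. A minimal Python sketch of that idea follows. The function name `network_features`, the dictionary-based data layout, the toy values, and the use of population variance are all assumptions made for illustration; none of them is taken from the article.

```python
from statistics import mean, median, pvariance

# Operators named in the abstract. Using population variance here is an
# assumption; the record does not say which variance definition was used.
OPERATORS = {
    "mean": mean,
    "max": max,
    "min": min,
    "median": median,
    "variance": pvariance,
}


def network_features(expression, neighbors, operator):
    """Aggregate expression over each gene's geneset (gene + neighbors).

    expression: dict mapping gene -> list of per-patient expression values
    neighbors:  dict mapping gene -> iterable of neighboring genes
    operator:   one of the keys in OPERATORS
    Returns a dict mapping gene -> list of per-patient feature values.
    (Hypothetical layout, chosen only to keep the sketch self-contained.)
    """
    op = OPERATORS[operator]
    features = {}
    for gene, nbrs in neighbors.items():
        if gene not in expression:
            continue  # no expression measured for the central gene
        # Geneset = the gene itself plus neighbors that have expression data.
        geneset = [gene] + [g for g in nbrs if g in expression and g != gene]
        n_patients = len(expression[gene])
        features[gene] = [
            op([expression[g][p] for g in geneset]) for p in range(n_patients)
        ]
    return features


# Toy data: three genes, two patients (values are made up).
expression = {"A": [1.0, 4.0], "B": [3.0, 2.0], "C": [5.0, 0.0]}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

mean_features = network_features(expression, neighbors, "mean")
max_features = network_features(expression, neighbors, "max")
```

Each operator yields one feature matrix with the same shape as the original expression data, so the resulting feature types can be fed to separate classifiers and compared, or combined as the abstract's ensemble suggests.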