
Robust edge-based biomarker discovery improves prediction of breast cancer metastasis

BACKGROUND: The abundance of molecular profiling of breast cancer tissues has spurred active research on molecular marker-based early diagnosis of metastasis. Recently there has been a surge of interest in combining gene expression with gene networks, such as protein-protein interaction (PPI) networks, gene co-...


Bibliographic Details
Main authors: Adnan, Nahim; Lei, Chengwei; Ruan, Jianhua
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7526355/
https://www.ncbi.nlm.nih.gov/pubmed/32998692
http://dx.doi.org/10.1186/s12859-020-03692-2
author Adnan, Nahim
Lei, Chengwei
Ruan, Jianhua
collection PubMed
description BACKGROUND: The abundance of molecular profiling of breast cancer tissues has spurred active research on molecular marker-based early diagnosis of metastasis. Recently there has been a surge of interest in combining gene expression with gene networks, such as protein-protein interaction (PPI) networks, gene co-expression (CE) networks, and pathway information, to identify robust and accurate biomarkers for metastasis prediction, reflecting the common belief that cancer is a systems biology disease. However, controversy exists in the literature regarding whether network markers are indeed better features than genes alone for predicting as well as understanding metastasis. We believe many of the existing results may have been biased by overly complicated prediction algorithms, unfair evaluation, and a lack of rigorous statistics. In this study, we propose a simple approach that uses network edges as features, based on two types of networks respectively, and compare their prediction power using three classification algorithms and a rigorous statistical procedure on one of the largest datasets available. To detect biomarkers that are significant for the prediction and to compare the robustness of different feature types, we propose an unbiased and novel procedure to measure feature importance that eliminates the potential bias from factors such as sample size, number of features, and class distribution.
RESULTS: Experimental results reveal that edge-based feature types consistently outperformed the gene-based feature type in random forest and logistic regression models under all performance evaluation metrics, while the prediction accuracy of the edge-based support vector machine (SVM) model was poorer, owing to the larger number of edge features compared to gene features and the lack of feature selection in the SVM model. Experimental results also show that edge features are much more robust than gene features, and that the top biomarkers from edge feature types are statistically more significantly enriched in biological processes that are well known to be related to breast cancer metastasis.
CONCLUSIONS: Overall, this study validates the utility of edge features as biomarkers but also highlights the importance of carefully designed experimental procedures in order to achieve statistically reliable comparison results.
format Online
Article
Text
id pubmed-7526355
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7526355 2020-10-01 Robust edge-based biomarker discovery improves prediction of breast cancer metastasis Adnan, Nahim Lei, Chengwei Ruan, Jianhua BMC Bioinformatics Research BioMed Central 2020-09-30 /pmc/articles/PMC7526355/ /pubmed/32998692 http://dx.doi.org/10.1186/s12859-020-03692-2 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Robust edge-based biomarker discovery improves prediction of breast cancer metastasis
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7526355/
https://www.ncbi.nlm.nih.gov/pubmed/32998692
http://dx.doi.org/10.1186/s12859-020-03692-2