
A Robust Approach for Identification of Cancer Biomarkers and Candidate Drugs

Background and objectives: Identifying cancer biomarkers that are differentially expressed (DE) between two biological conditions is an important task in many microarray studies. Several methods exist in the literature for this purpose, and most of them are designed especially for unp...


Bibliographic Details
Main authors:	Shahjaman, Md., Rahman, Md. Rezanur, Islam, S. M. Shahinul, Mollah, Md. Nurul Haque
Format:	Online Article Text
Language:	English
Published:	MDPI 2019
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6631768/ https://www.ncbi.nlm.nih.gov/pubmed/31212673 http://dx.doi.org/10.3390/medicina55060269

_version_	1783435595539808256
author	Shahjaman, Md. Rahman, Md. Rezanur Islam, S. M. Shahinul Mollah, Md. Nurul Haque
author_facet	Shahjaman, Md. Rahman, Md. Rezanur Islam, S. M. Shahinul Mollah, Md. Nurul Haque
author_sort	Shahjaman, Md.
collection	PubMed
description	Background and objectives: Identifying cancer biomarkers that are differentially expressed (DE) between two biological conditions is an important task in many microarray studies. Several methods exist in the literature for this purpose, but most are designed for unpaired samples and are not suitable for paired samples. Furthermore, traditional methods use either p-values or fold change (FC) values to detect DE genes. However, p-value based results sometimes disagree with FC based results because of a small pooled variance of gene expressions, which occurs when the variance within each individual condition is small. Some methods combine both p-values and FC values to solve this problem, but they also perform poorly for small samples in the presence of outlying expressions. To overcome this problem, this paper proposes a hybrid robust SAM-FC approach, designed for paired samples, that combines the rank of the FC values with the rank of the p-values computed by the SAM statistic using the minimum β-divergence method. Materials and Methods: The proposed method introduces a weight function known as the β-weight function. This function produces larger weights for usual expressions and smaller weights for unusual expressions, and it plays a significant role in the performance of the proposed method. The proposed method uses the β-weight function as a measure for outlier detection, setting β = 0.2. We unify classical and robust estimates using the β-weight function, such that maximum likelihood estimators (MLEs) are used in the absence of outliers and minimum β-divergence estimators are used in the presence of outliers, to obtain reasonable p-values and FC values.
Results: We compared the performance of the proposed method with some popular methods (t-test, SAM, LIMMA, Wilcoxon, WAD, RP, and FCROS) using both simulated and real gene expression profiles for both small and large sample cases. From the simulation and real spike-in data analysis results, we observed that the proposed method outperforms the other methods for small sample cases in the presence of outliers and otherwise performs almost equally to the other robust methods (Wilcoxon, RP, and FCROS). From the head and neck cancer (HNC) gene expression dataset, the proposed method identified two additional genes (CYP3A4 and NOVA1) that are significantly enriched in linoleic acid metabolism, drug metabolism, steroid hormone biosynthesis and metabolic pathways. Survival analysis using Kaplan–Meier curves revealed that the combined effect of these two genes has prognostic capability and that they might be promising biomarkers of HNC. Moreover, we retrieved 12 candidate drugs based on gene interactions from GLAD4U and DrugBank literature-based gene associations. Conclusions: Using pathway analysis, a disease association study, protein–protein interactions and survival analysis, we found that the two additional genes we propose might be involved in critical pathways of cancer. Furthermore, the identified drugs showed statistical significance, which indicates that the proteins associated with these genes might be therapeutic targets in cancer.
format	Online Article Text
id	pubmed-6631768
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-6631768 2019-08-19 A Robust Approach for Identification of Cancer Biomarkers and Candidate Drugs Shahjaman, Md. Rahman, Md. Rezanur Islam, S. M. Shahinul Mollah, Md. Nurul Haque Medicina (Kaunas) Article MDPI 2019-06-11 /pmc/articles/PMC6631768/ /pubmed/31212673 http://dx.doi.org/10.3390/medicina55060269 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	A Robust Approach for Identification of Cancer Biomarkers and Candidate Drugs
title_sort	robust approach for identification of cancer biomarkers and candidate drugs
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6631768/ https://www.ncbi.nlm.nih.gov/pubmed/31212673 http://dx.doi.org/10.3390/medicina55060269
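The abstract in this record names two ingredients without giving formulas: a β-weight function (with β = 0.2) that down-weights unusual expressions, and a score that combines the rank of the FC values with the rank of the p-values. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It assumes the Gaussian form of the β-weight, w = exp(-β (x - μ)² / (2σ²)), with iteratively reweighted mean and variance updates, and it assumes the two ranks are combined by a simple average. The function names (`beta_weights`, `min_beta_estimates`, `combined_rank_score`), the median/MAD starting values, and the tie handling are all hypothetical choices made for this example.

```python
import math
import statistics

BETA = 0.2  # tuning value stated in the abstract


def beta_weights(x, mu, var, beta=BETA):
    """Assumed Gaussian-form beta-weights: close to 1 near mu, close to 0 far from it."""
    return [math.exp(-beta * (xi - mu) ** 2 / (2.0 * var)) for xi in x]


def min_beta_estimates(x, beta=BETA, n_iter=200, tol=1e-12):
    """Iteratively reweighted mean and variance (assumed update rule).

    Starts from the median and the scaled median absolute deviation
    (an illustrative choice, not taken from the abstract).
    """
    mu = statistics.median(x)
    mad = statistics.median([abs(xi - mu) for xi in x])
    var = max((1.4826 * mad) ** 2, 1e-12)
    for _ in range(n_iter):
        w = beta_weights(x, mu, var, beta)
        sw = sum(w)
        if sw == 0.0:  # every weight underflowed; keep the current estimates
            break
        new_mu = sum(wi * xi for wi, xi in zip(w, x)) / sw
        new_var = (1.0 + beta) * sum(
            wi * (xi - new_mu) ** 2 for wi, xi in zip(w, x)
        ) / sw
        new_var = max(new_var, 1e-12)
        converged = abs(new_mu - mu) < tol and abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if converged:
            break
    return mu, var, beta_weights(x, mu, var, beta)


def ranks(values, descending=False):
    """1-based ranks; ties are broken by position (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def combined_rank_score(p_values, abs_log_fc):
    """Average of the p-value rank (smallest first) and the |log FC| rank (largest first)."""
    rank_p = ranks(p_values)
    rank_fc = ranks(abs_log_fc, descending=True)
    return [(a + b) / 2.0 for a, b in zip(rank_p, rank_fc)]


if __name__ == "__main__":
    # Paired differences for one hypothetical gene; the last value is an outlier.
    diffs = [0.1, -0.2, 0.05, 0.15, -0.1, 8.0]
    mu, var, w = min_beta_estimates(diffs)
    print("robust mean:", mu, "robust variance:", var)
    print("weights:", w)
    print(combined_rank_score([0.001, 0.2, 0.03], [2.5, 0.1, 1.0]))
```

In this sketch the outlying value receives a weight near zero, so it barely influences the estimated mean and variance, while the remaining values keep weights near one. That mirrors the behaviour the abstract describes, in which the estimates stay close to the classical ones when no outliers are present.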

