
A Hybrid One-Way ANOVA Approach for the Robust and Efficient Estimation of Differential Gene Expression with Multiple Patterns

BACKGROUND: Identifying genes that are differentially expressed (DE) between two or more conditions with multiple patterns of expression is one of the primary objectives of gene expression data analysis. Several statistical approaches, including one-way analysis of variance (ANOVA), are used to iden...


Bibliographic Details
Main authors: Mollah, Mohammad Manir Hossain, Jamal, Rahman, Mokhtar, Norfilza Mohd, Harun, Roslan, Mollah, Md. Nurul Haque
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4587675/
https://www.ncbi.nlm.nih.gov/pubmed/26413858
http://dx.doi.org/10.1371/journal.pone.0138810
author Mollah, Mohammad Manir Hossain
Jamal, Rahman
Mokhtar, Norfilza Mohd
Harun, Roslan
Mollah, Md. Nurul Haque
collection PubMed
description BACKGROUND: Identifying genes that are differentially expressed (DE) between two or more conditions with multiple patterns of expression is one of the primary objectives of gene expression data analysis. Several statistical approaches, including one-way analysis of variance (ANOVA), are used to identify DE genes. However, most of these methods provide misleading results for two or more conditions with multiple patterns of expression in the presence of outlying genes. In this paper, an attempt is made to develop a hybrid one-way ANOVA approach that unifies the robustness and efficiency of estimation using the minimum β-divergence method to overcome some problems that arise in the existing robust methods for both small- and large-sample cases with multiple patterns of expression. RESULTS: The proposed method relies on a β-weight function, which produces values between 0 and 1. The β-weight function with β = 0.2 is used as a measure of outlier detection. It assigns smaller weights (≥ 0) to outlying expressions and larger weights (≤ 1) to typical expressions. The distribution of the β-weights is used to calculate the cut-off point, which is compared to the observed β-weight of an expression to determine whether that gene expression is an outlier. This weight function plays a key role in unifying the robustness and efficiency of estimation in one-way ANOVA. CONCLUSION: Analyses of simulated gene expression profiles revealed that all eight methods (ANOVA, SAM, LIMMA, EBarrays, eLNN, KW, robust BetaEB and proposed) perform almost identically for m = 2 conditions in the absence of outliers. However, the robust BetaEB method and the proposed method exhibited considerably better performance than the other six methods in the presence of outliers. 
In this case, the BetaEB method exhibited slightly better performance than the proposed method for the small-sample cases, but the proposed method exhibited much better performance than the BetaEB method for both the small- and large-sample cases in the presence of more than 50% outlying genes. The proposed method also exhibited better performance than the other methods for m > 2 conditions with multiple patterns of expression, a setting to which the BetaEB method has not been extended. Therefore, the proposed approach would be more suitable and reliable on average for the identification of DE genes between two or more conditions with multiple patterns of expression.
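The β-weight idea in the abstract above can be illustrated with a minimal sketch. This record does not give the weight formula or the rule for deriving the cut-off, so everything below is an assumption: the Gaussian form exp(-β(x - μ)² / (2σ²)) commonly used in minimum β-divergence estimation, the iteratively reweighted mean and variance updates, the function names `beta_weights` and `min_beta_fit`, the toy data, and the fixed cut-off of 0.2, which stands in for the distribution-based cut-off the authors describe. Only β = 0.2 and the behaviour of the weights (near 0 for outlying expressions, near 1 for typical ones) come from the abstract.

```python
import numpy as np

def beta_weights(x, mu, sigma2, beta=0.2):
    """Assumed Gaussian beta-weight: 1 at x == mu, decaying toward 0 far from mu."""
    x = np.asarray(x, dtype=float)
    return np.exp(-beta * (x - mu) ** 2 / (2.0 * sigma2))

def min_beta_fit(x, beta=0.2, n_iter=200, tol=1e-10):
    """Iteratively reweighted mean and variance (assumed update equations)."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    # Robust starting scale: squared scaled MAD, or 1.0 if the MAD is zero.
    mad = np.median(np.abs(x - mu))
    sigma2 = (1.4826 * mad) ** 2 if mad > 0 else 1.0
    for _ in range(n_iter):
        w = beta_weights(x, mu, sigma2, beta)
        mu_new = np.sum(w * x) / np.sum(w)
        # The (1 + beta) factor offsets the variance shrinkage that the
        # weights introduce when the data are Gaussian.
        sigma2_new = (1.0 + beta) * np.sum(w * (x - mu_new) ** 2) / np.sum(w)
        sigma2_new = max(sigma2_new, 1e-12)  # guard against collapse to zero
        converged = abs(mu_new - mu) < tol and abs(sigma2_new - sigma2) < tol
        mu, sigma2 = mu_new, sigma2_new
        if converged:
            break
    return mu, sigma2

# Toy expression values for one gene in one condition; the last is an outlier.
expr = np.array([5.0, 5.1, 4.9, 5.2, 4.8, 5.05, 4.95, 25.0])
mu_hat, sigma2_hat = min_beta_fit(expr)
weights = beta_weights(expr, mu_hat, sigma2_hat)

cutoff = 0.2  # illustrative only; not the authors' distribution-based cut-off
is_outlier = weights < cutoff
print(np.round(weights, 3), is_outlier)
```

Because the outlying value receives a weight close to 0, it contributes almost nothing to the weighted mean and variance, while typical values keep weights close to 1 and the estimates stay close to the ordinary ones. That is the sense in which such a weight can combine robustness with efficiency.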
format Online
Article
Text
id pubmed-4587675
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4587675 2015-10-01 PLoS One Research Article Public Library of Science 2015-09-28 /pmc/articles/PMC4587675/ /pubmed/26413858 http://dx.doi.org/10.1371/journal.pone.0138810 Text en © 2015 Mollah et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Hybrid One-Way ANOVA Approach for the Robust and Efficient Estimation of Differential Gene Expression with Multiple Patterns
topic Research Article