A novel Markov Blanket-based repeated-fishing strategy for capturing phenotype-related biomarkers in big omics data
BACKGROUND: We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) while maintaining its advantages in omics data analysis. RESULTS: Both simulation and real data analysis were conducted to assess its performa...
Main Authors: | Li, Hongkai; Yuan, Zhongshang; Ji, Jiadong; Xu, Jing; Zhang, Tao; Zhang, Xiaoshuai; Xue, Fuzhong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4784463/ https://www.ncbi.nlm.nih.gov/pubmed/26957081 http://dx.doi.org/10.1186/s12863-016-0358-5 |
_version_ | 1782420271886499840 |
---|---|
author | Li, Hongkai; Yuan, Zhongshang; Ji, Jiadong; Xu, Jing; Zhang, Tao; Zhang, Xiaoshuai; Xue, Fuzhong |
author_facet | Li, Hongkai; Yuan, Zhongshang; Ji, Jiadong; Xu, Jing; Zhang, Tao; Zhang, Xiaoshuai; Xue, Fuzhong |
author_sort | Li, Hongkai |
collection | PubMed |
description | BACKGROUND: We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) while maintaining its advantages in omics data analysis. RESULTS: Both simulation and real data analysis were conducted to assess its performance against other methods, including the χ² test with Bonferroni and B-H adjustment, the least absolute shrinkage and selection operator (LASSO), and DASSO-MB. A series of simulation studies showed that the true discovery rate (TDR) of the proposed MBRFS was always close to zero under the null hypothesis (odds ratio = 1 for each SNP), with excellent stability in all three scenarios: independent phenotype-related SNPs without linkage disequilibrium (LD) around them, correlated phenotype-related SNPs without LD around them, and phenotype-related SNPs with strong LD around them. As expected, under different odds ratios and minor allele frequencies (MAFs), MBRFS always performed best in capturing the true phenotype-related biomarkers, with a higher Matthews correlation coefficient (MCC) in all three scenarios above. More importantly, because the proposed MBRFS uses the repeated-fishing strategy, it still captures more phenotype-related SNPs with minor effects, even when such SNPs are non-significant under the χ² test after Bonferroni multiple-testing correction. Analyses of various real omics data sets, including GWAS data, DNA methylation data, gene expression data and metabolite data, indicated that the proposed MBRFS always detected relatively reasonable biomarkers. CONCLUSIONS: Our proposed MBRFS can accurately capture the true phenotype-related biomarkers with a reduced false negative rate, both when the phenotype-related biomarkers are independent or correlated and when they are associated with non-phenotype-related ones.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12863-016-0358-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4784463 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4784463 2016-03-10 A novel Markov Blanket-based repeated-fishing strategy for capturing phenotype-related biomarkers in big omics data Li, Hongkai; Yuan, Zhongshang; Ji, Jiadong; Xu, Jing; Zhang, Tao; Zhang, Xiaoshuai; Xue, Fuzhong BMC Genet Methodology Article BioMed Central 2016-03-09 /pmc/articles/PMC4784463/ /pubmed/26957081 http://dx.doi.org/10.1186/s12863-016-0358-5 Text en © Li et al. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Li, Hongkai Yuan, Zhongshang Ji, Jiadong Xu, Jing Zhang, Tao Zhang, Xiaoshuai Xue, Fuzhong A novel Markov Blanket-based repeated-fishing strategy for capturing phenotype-related biomarkers in big omics data |
title | A novel Markov Blanket-based repeated-fishing strategy for capturing phenotype-related biomarkers in big omics data |
title_sort | novel markov blanket-based repeated-fishing strategy for capturing phenotype-related biomarkers in big omics data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4784463/ https://www.ncbi.nlm.nih.gov/pubmed/26957081 http://dx.doi.org/10.1186/s12863-016-0358-5 |
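
The abstract compares MBRFS against a χ² test with Bonferroni and B-H adjustment and scores the methods by the Matthews correlation coefficient (MCC). The following is a minimal sketch of those baseline pieces only, not of MBRFS or DASSO-MB, whose details this record does not give. The function names, the 2x2 allele-count layout, and the toy p-values are assumptions made for illustration.

```python
import math


def chi2_pvalue_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction)
    for the 2x2 table [[a, b], [c, d]], e.g. allele counts in
    cases (row 1) and controls (row 2). Hypothetical helper."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: no evidence of association
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2))


def bonferroni(pvalues, alpha=0.05):
    """Select tests with p <= alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """B-H step-up rule: find the largest rank k with
    p_(k) <= (k / m) * alpha and select the k smallest p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * alpha:
            cutoff = rank
    selected = [False] * m
    for i in order[:cutoff]:
        selected[i] = True
    return selected


def mcc(selected, truth):
    """Matthews correlation coefficient of a selection against
    the known set of truly phenotype-related markers."""
    tp = sum(1 for s, t in zip(selected, truth) if s and t)
    tn = sum(1 for s, t in zip(selected, truth) if not s and not t)
    fp = sum(1 for s, t in zip(selected, truth) if s and not t)
    fn = sum(1 for s, t in zip(selected, truth) if not s and t)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy p-values (invented for illustration): the first three markers
# are treated as truly phenotype-related, the last two as null.
toy_pvalues = [0.001, 0.012, 0.025, 0.2, 0.6]
toy_truth = [True, True, True, False, False]

bonf_selected = bonferroni(toy_pvalues)
bh_selected = benjamini_hochberg(toy_pvalues)
print("Bonferroni:", bonf_selected, "MCC =", round(mcc(bonf_selected, toy_truth), 3))
print("B-H:       ", bh_selected, "MCC =", round(mcc(bh_selected, toy_truth), 3))
```

In this toy input, Bonferroni keeps only the strongest marker while B-H keeps all three true ones, which illustrates the loss of minor-effect markers under strict correction that the abstract refers to. Real analyses would normally rely on an established statistics library rather than these hand-written helpers.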