
A novel Markov Blanket-based repeated-fishing strategy for capturing phenotype-related biomarkers in big omics data

BACKGROUND: We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) while maintaining its advantages in omics data analysis. RESULTS: Both simulations and real data analyses were conducted to assess its performa...


Bibliographic Details
Main authors: Li, Hongkai, Yuan, Zhongshang, Ji, Jiadong, Xu, Jing, Zhang, Tao, Zhang, Xiaoshuai, Xue, Fuzhong
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4784463/
https://www.ncbi.nlm.nih.gov/pubmed/26957081
http://dx.doi.org/10.1186/s12863-016-0358-5
_version_ 1782420271886499840
author Li, Hongkai
Yuan, Zhongshang
Ji, Jiadong
Xu, Jing
Zhang, Tao
Zhang, Xiaoshuai
Xue, Fuzhong
collection PubMed
description BACKGROUND: We propose a novel Markov Blanket-based repeated-fishing strategy (MBRFS) in an attempt to increase the power of the existing Markov Blanket method (DASSO-MB) while maintaining its advantages in omics data analysis. RESULTS: Both simulations and real data analyses were conducted to assess its performance against other methods, including the χ² test with Bonferroni and B-H adjustment, the least absolute shrinkage and selection operator (LASSO), and DASSO-MB. A series of simulation studies showed that the true discovery rate (TDR) of the proposed MBRFS was always close to zero under the null hypothesis (odds ratio = 1 for each SNP), with excellent stability in all three scenarios: independent phenotype-related SNPs without linkage disequilibrium (LD) around them, correlated phenotype-related SNPs without LD around them, and phenotype-related SNPs with strong LD around them. As expected, under different odds ratios and minor allele frequencies (MAFs), MBRFS always performed best in capturing the true phenotype-related biomarkers, with a higher Matthews correlation coefficient (MCC) in all three scenarios. More importantly, because the proposed MBRFS uses the repeated-fishing strategy, it still captures more phenotype-related SNPs with minor effects when phenotype-related SNPs are non-significant under the χ² test after Bonferroni multiple correction. Analyses of various real omics data, including GWAS, DNA methylation, gene expression, and metabolite data, indicated that the proposed MBRFS always detected relatively reasonable biomarkers. CONCLUSIONS: Our proposed MBRFS can accurately capture the true phenotype-related biomarkers with a reduced false negative rate when the phenotype-related biomarkers are independent or correlated, as well as when phenotype-related biomarkers are associated with non-phenotype-related ones.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12863-016-0358-5) contains supplementary material, which is available to authorized users.
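The abstract compares methods using Bonferroni and B-H adjusted tests and scores them with the Matthews correlation coefficient (MCC). As a rough illustration of those comparators and that metric only (this is not the MBRFS or DASSO-MB algorithm, which the record does not describe), a minimal Python sketch might look as follows. The function names, the significance level of 0.05, and the p-values are hypothetical and chosen purely for demonstration.

```python
from math import sqrt


def bonferroni_reject(p_values, alpha=0.05):
    """Reject every test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max smallest p-values.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


def matthews_corrcoef(truth, selected):
    """MCC between true marker status and the selected set."""
    tp = sum(t and s for t, s in zip(truth, selected))
    tn = sum((not t) and (not s) for t, s in zip(truth, selected))
    fp = sum((not t) and s for t, s in zip(truth, selected))
    fn = sum(t and (not s) for t, s in zip(truth, selected))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical p-values for six SNPs; the first three are assumed
# to be truly phenotype-related (made-up data, not from the article).
p_values = [0.001, 0.012, 0.014, 0.04, 0.30, 0.80]
truth = [True, True, True, False, False, False]

bonf = bonferroni_reject(p_values)
bh = benjamini_hochberg_reject(p_values)
print("Bonferroni:", bonf, "MCC =", matthews_corrcoef(truth, bonf))
print("B-H:       ", bh, "MCC =", matthews_corrcoef(truth, bh))
```

In this toy input, the stricter Bonferroni threshold misses two of the three assumed true markers, which is the kind of false negative the abstract says the repeated-fishing strategy is meant to reduce.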
format Online
Article
Text
id pubmed-4784463
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4784463 2016-03-10 BMC Genet Methodology Article BioMed Central 2016-03-09 /pmc/articles/PMC4784463/ /pubmed/26957081 http://dx.doi.org/10.1186/s12863-016-0358-5 Text en © Li et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A novel Markov Blanket-based repeated-fishing strategy for capturing phenotype-related biomarkers in big omics data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4784463/
https://www.ncbi.nlm.nih.gov/pubmed/26957081
http://dx.doi.org/10.1186/s12863-016-0358-5