SMITE: an R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information
BACKGROUND: The molecular assays that test gene expression, transcriptional, and epigenetic regulation are increasingly diverse and numerous. The information generated by each type of assay individually gives an insight into the state of the cells tested. What should be possible is to add the inform...
Main authors: | Wijetunga, N. Ari; Johnston, Andrew D.; Maekawa, Ryo; Delahaye, Fabien; Ulahannan, Netha; Kim, Kami; Greally, John M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5242055/ https://www.ncbi.nlm.nih.gov/pubmed/28100166 http://dx.doi.org/10.1186/s12859-017-1477-3 |
_version_ | 1782496291338584064 |
---|---|
author | Wijetunga, N. Ari Johnston, Andrew D. Maekawa, Ryo Delahaye, Fabien Ulahannan, Netha Kim, Kami Greally, John M. |
author_sort | Wijetunga, N. Ari |
collection | PubMed |
description | BACKGROUND: The molecular assays that test gene expression, transcriptional, and epigenetic regulation are increasingly diverse and numerous. The information generated by each type of assay individually gives an insight into the state of the cells tested. What should be possible is to add the information derived from separate, complementary assays to gain higher-confidence insights into cellular states. At present, the analysis of multi-dimensional, massive genome-wide data requires an initial pruning step to create manageable subsets of observations that are then used for integration, which decreases the sizes of the intersecting data sets and the potential for biological insights. Our Significance-based Modules Integrating the Transcriptome and Epigenome (SMITE) approach was developed to integrate transcriptional and epigenetic regulatory data without a loss of resolution. RESULTS: SMITE combines p-values by accounting for the correlation between non-independent values within data sets, allowing genes and gene modules in an interaction network to be assigned significance values. The contribution of each type of genomic data can be weighted, permitting integration of individually under-powered data sets, increasing the overall ability to detect effects within modules of genes. We apply SMITE to a complex genomic data set including the epigenomic and transcriptomic effects of Toxoplasma gondii infection on human host cells and demonstrate that SMITE is able to identify novel subnetworks of dysregulated genes. Additionally, we show that SMITE outperforms Functional Epigenetic Modules (FEM), the current paradigm of using the spin-glass algorithm to integrate gene expression and epigenetic data. CONCLUSIONS: SMITE represents a flexible, scalable tool that allows integration of transcriptional and epigenetic regulatory data from genome-wide assays to boost confidence in finding gene modules reflecting altered cellular states. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1477-3) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5242055 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-52420552017-01-23 SMITE: an R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information Wijetunga, N. Ari Johnston, Andrew D. Maekawa, Ryo Delahaye, Fabien Ulahannan, Netha Kim, Kami Greally, John M. BMC Bioinformatics Software BACKGROUND: The molecular assays that test gene expression, transcriptional, and epigenetic regulation are increasingly diverse and numerous. The information generated by each type of assay individually gives an insight into the state of the cells tested. What should be possible is to add the information derived from separate, complementary assays to gain higher-confidence insights into cellular states. At present, the analysis of multi-dimensional, massive genome-wide data requires an initial pruning step to create manageable subsets of observations that are then used for integration, which decreases the sizes of the intersecting data sets and the potential for biological insights. Our Significance-based Modules Integrating the Transcriptome and Epigenome (SMITE) approach was developed to integrate transcriptional and epigenetic regulatory data without a loss of resolution. RESULTS: SMITE combines p-values by accounting for the correlation between non-independent values within data sets, allowing genes and gene modules in an interaction network to be assigned significance values. The contribution of each type of genomic data can be weighted, permitting integration of individually under-powered data sets, increasing the overall ability to detect effects within modules of genes. We apply SMITE to a complex genomic data set including the epigenomic and transcriptomic effects of Toxoplasma gondii infection on human host cells and demonstrate that SMITE is able to identify novel subnetworks of dysregulated genes. Additionally, we show that SMITE outperforms Functional Epigenetic Modules (FEM), the current paradigm of using the spin-glass algorithm to integrate gene expression and epigenetic data. CONCLUSIONS: SMITE represents a flexible, scalable tool that allows integration of transcriptional and epigenetic regulatory data from genome-wide assays to boost confidence in finding gene modules reflecting altered cellular states. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1477-3) contains supplementary material, which is available to authorized users. BioMed Central 2017-01-18 /pmc/articles/PMC5242055/ /pubmed/28100166 http://dx.doi.org/10.1186/s12859-017-1477-3 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | SMITE: an R/Bioconductor package that identifies network modules by integrating genomic and epigenomic information |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5242055/ https://www.ncbi.nlm.nih.gov/pubmed/28100166 http://dx.doi.org/10.1186/s12859-017-1477-3 |