Markov chain Monte Carlo for active module identification problem

Bibliographic Details
Main Authors:	Alexeev, Nikita; Isomurodov, Javlon; Sukhov, Vladimir; Korotkevich, Gennady; Sergushichev, Alexey
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7672893/ https://www.ncbi.nlm.nih.gov/pubmed/33203350 http://dx.doi.org/10.1186/s12859-020-03572-9

collection	PubMed
description	BACKGROUND: Integrative network methods are commonly used to interpret high-throughput experimental biological data such as transcriptomics, proteomics and metabolomics. One common approach is to find a connected subnetwork of a global interaction network that best encompasses the significant individual changes in the data and represents a so-called active module. Methods implementing this approach usually find a single subnetwork and thus solve a hard classification problem for vertices. This subnetwork inherently contains erroneous vertices, yet no instrument is provided to estimate the confidence level of any particular vertex inclusion. To address this issue, the current study treats the active module problem as a soft classification problem. RESULTS: We propose a method that estimates the probability that each vertex belongs to the active module, based on Markov chain Monte Carlo (MCMC) subnetwork sampling. To illustrate the performance of the method on real data, we run it on two gene expression datasets. For the first, many-replicate expression dataset we show that the proposed approach is consistent with an existing resampling-based method. On the second dataset the jackknife resampling method is inapplicable because of the small number of biological replicates, but the MCMC method can be run and shows high classification performance. CONCLUSIONS: The proposed method makes it possible to estimate the probability that an individual vertex belongs to the active module, as well as the false discovery rate (FDR) for a given set of vertices. Given the estimated probabilities, a connected subgraph can be provided in a consistent manner for any given FDR level: no vertex can disappear when the FDR level is relaxed. We show, on both simulated and real datasets, that the proposed method has good computational performance and high classification accuracy.
id	pubmed-7672893
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-7672893 2020-11-19 Markov chain Monte Carlo for active module identification problem. Alexeev, Nikita; Isomurodov, Javlon; Sukhov, Vladimir; Korotkevich, Gennady; Sergushichev, Alexey. BMC Bioinformatics. Research. BioMed Central 2020-11-18. /pmc/articles/PMC7672893/ /pubmed/33203350 http://dx.doi.org/10.1186/s12859-020-03572-9 Text en. © The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
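The abstract in the description field says that per-vertex membership probabilities yield an FDR estimate for any vertex set, and that the selected set only grows as the FDR level is relaxed. The sketch below illustrates one way this can work. It rests on assumptions that are not stated in this record: the FDR of a set is estimated as the average of (1 - probability) over its vertices, and vertices are added in order of decreasing probability. The function names and vertex labels are hypothetical, and the sketch ignores the connectivity requirement that the published method satisfies.

```python
from typing import Dict, List


def estimated_fdr(probs: Dict[str, float], vertices: List[str]) -> float:
    """Expected fraction of false inclusions in `vertices`.

    Assumed estimator: mean of (1 - membership probability).
    """
    if not vertices:
        return 0.0
    return sum(1.0 - probs[v] for v in vertices) / len(vertices)


def select_at_fdr(probs: Dict[str, float], level: float) -> List[str]:
    """Longest top-probability prefix whose estimated FDR is within `level`."""
    # Rank by decreasing probability; ties are broken by vertex name.
    ranked = sorted(probs, key=lambda v: (-probs[v], v))
    chosen: List[str] = []
    error_sum = 0.0
    for v in ranked:
        error_sum += 1.0 - probs[v]
        # The running mean never decreases along the ranking,
        # so the first failure ends the search.
        if error_sum / (len(chosen) + 1) > level:
            break
        chosen.append(v)
    return chosen
```

Because the ranking is fixed and the running mean of (1 - probability) is non-decreasing along it, the set returned for a stricter level is always a prefix of the set returned for a looser one, which mirrors the "no vertex can disappear" property. Guaranteeing a connected subgraph would need an additional step that is not shown here.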
