Independent component analysis recovers consistent regulatory signals from disparate datasets

The availability of bacterial transcriptomes has dramatically increased in recent years. This data deluge could result in detailed inference of underlying regulatory networks, but the diversity of experimental platforms and protocols introduces critical biases that could hinder scalable analysis of...

Bibliographic Details
Main authors: Sastry, Anand V.; Hu, Alyssa; Heckmann, David; Poudel, Saugat; Kavvas, Erol; Palsson, Bernhard O.
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7888660/
https://www.ncbi.nlm.nih.gov/pubmed/33529205
http://dx.doi.org/10.1371/journal.pcbi.1008647
author Sastry, Anand V.
Hu, Alyssa
Heckmann, David
Poudel, Saugat
Kavvas, Erol
Palsson, Bernhard O.
collection PubMed
description The availability of bacterial transcriptomes has dramatically increased in recent years. This data deluge could result in detailed inference of underlying regulatory networks, but the diversity of experimental platforms and protocols introduces critical biases that could hinder scalable analysis of existing data. Here, we show that the underlying structure of the E. coli transcriptome, as determined by Independent Component Analysis (ICA), is conserved across multiple independent datasets, including both RNA-seq and microarray datasets. We subsequently combined five transcriptomics datasets into a large compendium containing over 800 expression profiles and discovered that its underlying ICA-based structure was still comparable to that of the individual datasets. With this understanding, we expanded our analysis to over 3,000 E. coli expression profiles and predicted three high-impact regulons that respond to oxidative stress, anaerobiosis, and antibiotic treatment. ICA thus enables deep analysis of disparate data to uncover new insights that were not visible in the individual datasets.
format Online
Article
Text
id pubmed-7888660
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7888660 2021-02-25
PLoS Comput Biol
Research Article
Public Library of Science 2021-02-02
/pmc/articles/PMC7888660/ /pubmed/33529205 http://dx.doi.org/10.1371/journal.pcbi.1008647
Text en
© 2021 Sastry et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Independent component analysis recovers consistent regulatory signals from disparate datasets
topic Research Article