GenomicSuperSignature facilitates interpretation of RNA-seq experiments through robust, efficient comparison to public databases
Millions of transcriptomic profiles have been deposited in public archives, yet remain underused for the interpretation of new experiments. We present a method for interpreting new transcriptomic datasets through instant comparison to public datasets without high-performance computing requirements....
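Main Authors: | Oh, Sehyun; Geistlinger, Ludwig; Ramos, Marcel; Blankenberg, Daniel; van den Beek, Marius; Taroni, Jaclyn N.; Carey, Vincent J.; Greene, Casey S.; Waldron, Levi; Davis, Sean |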
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9237024/ https://www.ncbi.nlm.nih.gov/pubmed/35760813 http://dx.doi.org/10.1038/s41467-022-31411-3 |
_version_ | 1784736673790164992 |
---|---|
author | Oh, Sehyun Geistlinger, Ludwig Ramos, Marcel Blankenberg, Daniel van den Beek, Marius Taroni, Jaclyn N. Carey, Vincent J. Greene, Casey S. Waldron, Levi Davis, Sean |
author_sort | Oh, Sehyun |
collection | PubMed |
description | Millions of transcriptomic profiles have been deposited in public archives, yet remain underused for the interpretation of new experiments. We present a method for interpreting new transcriptomic datasets through instant comparison to public datasets without high-performance computing requirements. We apply Principal Component Analysis on 536 studies comprising 44,890 human RNA sequencing profiles and aggregate sufficiently similar loading vectors to form Replicable Axes of Variation (RAV). RAVs are annotated with metadata of originating studies and by gene set enrichment analysis. Functionality to associate new datasets with RAVs, extract interpretable annotations, and provide intuitive visualization are implemented as the GenomicSuperSignature R/Bioconductor package. We demonstrate the efficient and coherent database search, robustness to batch effects and heterogeneous training data, and transfer learning capacity of our method using TCGA and rare diseases datasets. GenomicSuperSignature aids in analyzing new gene expression data in the context of existing databases using minimal computing resources. |
format | Online Article Text |
id | pubmed-9237024 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9237024 2022-06-29 Nat Commun Article Nature Publishing Group UK 2022-06-27 /pmc/articles/PMC9237024/ /pubmed/35760813 http://dx.doi.org/10.1038/s41467-022-31411-3 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. |
title | GenomicSuperSignature facilitates interpretation of RNA-seq experiments through robust, efficient comparison to public databases |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9237024/ https://www.ncbi.nlm.nih.gov/pubmed/35760813 http://dx.doi.org/10.1038/s41467-022-31411-3 |
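
The description field above summarizes the method as Principal Component Analysis on each study, followed by aggregation of sufficiently similar loading vectors into Replicable Axes of Variation (RAVs). The following is a minimal Python sketch of that idea only. It is not the GenomicSuperSignature R/Bioconductor API, and the function names (`top_loadings`, `build_ravs`), the parameters `k` and `min_abs_corr`, the greedy grouping rule, and the choice to drop single-member groups are all assumptions made for illustration, not details taken from the record.

```python
import numpy as np


def top_loadings(expr, k):
    """Return the top-k PCA loading vectors (genes x k) of a samples x genes matrix."""
    centered = expr - expr.mean(axis=0, keepdims=True)
    # Rows of vt are unit-length principal axes in gene space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def build_ravs(studies, k=2, min_abs_corr=0.9):
    """Group similar loading vectors across studies and average each group.

    Hypothetical helper: a simple greedy grouping by absolute Pearson
    correlation stands in for whatever clustering the published method uses.
    """
    loadings = np.hstack([top_loadings(s, k) for s in studies])
    clusters = []  # each cluster is a list of sign-aligned loading vectors
    for vec in loadings.T:
        for members in clusters:
            r = np.corrcoef(vec, members[0])[0, 1]
            if abs(r) >= min_abs_corr:
                # PCA axes have arbitrary sign, so align to the first member.
                members.append(vec if r > 0 else -vec)
                break
        else:
            clusters.append([vec])
    # Assumption: only axes seen in more than one place count as replicable.
    return [np.mean(members, axis=0) for members in clusters if len(members) > 1]
```

Comparing by absolute correlation and flipping signs before averaging keeps two studies that recover the same axis with opposite orientation from cancelling each other out.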