Shambhala: a platform-agnostic data harmonizer for gene expression data

BACKGROUND: Harmonization techniques make different gene expression profiles and their sets compatible and ready for comparisons. Here we present a new bioinformatic tool termed Shambhala for harmonization of multiple human gene expression datasets obtained using different experimental methods and p...

Bibliographic Details
Main Authors: Borisov, Nicolas, Shabalina, Irina, Tkachev, Victor, Sorokin, Maxim, Garazha, Andrew, Pulin, Andrey, Eremin, Ilya I., Buzdin, Anton
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6366102/
https://www.ncbi.nlm.nih.gov/pubmed/30727942
http://dx.doi.org/10.1186/s12859-019-2641-8
author Borisov, Nicolas
Shabalina, Irina
Tkachev, Victor
Sorokin, Maxim
Garazha, Andrew
Pulin, Andrey
Eremin, Ilya I.
Buzdin, Anton
collection PubMed
description BACKGROUND: Harmonization techniques make different gene expression profiles, and sets of such profiles, compatible and ready for comparison. Here we present a new bioinformatic tool, Shambhala, for harmonizing multiple human gene expression datasets obtained using different experimental methods and platforms of microarray hybridization and RNA sequencing. RESULTS: Unlike previously published methods, which enable good-quality data harmonization for only two datasets, Shambhala allows conversion of multiple datasets into a universal form suitable for further comparisons. Shambhala harmonization is based on the calibration of gene expression profiles using an auxiliary standardization dataset. Each profile is transformed to make it similar to the output of the Affymetrix Human Gene microarray hybridization platform. This platform was chosen because it has the largest number of human gene expression profiles deposited in public databases. We evaluated Shambhala's ability to retain biologically important features after harmonization. The same four biological samples taken in multiple replicates were profiled independently using three and four different experimental platforms, respectively, then Shambhala-harmonized and investigated by hierarchical clustering. CONCLUSION: Our results showed that, unlike other frequently used methods (quantile normalization and DESeq/DESeq2 normalization), Shambhala harmonization was the only method supporting sample-specific and platform-independent biologically meaningful clustering for the data obtained from multiple experimental platforms. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2641-8) contains supplementary material, which is available to authorized users.
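For orientation, quantile normalization, one of the baseline methods the abstract compares against, can be sketched as below. This is a generic NumPy illustration and not Shambhala's algorithm or the authors' code; the function name `quantile_normalize` and the genes-by-samples matrix layout are assumptions made for the example, and ties are broken by input order rather than averaged.

```python
import numpy as np


def quantile_normalize(expr):
    """Quantile-normalize a genes x samples matrix (hypothetical helper).

    Every column (sample) is forced onto the same reference distribution:
    the mean of the sorted columns. Ties are broken by input order.
    """
    expr = np.asarray(expr, dtype=float)
    # Row indices that would sort each column independently.
    order = np.argsort(expr, axis=0, kind="stable")
    sorted_vals = np.take_along_axis(expr, order, axis=0)
    # Reference distribution: mean across samples at each rank.
    rank_means = sorted_vals.mean(axis=1)
    out = np.empty_like(expr)
    # The i-th smallest value in each column is replaced by rank_means[i].
    np.put_along_axis(out, order, rank_means[:, None], axis=0)
    return out


if __name__ == "__main__":
    # Two toy "samples" measured on very different scales.
    toy = np.array([[5.0, 40.0], [1.0, 20.0], [3.0, 60.0]])
    print(quantile_normalize(toy))
```

After this transformation all samples share one value distribution, which is why it is commonly applied within a single platform; the abstract reports that it did not yield platform-independent clustering across multiple platforms.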
format Online
Article
Text
id pubmed-6366102
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6366102 2019-02-15 Shambhala: a platform-agnostic data harmonizer for gene expression data Borisov, Nicolas Shabalina, Irina Tkachev, Victor Sorokin, Maxim Garazha, Andrew Pulin, Andrey Eremin, Ilya I. Buzdin, Anton BMC Bioinformatics Research Article BACKGROUND: Harmonization techniques make different gene expression profiles, and sets of such profiles, compatible and ready for comparison. Here we present a new bioinformatic tool, Shambhala, for harmonizing multiple human gene expression datasets obtained using different experimental methods and platforms of microarray hybridization and RNA sequencing. RESULTS: Unlike previously published methods, which enable good-quality data harmonization for only two datasets, Shambhala allows conversion of multiple datasets into a universal form suitable for further comparisons. Shambhala harmonization is based on the calibration of gene expression profiles using an auxiliary standardization dataset. Each profile is transformed to make it similar to the output of the Affymetrix Human Gene microarray hybridization platform. This platform was chosen because it has the largest number of human gene expression profiles deposited in public databases. We evaluated Shambhala's ability to retain biologically important features after harmonization. The same four biological samples taken in multiple replicates were profiled independently using three and four different experimental platforms, respectively, then Shambhala-harmonized and investigated by hierarchical clustering. CONCLUSION: Our results showed that, unlike other frequently used methods (quantile normalization and DESeq/DESeq2 normalization), Shambhala harmonization was the only method supporting sample-specific and platform-independent biologically meaningful clustering for the data obtained from multiple experimental platforms.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2641-8) contains supplementary material, which is available to authorized users. BioMed Central 2019-02-06 /pmc/articles/PMC6366102/ /pubmed/30727942 http://dx.doi.org/10.1186/s12859-019-2641-8 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Shambhala: a platform-agnostic data harmonizer for gene expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6366102/
https://www.ncbi.nlm.nih.gov/pubmed/30727942
http://dx.doi.org/10.1186/s12859-019-2641-8