
Enumerateblood – an R package to estimate the cellular composition of whole blood from Affymetrix Gene ST gene expression profiles

BACKGROUND: Measuring genome-wide changes in transcript abundance in circulating peripheral whole blood is a useful way to study disease pathobiology and may help elucidate the molecular mechanisms of disease, or discovery of useful disease biomarkers. The sensitivity and interpretability of analyse...


Bibliographic Details
Main authors: Shannon, Casey P., Balshaw, Robert, Chen, Virginia, Hollander, Zsuzsanna, Toma, Mustafa, McManus, Bruce M., FitzGerald, J. Mark, Sin, Don D., Ng, Raymond T., Tebbutt, Scott J.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5219701/
https://www.ncbi.nlm.nih.gov/pubmed/28061752
http://dx.doi.org/10.1186/s12864-016-3460-1
_version_ 1782492505303941120
author Shannon, Casey P.
Balshaw, Robert
Chen, Virginia
Hollander, Zsuzsanna
Toma, Mustafa
McManus, Bruce M.
FitzGerald, J. Mark
Sin, Don D.
Ng, Raymond T.
Tebbutt, Scott J.
author_sort Shannon, Casey P.
collection PubMed
description BACKGROUND: Measuring genome-wide changes in transcript abundance in circulating peripheral whole blood is a useful way to study disease pathobiology and may help elucidate the molecular mechanisms of disease, or discovery of useful disease biomarkers. The sensitivity and interpretability of analyses carried out in this complex tissue, however, are significantly affected by its dynamic cellular heterogeneity. It is therefore desirable to quantify this heterogeneity, either to account for it or to better model interactions that may be present between the abundance of certain transcripts, specific cell types and the indication under study. Accurate enumeration of the many component cell types that make up peripheral whole blood can further complicate the sample collection process, however, and result in additional costs. Many approaches have been developed to infer the composition of a sample from high-dimensional transcriptomic and, more recently, epigenetic data. These approaches rely on the availability of isolated expression profiles for the cell types to be enumerated. These profiles are platform-specific, suitable datasets are rare, and generating them is expensive. No such dataset exists on the Affymetrix Gene ST platform. RESULTS: We present ‘Enumerateblood’, a freely-available and open source R package that exposes a multi-response Gaussian model capable of accurately predicting the composition of peripheral whole blood samples from Affymetrix Gene ST expression profiles, outperforming other current methods when applied to Gene ST data. CONCLUSIONS: ‘Enumerateblood’ significantly improves our ability to study disease pathobiology from whole blood gene expression assayed on the popular Affymetrix Gene ST platform by allowing a more complete study of the various components of this complex tissue without the need for additional data collection. 
Future use of the model may allow for novel insights to be generated from the ~400 Affymetrix Gene ST blood gene expression datasets currently available on the Gene Expression Omnibus (GEO) website. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3460-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5219701
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-52197012017-01-10 Enumerateblood – an R package to estimate the cellular composition of whole blood from Affymetrix Gene ST gene expression profiles Shannon, Casey P. Balshaw, Robert Chen, Virginia Hollander, Zsuzsanna Toma, Mustafa McManus, Bruce M. FitzGerald, J. Mark Sin, Don D. Ng, Raymond T. Tebbutt, Scott J. BMC Genomics Methodology Article BACKGROUND: Measuring genome-wide changes in transcript abundance in circulating peripheral whole blood is a useful way to study disease pathobiology and may help elucidate the molecular mechanisms of disease, or discovery of useful disease biomarkers. The sensitivity and interpretability of analyses carried out in this complex tissue, however, are significantly affected by its dynamic cellular heterogeneity. It is therefore desirable to quantify this heterogeneity, either to account for it or to better model interactions that may be present between the abundance of certain transcripts, specific cell types and the indication under study. Accurate enumeration of the many component cell types that make up peripheral whole blood can further complicate the sample collection process, however, and result in additional costs. Many approaches have been developed to infer the composition of a sample from high-dimensional transcriptomic and, more recently, epigenetic data. These approaches rely on the availability of isolated expression profiles for the cell types to be enumerated. These profiles are platform-specific, suitable datasets are rare, and generating them is expensive. No such dataset exists on the Affymetrix Gene ST platform. RESULTS: We present ‘Enumerateblood’, a freely-available and open source R package that exposes a multi-response Gaussian model capable of accurately predicting the composition of peripheral whole blood samples from Affymetrix Gene ST expression profiles, outperforming other current methods when applied to Gene ST data. 
CONCLUSIONS: ‘Enumerateblood’ significantly improves our ability to study disease pathobiology from whole blood gene expression assayed on the popular Affymetrix Gene ST platform by allowing a more complete study of the various components of this complex tissue without the need for additional data collection. Future use of the model may allow for novel insights to be generated from the ~400 Affymetrix Gene ST blood gene expression datasets currently available on the Gene Expression Omnibus (GEO) website. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3460-1) contains supplementary material, which is available to authorized users. BioMed Central 2017-01-06 /pmc/articles/PMC5219701/ /pubmed/28061752 http://dx.doi.org/10.1186/s12864-016-3460-1 Text en © The Author(s). 2017 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Enumerateblood – an R package to estimate the cellular composition of whole blood from Affymetrix Gene ST gene expression profiles
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5219701/
https://www.ncbi.nlm.nih.gov/pubmed/28061752
http://dx.doi.org/10.1186/s12864-016-3460-1