Optimal Deconvolution of Transcriptional Profiling Data Using Quadratic Programming with Application to Complex Clinical Blood Samples

Bibliographic Details
Main Authors:	Gong, Ting; Hartmann, Nicole; Kohane, Isaac S.; Brinkmann, Volker; Staedtler, Frank; Letzkus, Martin; Bongiovanni, Sandrine; Szustakowski, Joseph D.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2011
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3217948/ https://www.ncbi.nlm.nih.gov/pubmed/22110609 http://dx.doi.org/10.1371/journal.pone.0027156

_version_	1782216638578294784
author	Gong, Ting; Hartmann, Nicole; Kohane, Isaac S.; Brinkmann, Volker; Staedtler, Frank; Letzkus, Martin; Bongiovanni, Sandrine; Szustakowski, Joseph D.
author_sort	Gong, Ting
collection	PubMed
description	Large-scale molecular profiling technologies have assisted the identification of disease biomarkers and facilitated the basic understanding of cellular processes. However, samples collected from human subjects in clinical trials possess a level of complexity, arising from multiple cell types, that can obfuscate the analysis of data derived from them. Failure to identify, quantify, and incorporate sources of heterogeneity into an analysis can have widespread and detrimental effects on subsequent statistical studies. We describe an approach that builds upon a linear latent variable model, in which expression levels from mixed cell populations are modeled as the weighted average of expression from different cell types. We solve these equations using quadratic programming, which efficiently identifies the globally optimal solution while preserving non-negativity of the fraction of the cells. We applied our method to various existing platforms to estimate proportions of different pure cell or tissue types and gene expression profiles of distinct phenotypes, with a focus on complex samples collected in clinical trials. We tested our methods on several well-controlled benchmark data sets with known mixing fractions of pure cell or tissue types and on mRNA expression profiling data from samples collected in a clinical trial. Accurate agreement between predicted and actual mixing fractions was observed. In addition, our method was able to predict mixing fractions for more than ten species of circulating cells and to provide accurate estimates for relatively rare cell types (<10% total population). Furthermore, accurate changes in leukocyte trafficking associated with Fingolimod (FTY720) treatment were identified that were consistent with previous results generated by both cell counts and flow cytometry. These data suggest that our method can solve one of the open questions regarding the analysis of complex transcriptional data: namely, how to identify the optimal mixing fractions in a given experiment.
format	Online Article Text
id	pubmed-3217948
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-3217948 2011-11-21 Optimal Deconvolution of Transcriptional Profiling Data Using Quadratic Programming with Application to Complex Clinical Blood Samples Gong, Ting; Hartmann, Nicole; Kohane, Isaac S.; Brinkmann, Volker; Staedtler, Frank; Letzkus, Martin; Bongiovanni, Sandrine; Szustakowski, Joseph D. PLoS One Research Article Public Library of Science 2011-11-16 /pmc/articles/PMC3217948/ /pubmed/22110609 http://dx.doi.org/10.1371/journal.pone.0027156 Text en Gong et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title	Optimal Deconvolution of Transcriptional Profiling Data Using Quadratic Programming with Application to Complex Clinical Blood Samples
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3217948/ https://www.ncbi.nlm.nih.gov/pubmed/22110609 http://dx.doi.org/10.1371/journal.pone.0027156
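
The description field in this record models a mixed sample's expression as a weighted average of pure cell-type expression profiles and recovers the weights by quadratic programming with non-negative fractions. As a rough illustration only (not the authors' implementation), the sketch below sets up that least-squares problem and solves it with projected gradient descent in place of a dedicated QP solver. The function names, the toy signature matrix, and the requirement that the fractions sum to one are assumptions made for this example.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {f : f >= 0, sum(f) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def deconvolve(signatures, mixture, iterations=2000):
    """Estimate mixing fractions f minimizing ||S f - y||^2 on the simplex.

    signatures[g][k]: expression of gene g in pure cell type k (S).
    mixture[g]: expression of gene g in the mixed sample (y).
    Assumption: fractions are non-negative and sum to one.
    """
    n_genes = len(signatures)
    n_types = len(signatures[0])
    # Quadratic form: 0.5 * f^T Q f - c^T f, with Q = S^T S and c = S^T y.
    Q = [[sum(signatures[g][i] * signatures[g][j] for g in range(n_genes))
          for j in range(n_types)] for i in range(n_types)]
    c = [sum(signatures[g][i] * mixture[g] for g in range(n_genes))
         for i in range(n_types)]
    # Q is positive semidefinite, so its trace bounds its largest eigenvalue;
    # 1 / trace is therefore a safe (conservative) gradient step size.
    step = 1.0 / sum(Q[i][i] for i in range(n_types))
    f = [1.0 / n_types] * n_types  # start from equal fractions
    for _ in range(iterations):
        grad = [sum(Q[i][j] * f[j] for j in range(n_types)) - c[i]
                for i in range(n_types)]
        f = project_to_simplex([f[i] - step * grad[i] for i in range(n_types)])
    return f


# Toy data (hypothetical): 4 genes, 3 pure cell types.
signatures = [
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 2.0],
    [2.0, 1.0, 9.0],
    [5.0, 5.0, 5.0],
]
true_fractions = [0.6, 0.3, 0.1]
# Noise-free mixture built as the weighted average of the pure profiles.
mixture = [sum(s * w for s, w in zip(row, true_fractions)) for row in signatures]

estimated = deconvolve(signatures, mixture)
print([round(x, 3) for x in estimated])
```

Because the objective is convex and the feasible set is a simplex, any solver that converges reaches the global optimum; in practice a dedicated QP or constrained least-squares routine would be the more usual choice than this hand-rolled loop.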
