Non-random sampling leads to biased estimates of transcriptome association

Integration of independent data resources across -omics platforms offers transformative opportunity for novel clinical and biological discoveries. However, application of emerging analytic methods in the context of selection bias represents a noteworthy and pervasive challenge. We hypothesize that c...

Bibliographic Details
Main authors:	Foulkes, A. S., Balasubramanian, R., Qian, J., Reilly, M. P.
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2020
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148323/ https://www.ncbi.nlm.nih.gov/pubmed/32277087 http://dx.doi.org/10.1038/s41598-020-62575-x

author	Foulkes, A. S. Balasubramanian, R. Qian, J. Reilly, M. P.
collection	PubMed
description	Integration of independent data resources across -omics platforms offers transformative opportunity for novel clinical and biological discoveries. However, application of emerging analytic methods in the context of selection bias represents a noteworthy and pervasive challenge. We hypothesize that combining differentially selected samples for integrated transcriptome analysis will lead to bias in the estimated association between predicted expression and the trait. Our results are based on in silico investigations and a case example focused on body mass index across four well-described cohorts apparently derived from markedly different populations. Our findings suggest that integrative analysis can lead to substantial relative bias in the estimate of association between predicted expression and the trait. The average estimate of association ranged from 51.3% less than to 96.7% greater than the true value for the biased sampling scenarios considered, while the average error was −2.7% for the unbiased scenario. The corresponding 95% confidence interval coverage rate ranged from 46.4% to 69.5% under biased sampling, and was equal to 75% for the unbiased scenario. Inverse probability weighting with observed and estimated weights is applied as one corrective measure and appears to reduce the bias and improve coverage. These results highlight a critical need to address selection bias in integrative analysis and to use caution in interpreting findings in the presence of different sampling mechanisms between groups.
format	Online Article Text
id	pubmed-7148323
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-7148323 2020-04-15 Non-random sampling leads to biased estimates of transcriptome association Foulkes, A. S. Balasubramanian, R. Qian, J. Reilly, M. P. Sci Rep Article
Nature Publishing Group UK 2020-04-10 /pmc/articles/PMC7148323/ /pubmed/32277087 http://dx.doi.org/10.1038/s41598-020-62575-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title	Non-random sampling leads to biased estimates of transcriptome association
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148323/ https://www.ncbi.nlm.nih.gov/pubmed/32277087 http://dx.doi.org/10.1038/s41598-020-62575-x