Non-random sampling leads to biased estimates of transcriptome association
Integration of independent data resources across -omics platforms offers transformative opportunity for novel clinical and biological discoveries. However, application of emerging analytic methods in the context of selection bias represents a noteworthy and pervasive challenge. We hypothesize that c...
Main Authors: | Foulkes, A. S., Balasubramanian, R., Qian, J., Reilly, M. P. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148323/ https://www.ncbi.nlm.nih.gov/pubmed/32277087 http://dx.doi.org/10.1038/s41598-020-62575-x |
_version_ | 1783520570682376192 |
---|---|
author | Foulkes, A. S. Balasubramanian, R. Qian, J. Reilly, M. P. |
author_facet | Foulkes, A. S. Balasubramanian, R. Qian, J. Reilly, M. P. |
author_sort | Foulkes, A. S. |
collection | PubMed |
description | Integration of independent data resources across -omics platforms offers transformative opportunity for novel clinical and biological discoveries. However, application of emerging analytic methods in the context of selection bias represents a noteworthy and pervasive challenge. We hypothesize that combining differentially selected samples for integrated transcriptome analysis will lead to bias in the estimated association between predicted expression and the trait. Our results are based on in silico investigations and a case example focused on body mass index across four well-described cohorts apparently derived from markedly different populations. Our findings suggest that integrative analysis can lead to substantial relative bias in the estimate of association between predicted expression and the trait. The average estimate of association ranged from 51.3% less than to 96.7% greater than the true value for the biased sampling scenarios considered, while the average error was −2.7% for the unbiased scenario. The corresponding 95% confidence interval coverage rate ranged from 46.4% to 69.5% under biased sampling, and was equal to 75% for the unbiased scenario. Inverse probability weighting with observed and estimated weights is applied as one corrective measure and appears to reduce the bias and improve coverage. These results highlight a critical need to address selection bias in integrative analysis and to use caution in interpreting findings in the presence of different sampling mechanisms between groups. |
format | Online Article Text |
id | pubmed-7148323 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-7148323 2020-04-15 Non-random sampling leads to biased estimates of transcriptome association Foulkes, A. S. Balasubramanian, R. Qian, J. Reilly, M. P. Sci Rep Article Nature Publishing Group UK 2020-04-10 /pmc/articles/PMC7148323/ /pubmed/32277087 http://dx.doi.org/10.1038/s41598-020-62575-x Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Foulkes, A. S. Balasubramanian, R. Qian, J. Reilly, M. P. Non-random sampling leads to biased estimates of transcriptome association |
title | Non-random sampling leads to biased estimates of transcriptome association |
title_full | Non-random sampling leads to biased estimates of transcriptome association |
title_fullStr | Non-random sampling leads to biased estimates of transcriptome association |
title_full_unstemmed | Non-random sampling leads to biased estimates of transcriptome association |
title_short | Non-random sampling leads to biased estimates of transcriptome association |
title_sort | non-random sampling leads to biased estimates of transcriptome association |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148323/ https://www.ncbi.nlm.nih.gov/pubmed/32277087 http://dx.doi.org/10.1038/s41598-020-62575-x |
work_keys_str_mv | AT foulkesas nonrandomsamplingleadstobiasedestimatesoftranscriptomeassociation AT balasubramanianr nonrandomsamplingleadstobiasedestimatesoftranscriptomeassociation AT qianj nonrandomsamplingleadstobiasedestimatesoftranscriptomeassociation AT reillymp nonrandomsamplingleadstobiasedestimatesoftranscriptomeassociation |
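
The abstract in this record reports that trait-dependent sampling biases the estimated association between predicted expression and the trait, and that inverse probability weighting appears to reduce that bias. The sketch below illustrates the general idea only; it is not the authors' simulation design. The linear model, the true slope of 0.5, the selection probabilities `p_high` and `p_low`, and the function names `weighted_slope` and `simulate` are all assumptions made for illustration, and the weights are treated as known rather than estimated.

```python
import random


def weighted_slope(x, y, w):
    """Slope from a weighted least-squares fit of y on x with an intercept."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def simulate(n=50000, beta=0.5, p_high=0.9, p_low=0.1, seed=1):
    """Return (naive, ipw) slope estimates from a trait-dependent sample."""
    rng = random.Random(seed)
    xs, ys, ws = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)             # stand-in for predicted expression
        y = beta * x + rng.gauss(0.0, 1.0)  # stand-in for the trait
        p = p_high if y > 0 else p_low      # hypothetical selection probability
        if rng.random() < p:                # individual enters the sample
            xs.append(x)
            ys.append(y)
            ws.append(1.0 / p)              # inverse probability weight
    naive = weighted_slope(xs, ys, [1.0] * len(xs))  # ignores selection
    ipw = weighted_slope(xs, ys, ws)                 # reweights to the population
    return naive, ipw


if __name__ == "__main__":
    naive, ipw = simulate()
    print(f"true slope 0.5, naive {naive:.3f}, IPW {ipw:.3f}")
```

Under these assumptions the unweighted slope is pulled toward zero because individuals with a high trait value are oversampled, while the weighted fit should land close to the true slope. In practice the selection probabilities are rarely known and would have to be estimated, which the abstract notes as a separate case.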