Multiple imputation and direct estimation for qPCR data with non-detects
BACKGROUND: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. An important aspect of qPCR data that has been largely ignored is the presence of non-detects: reactions failing to exceed the quantification threshold and therefore lacking a measurement...
Main authors: | Sherina, Valeriia; McMurray, Helene R.; Powers, Winslow; Land, Harmut; Love, Tanzy M. T.; McCall, Matthew N. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693525/ https://www.ncbi.nlm.nih.gov/pubmed/33243147 http://dx.doi.org/10.1186/s12859-020-03807-9 |
_version_ | 1783614764613632000 |
---|---|
author | Sherina, Valeriia McMurray, Helene R. Powers, Winslow Land, Harmut Love, Tanzy M. T. McCall, Matthew N. |
author_sort | Sherina, Valeriia |
collection | PubMed |
description | BACKGROUND: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. An important aspect of qPCR data that has been largely ignored is the presence of non-detects: reactions failing to exceed the quantification threshold and therefore lacking a measurement of expression. While most current software replaces these non-detects with a value representing the limit of detection, this introduces substantial bias in the estimation of both absolute and differential expression. Single imputation procedures, while an improvement on previously used methods, underestimate residual variance, which can lead to anti-conservative inference. RESULTS: We propose to treat non-detects as non-random missing data, model the missing data mechanism, and use this model to impute missing values or obtain direct estimates of model parameters. To account for the uncertainty inherent in the imputation, we propose a multiple imputation procedure, which provides a set of plausible values for each non-detect. We assess the proposed methods via simulation studies and demonstrate the applicability of these methods to three experimental data sets. We compare our methods to mean imputation, single imputation, and a penalized EM algorithm incorporating non-random missingness (PEMM). The developed methods are implemented in the R/Bioconductor package nondetects. CONCLUSIONS: The statistical methods introduced here reduce discrepancies in gene expression values derived from qPCR experiments in the presence of non-detects, providing increased confidence in downstream analyses. |
format | Online Article Text |
id | pubmed-7693525 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-76935252020-11-30 Multiple imputation and direct estimation for qPCR data with non-detects Sherina, Valeriia McMurray, Helene R. Powers, Winslow Land, Harmut Love, Tanzy M. T. McCall, Matthew N. BMC Bioinformatics Methodology Article BACKGROUND: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. An important aspect of qPCR data that has been largely ignored is the presence of non-detects: reactions failing to exceed the quantification threshold and therefore lacking a measurement of expression. While most current software replaces these non-detects with a value representing the limit of detection, this introduces substantial bias in the estimation of both absolute and differential expression. Single imputation procedures, while an improvement on previously used methods, underestimate residual variance, which can lead to anti-conservative inference. RESULTS: We propose to treat non-detects as non-random missing data, model the missing data mechanism, and use this model to impute missing values or obtain direct estimates of model parameters. To account for the uncertainty inherent in the imputation, we propose a multiple imputation procedure, which provides a set of plausible values for each non-detect. We assess the proposed methods via simulation studies and demonstrate the applicability of these methods to three experimental data sets. We compare our methods to mean imputation, single imputation, and a penalized EM algorithm incorporating non-random missingness (PEMM). The developed methods are implemented in the R/Bioconductor package nondetects. CONCLUSIONS: The statistical methods introduced here reduce discrepancies in gene expression values derived from qPCR experiments in the presence of non-detects, providing increased confidence in downstream analyses. 
BioMed Central 2020-11-26 /pmc/articles/PMC7693525/ /pubmed/33243147 http://dx.doi.org/10.1186/s12859-020-03807-9 Text en © The Author(s) 2020. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Methodology Article Sherina, Valeriia McMurray, Helene R. Powers, Winslow Land, Harmut Love, Tanzy M. T. McCall, Matthew N. Multiple imputation and direct estimation for qPCR data with non-detects |
title | Multiple imputation and direct estimation for qPCR data with non-detects |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693525/ https://www.ncbi.nlm.nih.gov/pubmed/33243147 http://dx.doi.org/10.1186/s12859-020-03807-9 |
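
The description above notes that single imputation underestimates residual variance and that multiple imputation supplies a set of plausible values for each non-detect. The Python sketch below illustrates only that general idea and is not the method implemented in the nondetects package: it does not model the missing-data mechanism, and it plugs in the mean and standard deviation of the observed values rather than drawing them. The names `pool_rubin`, `draw_above`, and `impute_and_pool`, the cutoff of 40 cycles, and the example Ct values are all assumptions made for illustration. Per-imputation estimates are combined with Rubin's rules, so the between-imputation spread is added to the pooled variance.

```python
import random
import statistics


def pool_rubin(estimates, variances):
    """Combine M per-imputation estimates and variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations, one variance per estimate")
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # mean within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


def draw_above(rng, mean, sd, cutoff):
    """Hypothetical helper: rejection-sample a normal value above the cutoff."""
    while True:
        x = rng.gauss(mean, sd)
        if x > cutoff:
            return x


def impute_and_pool(observed, n_missing, cutoff, m=20, seed=0):
    """Impute each non-detect m times and pool the estimated mean Ct value."""
    rng = random.Random(seed)
    # Simplification: plug-in parameters taken from the observed values only.
    mu = statistics.fmean(observed)
    sd = statistics.stdev(observed)
    estimates, variances = [], []
    for _ in range(m):
        imputed = [draw_above(rng, mu, sd, cutoff) for _ in range(n_missing)]
        completed = observed + imputed
        estimates.append(statistics.fmean(completed))
        # Squared standard error of the mean for this completed data set.
        variances.append(statistics.variance(completed) / len(completed))
    return pool_rubin(estimates, variances)


if __name__ == "__main__":
    # Made-up Ct values for one gene; three further reactions were non-detects.
    ct_observed = [33.1, 34.5, 35.2, 36.0, 36.8, 37.4, 38.1, 38.9]
    pooled_mean, pooled_var = impute_and_pool(ct_observed, n_missing=3, cutoff=40.0)
    print(f"pooled mean Ct: {pooled_mean:.2f}, pooled variance: {pooled_var:.3f}")
```

Because every imputed value lies above the cutoff, the pooled mean is higher than the mean of the detected reactions alone, and the pooled variance includes a term for the disagreement between imputations, which a single imputed data set cannot provide.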