Multiple imputation and direct estimation for qPCR data with non-detects

BACKGROUND: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. An important aspect of qPCR data that has been largely ignored is the presence of non-detects: reactions failing to exceed the quantification threshold and therefore lacking a measurement...

Bibliographic Details
Main authors: Sherina, Valeriia, McMurray, Helene R., Powers, Winslow, Land, Harmut, Love, Tanzy M. T., McCall, Matthew N.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693525/
https://www.ncbi.nlm.nih.gov/pubmed/33243147
http://dx.doi.org/10.1186/s12859-020-03807-9
author Sherina, Valeriia
McMurray, Helene R.
Powers, Winslow
Land, Harmut
Love, Tanzy M. T.
McCall, Matthew N.
collection PubMed
description BACKGROUND: Quantitative real-time PCR (qPCR) is one of the most widely used methods to measure gene expression. An important aspect of qPCR data that has been largely ignored is the presence of non-detects: reactions failing to exceed the quantification threshold and therefore lacking a measurement of expression. While most current software replaces these non-detects with a value representing the limit of detection, this introduces substantial bias in the estimation of both absolute and differential expression. Single imputation procedures, while an improvement on previously used methods, underestimate residual variance, which can lead to anti-conservative inference. RESULTS: We propose to treat non-detects as non-random missing data, model the missing data mechanism, and use this model to impute missing values or obtain direct estimates of model parameters. To account for the uncertainty inherent in the imputation, we propose a multiple imputation procedure, which provides a set of plausible values for each non-detect. We assess the proposed methods via simulation studies and demonstrate the applicability of these methods to three experimental data sets. We compare our methods to mean imputation, single imputation, and a penalized EM algorithm incorporating non-random missingness (PEMM). The developed methods are implemented in the R/Bioconductor package nondetects. CONCLUSIONS: The statistical methods introduced here reduce discrepancies in gene expression values derived from qPCR experiments in the presence of non-detects, providing increased confidence in downstream analyses.
format Online
Article
Text
id pubmed-7693525
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7693525 2020-11-30 BMC Bioinformatics Methodology Article
BioMed Central 2020-11-26 /pmc/articles/PMC7693525/ /pubmed/33243147 http://dx.doi.org/10.1186/s12859-020-03807-9 Text en © The Author(s) 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Multiple imputation and direct estimation for qPCR data with non-detects
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693525/
https://www.ncbi.nlm.nih.gov/pubmed/33243147
http://dx.doi.org/10.1186/s12859-020-03807-9