Whole blood RNA extraction efficiency contributes to variability in RNA sequencing data sets

Numerous methodologies are used for blood RNA extraction, and large quantitative differences in recovered RNA content are reported. We evaluated three archived data sets to determine how extraction methodologies might influence mRNA and lncRNA sequencing results. The total quantity of RNA recovered...

Bibliographic Details
Main authors: Wilfinger, William W., Eghbalnia, Hamid R., Mackey, Karol, Miller, Robert, Chomczynski, Piotr
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10653446/
https://www.ncbi.nlm.nih.gov/pubmed/37972054
http://dx.doi.org/10.1371/journal.pone.0291209
_version_ 1785147778176909312
author Wilfinger, William W.
Eghbalnia, Hamid R.
Mackey, Karol
Miller, Robert
Chomczynski, Piotr
author_facet Wilfinger, William W.
Eghbalnia, Hamid R.
Mackey, Karol
Miller, Robert
Chomczynski, Piotr
author_sort Wilfinger, William W.
collection PubMed
description Numerous methodologies are used for blood RNA extraction, and large quantitative differences in recovered RNA content are reported. We evaluated three archived data sets to determine how extraction methodologies might influence mRNA and lncRNA sequencing results. The total quantity of RNA recovered per ml of blood affects RNA sequencing by impacting the recovery of weakly expressed mRNA and lncRNA transcripts. Transcript expression (TPM counts) plotted in relation to transcript size (base pairs, bp) revealed a 30% loss of short to midsized transcripts in some data sets. Quantitative recovery of RNA is of considerable importance, and it should be viewed more judiciously. Transcripts common to the three data sets were subsequently normalized, and transcript mean TPM counts and TPM count coefficient of variation (CV) were plotted in relation to increasing transcript size. Regression analysis of mean TPM counts versus transcript size revealed negative slopes in two of the three data sets, suggesting a reduction of TPM transcript counts with increasing transcript size. In the third data set, the regression slope line of mRNA transcript TPM counts approximates zero and TPM counts increased in proportion to transcript size over a range of 200 to 30,000 bp. Similarly, transcript TPM count CV values were also uniformly distributed over the range of transcript sizes. In the other data sets, the regression CV slopes increased in relation to transcript size. The recovery of weakly expressed and/or short to midsized mRNA and lncRNA transcripts varies with different RNA extraction methodologies, thereby altering the fundamental sequencing relationship between transcript size and TPM counts. Our analysis identifies differences in RNA sequencing results that are dependent upon the quantity of total RNA recovered from whole blood. We propose that incomplete RNA extraction directly impacts the recovery of mRNA and lncRNA transcripts from human blood and speculate that these differences contribute to the “batch” effects commonly identified between sequencing results from different archived data sets.
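The size-dependence analysis summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names `tpm` and `size_trend` are hypothetical, the TPM formula is the standard definition, and fitting against log10 transcript size (with log10 mean TPM) is an assumption made here because the abstract does not state the exact regression model.

```python
import numpy as np

def tpm(counts, lengths_bp):
    """Standard TPM for one sample: length-normalize counts, then scale to sum to 1e6."""
    rate = np.asarray(counts, dtype=float) / np.asarray(lengths_bp, dtype=float)
    return rate / rate.sum() * 1e6

def size_trend(tpm_matrix, lengths_bp):
    """Regress mean TPM and TPM CV on transcript size.

    tpm_matrix: array of shape (transcripts, samples).
    Returns (slope of log10 mean TPM vs log10 size, slope of CV vs log10 size).
    The log10 transforms are an assumption of this sketch.
    """
    tpm_matrix = np.asarray(tpm_matrix, dtype=float)
    lengths_bp = np.asarray(lengths_bp, dtype=float)
    mean = tpm_matrix.mean(axis=1)
    keep = mean > 0                      # skip transcripts never detected
    mean = mean[keep]
    # Coefficient of variation across samples (sample standard deviation / mean)
    cv = tpm_matrix[keep].std(axis=1, ddof=1) / mean
    x = np.log10(lengths_bp[keep])
    slope_mean = np.polyfit(x, np.log10(mean), 1)[0]
    slope_cv = np.polyfit(x, cv, 1)[0]
    return slope_mean, slope_cv
```

Under these assumptions, a clearly negative `slope_mean` would correspond to the reduction of TPM counts with increasing transcript size reported for two of the data sets, and a positive `slope_cv` to CV values that rise with transcript size.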
format Online
Article
Text
id pubmed-10653446
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10653446 2023-11-16 Whole blood RNA extraction efficiency contributes to variability in RNA sequencing data sets Wilfinger, William W. Eghbalnia, Hamid R. Mackey, Karol Miller, Robert Chomczynski, Piotr PLoS One Research Article Public Library of Science 2023-11-16 /pmc/articles/PMC10653446/ /pubmed/37972054 http://dx.doi.org/10.1371/journal.pone.0291209 Text en © 2023 Wilfinger et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Whole blood RNA extraction efficiency contributes to variability in RNA sequencing data sets
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10653446/
https://www.ncbi.nlm.nih.gov/pubmed/37972054
http://dx.doi.org/10.1371/journal.pone.0291209