Whole blood RNA extraction efficiency contributes to variability in RNA sequencing data sets
Numerous methodologies are used for blood RNA extraction, and large quantitative differences in recovered RNA content are reported. We evaluated three archived data sets to determine how extraction methodologies might influence mRNA and lncRNA sequencing results. The total quantity of RNA recovered...
Main Authors: | Wilfinger, William W.; Eghbalnia, Hamid R.; Mackey, Karol; Miller, Robert; Chomczynski, Piotr |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10653446/ https://www.ncbi.nlm.nih.gov/pubmed/37972054 http://dx.doi.org/10.1371/journal.pone.0291209 |
_version_ | 1785147778176909312 |
---|---|
author | Wilfinger, William W.; Eghbalnia, Hamid R.; Mackey, Karol; Miller, Robert; Chomczynski, Piotr |
author_sort | Wilfinger, William W. |
collection | PubMed |
description | Numerous methodologies are used for blood RNA extraction, and large quantitative differences in recovered RNA content are reported. We evaluated three archived data sets to determine how extraction methodologies might influence mRNA and lncRNA sequencing results. The total quantity of RNA recovered per ml of blood affects RNA sequencing by impacting the recovery of weakly expressed mRNA and lncRNA transcripts. Transcript expression (TPM counts) plotted in relation to transcript size (base pairs, bp) revealed a 30% loss of short to midsized transcripts in some data sets. Quantitative recovery of RNA is of considerable importance, and it should be viewed more judiciously. Transcripts common to the three data sets were subsequently normalized, and transcript mean TPM counts and TPM count coefficient of variation (CV) were plotted in relation to increasing transcript size. Regression analysis of mean TPM counts versus transcript size revealed negative slopes in two of the three data sets, suggesting a reduction of TPM transcript counts with increasing transcript size. In the third data set, the regression slope line of mRNA transcript TPM counts approximates zero and TPM counts increased in proportion to transcript size over a range of 200 to 30,000 bp. Similarly, transcript TPM count CV values also were uniformly distributed over the range of transcript sizes. In the other data sets, the regression CV slopes increased in relation to transcript size. The recovery of weakly expressed and/or short to midsized mRNA and lncRNA transcripts varies with different RNA extraction methodologies, thereby altering the fundamental sequencing relationship between transcript size and TPM counts. Our analysis identifies differences in RNA sequencing results that are dependent upon the quantity of total RNA recovered from whole blood. We propose that incomplete RNA extraction directly impacts the recovery of mRNA and lncRNA transcripts from human blood and speculate that these differences contribute to the “batch” effects commonly identified between sequencing results from different archived data sets. |
format | Online Article Text |
id | pubmed-10653446 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10653446 2023-11-16 Whole blood RNA extraction efficiency contributes to variability in RNA sequencing data sets. Wilfinger, William W.; Eghbalnia, Hamid R.; Mackey, Karol; Miller, Robert; Chomczynski, Piotr. PLoS One. Research Article. Public Library of Science 2023-11-16 /pmc/articles/PMC10653446/ /pubmed/37972054 http://dx.doi.org/10.1371/journal.pone.0291209 Text en © 2023 Wilfinger et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Whole blood RNA extraction efficiency contributes to variability in RNA sequencing data sets |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10653446/ https://www.ncbi.nlm.nih.gov/pubmed/37972054 http://dx.doi.org/10.1371/journal.pone.0291209 |
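
The abstract in this record describes three calculations: TPM counts per transcript, the mean and coefficient of variation (CV) of those counts across samples, and a regression of mean TPM against transcript size. The following is a minimal Python sketch of those steps, not the authors' pipeline. The function names, the toy numbers, and the choice to regress on log10-transformed values are all assumptions made for illustration; the record does not state how the regression was set up.

```python
import math


def tpm(counts, lengths_bp):
    """Convert read counts to TPM: length-normalize, then scale to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def mean_and_cv(values):
    """Return (mean, CV), where CV = sample standard deviation / mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance) / mean


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical toy data: 4 transcripts (sizes in bp) in 3 samples.
# These numbers are invented for illustration and are not from the paper.
lengths = [200, 1000, 5000, 30000]
samples = [
    [40, 210, 900, 5200],
    [35, 190, 1100, 6100],
    [50, 205, 950, 5800],
]

per_sample_tpm = [tpm(counts, lengths) for counts in samples]

# Transpose so each entry holds one transcript's TPM across all samples.
per_transcript = list(zip(*per_sample_tpm))
stats = [mean_and_cv(list(values)) for values in per_transcript]

# Assumption: regress on log10 scales, since sizes span 200 to 30,000 bp.
log_size = [math.log10(l) for l in lengths]
log_mean = [math.log10(mean) for mean, _ in stats]
cvs = [cv for _, cv in stats]

print("slope of log10(mean TPM) vs log10(size):", ols_slope(log_size, log_mean))
print("slope of CV vs log10(size):", ols_slope(log_size, cvs))
```

In this sketch, a mean-TPM slope near zero would correspond to the third data set described in the abstract, and a negative slope to the other two. Real analyses would run over thousands of transcripts rather than four.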