Detection Theory in Identification of RNA-DNA Sequence Differences Using RNA-Sequencing

Bibliographic Details
Main Authors: Toung, Jonathan M., Lahens, Nicholas, Hogenesch, John B., Grant, Gregory
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4232354/
https://www.ncbi.nlm.nih.gov/pubmed/25396741
http://dx.doi.org/10.1371/journal.pone.0112040
collection PubMed
description Advances in sequencing technology have allowed for detailed analyses of the transcriptome at single-nucleotide resolution, facilitating the study of RNA editing or sequence differences between RNA and DNA genome-wide. In humans, two types of post-transcriptional RNA editing processes are known to occur: A-to-I deamination by ADAR and C-to-U deamination by APOBEC1. In addition to these sequence differences, researchers have reported the existence of all 12 types of RNA-DNA sequence differences (RDDs); however, the validity of these claims is debated, as many studies claim that technical artifacts account for the majority of these non-canonical sequence differences. In this study, we used a detection theory approach to evaluate the performance of RNA-Sequencing (RNA-Seq) and associated aligners in accurately identifying RNA-DNA sequence differences. By generating simulated RNA-Seq datasets containing RDDs, we assessed the effect of alignment artifacts and sequencing error on the sensitivity and false discovery rate of RDD detection. Overall, we found that even in the presence of sequencing errors, false negative and false discovery rates of RDD detection can be contained below 10% with relatively lenient thresholds. We also assessed the ability of various filters to target false positive RDDs and found them to be effective in discriminating between true and false positives. Lastly, we used the optimal thresholds we identified from our simulated analyses to identify RDDs in a human lymphoblastoid cell line. We found approximately 6,000 RDDs, the majority of which are A-to-G edits and likely to be mediated by ADAR. Moreover, we found the majority of non A-to-G RDDs to be associated with poorer alignments and conclude from these results that the evidence for widespread non-canonical RDDs in humans is weak. Overall, we found RNA-Seq to be a powerful technique for surveying RDDs genome-wide when coupled with the appropriate thresholds and filters.
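The detection-theory quantities named in the description (sensitivity, false negative rate, false discovery rate) can be illustrated with a minimal Python sketch. It assumes that simulated (true) RDD sites and called RDD sites are each represented as a set of `(chromosome, position, type)` tuples; the function name `detection_rates`, the site encoding, and the toy data are hypothetical and are not taken from the article.

```python
from itertools import permutations

# The 12 possible single-base RDD types: ordered pairs of distinct bases,
# written here as "DNA>RNA" (this notation is an assumption of the sketch).
RDD_TYPES = [f"{dna}>{rna}" for dna, rna in permutations("ACGT", 2)]


def detection_rates(true_sites, called_sites):
    """Compare called RDD sites against the simulated ground truth.

    Both arguments are iterables of hashable site identifiers,
    e.g. (chromosome, position, rdd_type) tuples.
    """
    true_sites = set(true_sites)
    called_sites = set(called_sites)

    tp = len(true_sites & called_sites)   # true RDDs that were called
    fn = len(true_sites - called_sites)   # true RDDs that were missed
    fp = len(called_sites - true_sites)   # calls that are not true RDDs

    nan = float("nan")
    return {
        "tp": tp,
        "fn": fn,
        "fp": fp,
        # Sensitivity: fraction of true RDDs recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # False negative rate: fraction of true RDDs missed.
        "fnr": fn / (tp + fn) if (tp + fn) else nan,
        # False discovery rate: fraction of calls that are false.
        "fdr": fp / (tp + fp) if (tp + fp) else nan,
    }


# Toy data (hypothetical): 10 simulated A>G sites, 9 recovered, 1 spurious call.
simulated = {("chr1", 100 + i, "A>G") for i in range(10)}
called = {("chr1", 100 + i, "A>G") for i in range(9)} | {("chr2", 500, "T>C")}

rates = detection_rates(simulated, called)
print(rates)
```

In this toy case one true site is missed and one call is spurious, so the false negative rate and the false discovery rate are each one in ten, which sits at the 10% bound the description mentions.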
id pubmed-4232354
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-4232354 2014-11-26
Detection Theory in Identification of RNA-DNA Sequence Differences Using RNA-Sequencing
Toung, Jonathan M.; Lahens, Nicholas; Hogenesch, John B.; Grant, Gregory
PLoS One, Research Article
Public Library of Science, 2014-11-14
/pmc/articles/PMC4232354/ /pubmed/25396741 http://dx.doi.org/10.1371/journal.pone.0112040
Text, en
© 2014 Toung et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Detection Theory in Identification of RNA-DNA Sequence Differences Using RNA-Sequencing
topic Research Article