RNaseIII and T4 Polynucleotide Kinase sequence biases and solutions during RNA-seq library construction

BACKGROUND: RNA-seq is a next generation sequencing method with a wide range of applications including single nucleotide polymorphism (SNP) detection, splice junction identification, and gene expression level measurement. However, the RNA-seq sequence data can be biased during library constructions...

Bibliographic Details
Main Authors: Lee, Changhoon; Harris, R Adron; Wall, Jason K; Mayfield, R Dayne; Wilke, Claus O
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3710281/
https://www.ncbi.nlm.nih.gov/pubmed/23826734
http://dx.doi.org/10.1186/1745-6150-8-16
_version_ 1782276858880983040
author Lee, Changhoon
Harris, R Adron
Wall, Jason K
Mayfield, R Dayne
Wilke, Claus O
author_sort Lee, Changhoon
collection PubMed
description BACKGROUND: RNA-seq is a next-generation sequencing method with a wide range of applications, including single nucleotide polymorphism (SNP) detection, splice junction identification, and gene expression level measurement. However, RNA-seq sequence data can be biased during library construction, resulting in incorrect data for SNP, splice junction, and gene expression studies. Here, we developed new library preparation methods to limit such biases. RESULTS: A whole transcriptome library prepared for the SOLiD system displayed numerous read duplications (pile-ups) and gaps in known exons. These pile-ups and gaps caused a loss of SNP and splice junction information and reduced the quality of gene expression results. Further, we found clear sequence biases for both 5' and 3' end reads in the whole transcriptome library. To remove this bias, RNaseIII fragmentation was replaced with heat fragmentation. For adaptor ligation, T4 Polynucleotide Kinase (T4PNK) was used following heat fragmentation. However, its kinase and phosphatase activities introduced additional sequence biases. To minimize them, we used OptiKinase before T4PNK. Our study further revealed the specific target sequences of RNaseIII and T4PNK. CONCLUSIONS: Our results suggest that heat fragmentation removed the RNaseIII sequence bias and significantly reduced the pile-ups and gaps. OptiKinase minimized the T4PNK sequence biases and removed most of the remaining pile-ups and gaps, thus maximizing the quality of RNA-seq data. REVIEWERS: This article was reviewed by Dr. A. Kolodziejczyk (nominated by Dr. Sarah Teichmann), Dr. Eugene Koonin, and Dr. Christoph Adami. For the full reviews, see the Reviewers' Comments section.
format Online
Article
Text
id pubmed-3710281
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3710281 2013-07-16 Biol Direct Research
BioMed Central 2013-07-04 /pmc/articles/PMC3710281/ /pubmed/23826734 http://dx.doi.org/10.1186/1745-6150-8-16 Text en Copyright © 2013 Lee et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title RNaseIII and T4 Polynucleotide Kinase sequence biases and solutions during RNA-seq library construction
topic Research