
Assessment of insert sizes and adapter content in fastq data from NexteraXT libraries

The Illumina NexteraXT transposon protocol is a cost-effective way to generate paired-end libraries. However, the resulting insert size is highly sensitive to the concentration of DNA used, and the variation of insert sizes is often large. One consequence is that some fragments may have an insert...


Bibliographic Details
Main author: Turner, Frances S.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3906532/
https://www.ncbi.nlm.nih.gov/pubmed/24523726
http://dx.doi.org/10.3389/fgene.2014.00005
author Turner, Frances S.
collection PubMed
description The Illumina NexteraXT transposon protocol is a cost-effective way to generate paired-end libraries. However, the resulting insert size is highly sensitive to the concentration of DNA used, and the variation of insert sizes is often large. One consequence is that some fragments may have an insert shorter than the length of a single read, particularly where the library is designed to produce overlapping paired-end reads in order to produce longer continuous sequences. Such small insert sizes mean fewer longer reads, and also result in the presence of adapter at the end of the read. Presented here is a protocol that uses publicly available tools to identify read pairs with small insert sizes, which are therefore likely to contain adapter, to check the sequence of the adapter, and to remove adapter sequence from the reads. This protocol does not require a reference genome or prior knowledge of the sequence to be trimmed. Whilst the presence of fragments with small insert sizes may be a particular problem for NexteraXT libraries, the principle can be applied to any Illumina dataset in which the presence of such small inserts is suspected.
format Online, Article, Text
id pubmed-3906532
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-3906532 2014-02-12 Assessment of insert sizes and adapter content in fastq data from NexteraXT libraries. Turner, Frances S. Front Genet. Genetics. Frontiers Media S.A. 2014-01-30. /pmc/articles/PMC3906532/ /pubmed/24523726 http://dx.doi.org/10.3389/fgene.2014.00005 Text, en. Copyright © 2014 Turner. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Assessment of insert sizes and adapter content in fastq data from NexteraXT libraries
topic Genetics
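The principle stated in the abstract (an insert shorter than the read length leaves adapter at the end of both reads, and this can be detected without a reference genome) can be illustrated with a minimal Python sketch. This is not the protocol from the article, which relies on publicly available tools that this record does not name. The function names, the `min_overlap` and `max_mismatch_rate` parameters, and the ranking rule (lowest mismatch rate, then longest overlap) are all assumptions made for illustration. The idea is to slide the reverse complement of read 2 along read 1: the offset at which they agree gives the insert size, and any bases beyond that length are adapter.

```python
# Illustrative sketch only: names, parameters and ranking rule are assumptions,
# not the method or tools used in the article.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def estimate_insert_size(read1, read2, min_overlap=10, max_mismatch_rate=0.1):
    """Estimate the insert size of a read pair from the pair alone.

    Every candidate insert size fixes where the reverse complement of
    read 2 sits relative to read 1. Candidates whose overlapping bases
    agree within max_mismatch_rate are kept, and the one with the lowest
    mismatch rate (ties broken by longest overlap) is returned.
    Returns None when no candidate passes.
    """
    rc2 = reverse_complement(read2)
    best_key = None
    best_insert = None
    for insert in range(min_overlap, len(read1) + len(rc2) - min_overlap + 1):
        offset = insert - len(rc2)      # read1 index where rc2 starts
        start = max(0, offset)          # overlap bounds in read1 coordinates
        end = min(len(read1), insert)
        overlap = end - start
        if overlap < min_overlap:
            continue
        mismatches = sum(
            1 for i in range(start, end) if read1[i] != rc2[i - offset]
        )
        rate = mismatches / overlap
        if rate > max_mismatch_rate:
            continue
        key = (rate, -overlap)
        if best_key is None or key < best_key:
            best_key = key
            best_insert = insert
    return best_insert


def trim_adapter(read1, read2, insert):
    """Cut both reads back to the insert; no-op if insert is None."""
    if insert is None:
        return read1, read2
    return read1[:insert], read2[:insert]


if __name__ == "__main__":
    # Made-up 30-base insert sequenced with 50-base reads. The poly-G
    # tails are placeholders standing in for adapter read-through; they
    # are not real NexteraXT adapter sequence.
    insert_seq = "ATGGCATTCAGGCTAACGTTAGCCTAGATC"
    r1 = insert_seq + "G" * 20
    r2 = reverse_complement(insert_seq) + "G" * 20
    size = estimate_insert_size(r1, r2)
    print(size, trim_adapter(r1, r2, size))
```

Ranking by mismatch rate first is a simplification: with real reads, a short chance match could outrank a long overlap that carries a sequencing error, so a dedicated read-merging or trimming tool is the better choice for actual data. The bases removed by `trim_adapter` can also be collected across many pairs to check which adapter sequence is present.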