Determining the quality and complexity of next-generation sequencing data without a reference genome

Bibliographic Details
Main authors: Anvar, Seyed Yahya; Khachatryan, Lusine; Vermaat, Martijn; van Galen, Michiel; Pulyakhina, Irina; Ariyurek, Yavuz; Kraaijeveld, Ken; den Dunnen, Johan T; de Knijff, Peter; ’t Hoen, Peter AC; Laros, Jeroen FJ
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298064/
https://www.ncbi.nlm.nih.gov/pubmed/25514851
http://dx.doi.org/10.1186/s13059-014-0555-3
Description
Summary: We describe an open-source kPAL package that facilitates an alignment-free assessment of the quality and comparability of sequencing datasets by analyzing k-mer frequencies. We show that kPAL can detect technical artefacts such as high duplication rates, library chimeras, contamination and differences in library preparation protocols. kPAL also successfully captures the complexity and diversity of microbiomes and provides a powerful means to study changes in microbial communities. Together, these features make kPAL an attractive and broadly applicable tool to determine the quality and comparability of sequence libraries even in the absence of a reference sequence. kPAL is freely available at https://github.com/LUMC/kPAL. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-014-0555-3) contains supplementary material, which is available to authorized users.