
Multi-genome alignment for quality control and contamination screening of next-generation sequencing data

The availability of massive amounts of DNA sequence data, from thousands of genomes even in a single project, has had a huge impact on our understanding of biology, but it also creates several problems for the biologists carrying out those experiments. Bioinformatic analysis of sequence data is perhaps the most...


Bibliographic Details
Main authors: Hadfield, James, Eldridge, Matthew D.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3930033/
https://www.ncbi.nlm.nih.gov/pubmed/24600470
http://dx.doi.org/10.3389/fgene.2014.00031
author Hadfield, James
Eldridge, Matthew D.
collection PubMed
description The availability of massive amounts of DNA sequence data, from thousands of genomes even in a single project, has had a huge impact on our understanding of biology, but it also creates several problems for the biologists carrying out those experiments. Bioinformatic analysis of sequence data is perhaps the most obvious challenge, but upstream of this, even basic quality control of sequence run performance is challenging for many users given the volume of data. Users need to be able to assess run quality efficiently so that only high-quality data are passed through to computationally-, financially-, and time-intensive processes. There is a clear need to make human review of sequence data as efficient as possible. The multi-genome alignment tool presented here displays next-generation sequencing run data in visual and tabular formats, simplifying assessment of run yield and quality; it also reports some sample-based quality metrics and screens for contamination from adapter sequences and from species other than the one being sequenced.
format Online
Article
Text
id pubmed-3930033
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-3930033 | 2014-03-05 | Multi-genome alignment for quality control and contamination screening of next-generation sequencing data | Hadfield, James; Eldridge, Matthew D. | Front Genet | Genetics | Frontiers Media S.A. | 2014-02-20 | /pmc/articles/PMC3930033/ | /pubmed/24600470 | http://dx.doi.org/10.3389/fgene.2014.00031 | Text | en | Copyright © 2014 Hadfield and Eldridge. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Multi-genome alignment for quality control and contamination screening of next-generation sequencing data
topic Genetics