Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing

BACKGROUND: The rapid evolution of 454 GS-FLX sequencing technology has not been accompanied by a reassessment of the quality and accuracy of the sequences obtained. Current strategies for decision-making and error-correction are based on an initial analysis by Huse et al. in 2007, for the older GS2...

Bibliographic Details
Main Authors: Gilles, André, Meglécz, Emese, Pech, Nicolas, Ferreira, Stéphanie, Malausa, Thibaut, Martin, Jean-François
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3116506/
https://www.ncbi.nlm.nih.gov/pubmed/21592414
http://dx.doi.org/10.1186/1471-2164-12-245
author Gilles, André
Meglécz, Emese
Pech, Nicolas
Ferreira, Stéphanie
Malausa, Thibaut
Martin, Jean-François
author_facet Gilles, André
Meglécz, Emese
Pech, Nicolas
Ferreira, Stéphanie
Malausa, Thibaut
Martin, Jean-François
author_sort Gilles, André
collection PubMed
description BACKGROUND: The rapid evolution of 454 GS-FLX sequencing technology has not been accompanied by a reassessment of the quality and accuracy of the sequences obtained. Current strategies for decision-making and error-correction are based on an initial analysis by Huse et al. in 2007, for the older GS20 system based on experimental sequences. We analyze here the quality of 454 sequencing data and identify factors playing a role in sequencing error, through the use of an extensive dataset for Roche control DNA fragments. RESULTS: We obtained a mean error rate for 454 sequences of 1.07%. More importantly, the error rate is not randomly distributed; it occasionally rose to more than 50% in certain positions, and its distribution was linked to several experimental variables. The main factors related to error are the presence of homopolymers, position in the sequence, size of the sequence and spatial localization in PT plates for insertion and deletion errors. These factors can be described by considering seven variables. No single variable can account for the error rate distribution, but most of the variation is explained by the combination of all seven variables. CONCLUSIONS: The pattern identified here calls for the use of internal controls and error-correcting base callers, to correct for errors, when available (e.g. when sequencing amplicons). For shotgun libraries, the use of both sequencing primers and deep coverage, combined with the use of random sequencing primer sites, should partly compensate for even high error rates, although it may prove more difficult than previously thought to distinguish between low-frequency alleles and errors.
format Online
Article
Text
id pubmed-3116506
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3116506 2011-06-17 Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing Gilles, André Meglécz, Emese Pech, Nicolas Ferreira, Stéphanie Malausa, Thibaut Martin, Jean-François BMC Genomics Research Article BACKGROUND: The rapid evolution of 454 GS-FLX sequencing technology has not been accompanied by a reassessment of the quality and accuracy of the sequences obtained. Current strategies for decision-making and error-correction are based on an initial analysis by Huse et al. in 2007, for the older GS20 system based on experimental sequences. We analyze here the quality of 454 sequencing data and identify factors playing a role in sequencing error, through the use of an extensive dataset for Roche control DNA fragments. RESULTS: We obtained a mean error rate for 454 sequences of 1.07%. More importantly, the error rate is not randomly distributed; it occasionally rose to more than 50% in certain positions, and its distribution was linked to several experimental variables. The main factors related to error are the presence of homopolymers, position in the sequence, size of the sequence and spatial localization in PT plates for insertion and deletion errors. These factors can be described by considering seven variables. No single variable can account for the error rate distribution, but most of the variation is explained by the combination of all seven variables. CONCLUSIONS: The pattern identified here calls for the use of internal controls and error-correcting base callers, to correct for errors, when available (e.g. when sequencing amplicons). For shotgun libraries, the use of both sequencing primers and deep coverage, combined with the use of random sequencing primer sites, should partly compensate for even high error rates, although it may prove more difficult than previously thought to distinguish between low-frequency alleles and errors.
BioMed Central 2011-05-19 /pmc/articles/PMC3116506/ /pubmed/21592414 http://dx.doi.org/10.1186/1471-2164-12-245 Text en Copyright ©2011 Gilles et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Gilles, André
Meglécz, Emese
Pech, Nicolas
Ferreira, Stéphanie
Malausa, Thibaut
Martin, Jean-François
Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing
title Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing
title_full Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing
title_fullStr Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing
title_full_unstemmed Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing
title_short Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing
title_sort accuracy and quality assessment of 454 gs-flx titanium pyrosequencing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3116506/
https://www.ncbi.nlm.nih.gov/pubmed/21592414
http://dx.doi.org/10.1186/1471-2164-12-245