Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology

Bibliographic Details
Main Authors: Otto, Thomas D., Sanders, Mandy, Berriman, Matthew, Newbold, Chris
Format: Text
Language: English
Published: Oxford University Press 2010
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2894513/
https://www.ncbi.nlm.nih.gov/pubmed/20562415
http://dx.doi.org/10.1093/bioinformatics/btq269
Description
Summary: Motivation: The accuracy of reference genomes is important for downstream analysis but a low error rate requires expensive manual interrogation of the sequence. Here, we describe a novel algorithm (Iterative Correction of Reference Nucleotides) that iteratively aligns deep coverage of short sequencing reads to correct errors in reference genome sequences and evaluate their accuracy. Results: Using Plasmodium falciparum (81% A + T content) as an extreme example, we show that the algorithm is highly accurate and corrects over 2000 errors in the reference sequence. We give examples of its application to numerous other eukaryotic and prokaryotic genomes and suggest additional applications. Availability: The software is available at http://icorn.sourceforge.net Contact: tdo@sanger.ac.uk; cnewbold@hammer.imm.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.