
ngs_backbone: a pipeline for read cleaning, mapping and SNP calling using Next Generation Sequence

BACKGROUND: The possibilities offered by next generation sequencing (NGS) platforms are revolutionizing biotechnological laboratories. Moreover, the combination of NGS sequencing and affordable high-throughput genotyping technologies is facilitating the rapid discovery and use of SNPs in non-model s...


Bibliographic Details
Main authors: Blanca, Jose M; Pascual, Laura; Ziarsolo, Peio; Nuez, Fernando; Cañizares, Joaquin
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3124440/
https://www.ncbi.nlm.nih.gov/pubmed/21635747
http://dx.doi.org/10.1186/1471-2164-12-285
author Blanca, Jose M
Pascual, Laura
Ziarsolo, Peio
Nuez, Fernando
Cañizares, Joaquin
collection PubMed
description BACKGROUND: The possibilities offered by next generation sequencing (NGS) platforms are revolutionizing biotechnological laboratories. Moreover, the combination of NGS sequencing and affordable high-throughput genotyping technologies is facilitating the rapid discovery and use of SNPs in non-model species. However, this abundance of sequences and polymorphisms creates new software needs. To fulfill these needs, we have developed a powerful, yet easy-to-use application. RESULTS: The ngs_backbone software is a parallel pipeline capable of analyzing Sanger, 454, Illumina and SOLiD (Sequencing by Oligonucleotide Ligation and Detection) sequence reads. Its main supported analyses are: read cleaning, transcriptome assembly and annotation, read mapping and single nucleotide polymorphism (SNP) calling and selection. In order to build a truly useful tool, the software development was paired with a laboratory experiment. All public tomato Sanger EST reads plus 14.2 million Illumina reads were employed to test the tool and predict polymorphism in tomato. The cleaned reads were mapped to the SGN tomato transcriptome obtaining a coverage of 4.2 for Sanger and 8.5 for Illumina. 23,360 single nucleotide variations (SNVs) were predicted. A total of 76 SNVs were experimentally validated, and 85% were found to be real. CONCLUSIONS: ngs_backbone is a new software package capable of analyzing sequences produced by NGS technologies and predicting SNVs with great accuracy. In our tomato example, we created a highly polymorphic collection of SNVs that will be a useful resource for tomato researchers and breeders. The software developed along with its documentation is freely available under the AGPL license and can be downloaded from http://bioinf.comav.upv.es/ngs_backbone/ or http://github.com/JoseBlanca/franklin.
format Online
Article
Text
id pubmed-3124440
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3124440 2011-06-28 ngs_backbone: a pipeline for read cleaning, mapping and SNP calling using Next Generation Sequence Blanca, Jose M Pascual, Laura Ziarsolo, Peio Nuez, Fernando Cañizares, Joaquin BMC Genomics Software BioMed Central 2011-06-02 /pmc/articles/PMC3124440/ /pubmed/21635747 http://dx.doi.org/10.1186/1471-2164-12-285 Text en Copyright ©2011 Blanca et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title ngs_backbone: a pipeline for read cleaning, mapping and SNP calling using Next Generation Sequence
topic Software