Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data

BACKGROUND: Next-generation sequencing (NGS) technologies have considerably accelerated the investigation into the composition of genomes and their functions. Genotyping-by-sequencing (GBS) is a genotyping approach that makes use of NGS to rapidly and economically scan a genome. It has been shown to...

Bibliographic Details
Main Authors: Torkamaneh, Davoud; Laroche, Jérôme; Bastien, Maxime; Abed, Amina; Belzile, François
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210301/
https://www.ncbi.nlm.nih.gov/pubmed/28049422
http://dx.doi.org/10.1186/s12859-016-1431-9
author Torkamaneh, Davoud
Laroche, Jérôme
Bastien, Maxime
Abed, Amina
Belzile, François
collection PubMed
description BACKGROUND: Next-generation sequencing (NGS) technologies have considerably accelerated the investigation into the composition of genomes and their functions. Genotyping-by-sequencing (GBS) is a genotyping approach that makes use of NGS to rapidly and economically scan a genome. It has been shown to allow the simultaneous discovery and genotyping of thousands to millions of SNPs across a wide range of species. For most users, the main challenge in GBS is the bioinformatics analysis of the large amount of sequence information derived from sequencing GBS libraries in view of calling alleles at SNP loci. Herein we describe a new GBS bioinformatics pipeline, Fast-GBS, designed to provide highly accurate genotyping, to require modest computing resources and to offer ease of use. RESULTS: Fast-GBS is built upon standard bioinformatics language and file formats, is capable of handling data from different sequencing platforms, and is capable of detecting different kinds of variants (SNPs, MNPs, and indels). To illustrate its performance, we called variants in three collections of samples (soybean, barley, and potato) that cover a range of different genome sizes, levels of genome complexity, and ploidy. Within these small sets of samples, we called 35 k, 32 k and 38 k SNPs for soybean, barley and potato, respectively. To assess genotype accuracy, we compared these GBS-derived SNP genotypes with independent data sets obtained from whole-genome sequencing or SNP arrays. This analysis yielded estimated accuracies of 98.7%, 95.2%, and 94% for soybean, barley, and potato, respectively. CONCLUSIONS: We conclude that Fast-GBS provides a highly efficient and reliable tool for calling SNPs from GBS data.
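The accuracy figures in the description come from comparing GBS-derived genotypes with an independent data set. The record does not give the authors' exact procedure, so the following is only a minimal sketch of one common way to compute such a concordance rate. The function name `genotype_concordance`, the `(sample, locus)` keys, and the VCF-style genotype strings such as `0/1` are assumptions made for illustration and are not taken from Fast-GBS.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: (sample ID, SNP locus ID). Not a Fast-GBS data structure.
CallKey = Tuple[str, str]


def _alleles(genotype: str) -> list:
    """Split a VCF-style genotype such as '0/1' or '1|0' into sorted alleles."""
    return sorted(genotype.replace("|", "/").split("/"))


def genotype_concordance(
    gbs_calls: Dict[CallKey, str],
    reference_calls: Dict[CallKey, str],
    missing: str = "./.",
) -> Optional[float]:
    """Fraction of shared, non-missing calls on which the two data sets agree.

    Allele order and phasing are ignored, so '0/1' and '1|0' count as a match.
    Returns None when no call can be compared.
    """
    compared = 0
    matching = 0
    for key, gbs_gt in gbs_calls.items():
        ref_gt = reference_calls.get(key)
        # Skip calls absent from the reference or missing in either data set.
        if ref_gt is None or gbs_gt == missing or ref_gt == missing:
            continue
        compared += 1
        if _alleles(gbs_gt) == _alleles(ref_gt):
            matching += 1
    return matching / compared if compared else None


if __name__ == "__main__":
    # Toy data, invented for this example only.
    gbs = {("s1", "snp1"): "0/1", ("s1", "snp2"): "1/1",
           ("s2", "snp1"): "./.", ("s2", "snp2"): "0/0"}
    ref = {("s1", "snp1"): "1/0", ("s1", "snp2"): "0/1",
           ("s2", "snp1"): "0/0", ("s2", "snp2"): "0/0"}
    print(genotype_concordance(gbs, ref))
```

Because genotypes are compared as sorted allele lists, the same sketch also handles polyploid calls written with more than two alleles (for example `0/0/1/1`), provided both data sets use the same encoding.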
format Online
Article
Text
id pubmed-5210301
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5210301 2017-01-06 Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data Torkamaneh, Davoud Laroche, Jérôme Bastien, Maxime Abed, Amina Belzile, François BMC Bioinformatics Software
BioMed Central 2017-01-03 /pmc/articles/PMC5210301/ /pubmed/28049422 http://dx.doi.org/10.1186/s12859-016-1431-9 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210301/
https://www.ncbi.nlm.nih.gov/pubmed/28049422
http://dx.doi.org/10.1186/s12859-016-1431-9