Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data
BACKGROUND: Next-generation sequencing (NGS) technologies have accelerated considerably the investigation into the composition of genomes and their functions. Genotyping-by-sequencing (GBS) is a genotyping approach that makes use of NGS to rapidly and economically scan a genome. It has been shown to...
Main Authors: | Torkamaneh, Davoud; Laroche, Jérôme; Bastien, Maxime; Abed, Amina; Belzile, François |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210301/ https://www.ncbi.nlm.nih.gov/pubmed/28049422 http://dx.doi.org/10.1186/s12859-016-1431-9 |
_version_ | 1782490856722268160 |
---|---|
author | Torkamaneh, Davoud Laroche, Jérôme Bastien, Maxime Abed, Amina Belzile, François |
author_facet | Torkamaneh, Davoud Laroche, Jérôme Bastien, Maxime Abed, Amina Belzile, François |
author_sort | Torkamaneh, Davoud |
collection | PubMed |
description | BACKGROUND: Next-generation sequencing (NGS) technologies have accelerated considerably the investigation into the composition of genomes and their functions. Genotyping-by-sequencing (GBS) is a genotyping approach that makes use of NGS to rapidly and economically scan a genome. It has been shown to allow the simultaneous discovery and genotyping of thousands to millions of SNPs across a wide range of species. For most users, the main challenge in GBS is the bioinformatics analysis of the large amount of sequence information derived from sequencing GBS libraries in view of calling alleles at SNP loci. Herein we describe a new GBS bioinformatics pipeline, Fast-GBS, designed to provide highly accurate genotyping, to require modest computing resources and to offer ease of use. RESULTS: Fast-GBS is built upon standard bioinformatics language and file formats, is capable of handling data from different sequencing platforms, is capable of detecting different kinds of variants (SNPs, MNPs, and Indels). To illustrate its performance, we called variants in three collections of samples (soybean, barley, and potato) that cover a range of different genome sizes, levels of genome complexity, and ploidy. Within these small sets of samples, we called 35 k, 32 k and 38 k SNPs for soybean, barley and potato, respectively. To assess genotype accuracy, we compared these GBS-derived SNP genotypes with independent data sets obtained from whole-genome sequencing or SNP arrays. This analysis yielded estimated accuracies of 98.7, 95.2, and 94% for soybean, barley, and potato, respectively. CONCLUSIONS: We conclude that Fast-GBS provides a highly efficient and reliable tool for calling SNPs from GBS data. |
format | Online Article Text |
id | pubmed-5210301 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-52103012017-01-06 Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data Torkamaneh, Davoud Laroche, Jérôme Bastien, Maxime Abed, Amina Belzile, François BMC Bioinformatics Software BACKGROUND: Next-generation sequencing (NGS) technologies have accelerated considerably the investigation into the composition of genomes and their functions. Genotyping-by-sequencing (GBS) is a genotyping approach that makes use of NGS to rapidly and economically scan a genome. It has been shown to allow the simultaneous discovery and genotyping of thousands to millions of SNPs across a wide range of species. For most users, the main challenge in GBS is the bioinformatics analysis of the large amount of sequence information derived from sequencing GBS libraries in view of calling alleles at SNP loci. Herein we describe a new GBS bioinformatics pipeline, Fast-GBS, designed to provide highly accurate genotyping, to require modest computing resources and to offer ease of use. RESULTS: Fast-GBS is built upon standard bioinformatics language and file formats, is capable of handling data from different sequencing platforms, is capable of detecting different kinds of variants (SNPs, MNPs, and Indels). To illustrate its performance, we called variants in three collections of samples (soybean, barley, and potato) that cover a range of different genome sizes, levels of genome complexity, and ploidy. Within these small sets of samples, we called 35 k, 32 k and 38 k SNPs for soybean, barley and potato, respectively. To assess genotype accuracy, we compared these GBS-derived SNP genotypes with independent data sets obtained from whole-genome sequencing or SNP arrays. This analysis yielded estimated accuracies of 98.7, 95.2, and 94% for soybean, barley, and potato, respectively. CONCLUSIONS: We conclude that Fast-GBS provides a highly efficient and reliable tool for calling SNPs from GBS data. 
BioMed Central 2017-01-03 /pmc/articles/PMC5210301/ /pubmed/28049422 http://dx.doi.org/10.1186/s12859-016-1431-9 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Torkamaneh, Davoud Laroche, Jérôme Bastien, Maxime Abed, Amina Belzile, François Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data |
title | Fast-GBS: a new pipeline for the efficient and highly accurate calling of SNPs from genotyping-by-sequencing data |
title_sort | fast-gbs: a new pipeline for the efficient and highly accurate calling of snps from genotyping-by-sequencing data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210301/ https://www.ncbi.nlm.nih.gov/pubmed/28049422 http://dx.doi.org/10.1186/s12859-016-1431-9 |