
UGbS-Flex, a novel bioinformatics pipeline for imputation-free SNP discovery in polyploids without a reference genome: finger millet as a case study

BACKGROUND: Research on orphan crops is often hindered by a lack of genomic resources. With the advent of affordable sequencing technologies, genotyping an entire genome or, for large-genome species, a representative fraction of the genome has become feasible for any crop. Nevertheless, most genotyp...


Bibliographic Details
Main authors: Qi, Peng; Gimode, Davis; Saha, Dipnarayan; Schröder, Stephan; Chakraborty, Debkanta; Wang, Xuewen; Dida, Mathews M.; Malmberg, Russell L.; Devos, Katrien M.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6003085/
https://www.ncbi.nlm.nih.gov/pubmed/29902967
http://dx.doi.org/10.1186/s12870-018-1316-3
author Qi, Peng
Gimode, Davis
Saha, Dipnarayan
Schröder, Stephan
Chakraborty, Debkanta
Wang, Xuewen
Dida, Mathews M.
Malmberg, Russell L.
Devos, Katrien M.
author_sort Qi, Peng
collection PubMed
description BACKGROUND: Research on orphan crops is often hindered by a lack of genomic resources. With the advent of affordable sequencing technologies, genotyping an entire genome or, for large-genome species, a representative fraction of the genome has become feasible for any crop. Nevertheless, most genotyping-by-sequencing (GBS) methods are geared towards obtaining large numbers of markers at low sequence depth, which excludes their application in heterozygous individuals. Furthermore, bioinformatics pipelines often lack the flexibility to deal with paired-end reads or to be applied in polyploid species. RESULTS: UGbS-Flex combines publicly available software with in-house Python and Perl scripts to efficiently call SNPs from genotyping-by-sequencing reads irrespective of the species’ ploidy level, breeding system and availability of a reference genome. Noteworthy features of the UGbS-Flex pipeline are an ability to use paired-end reads as input, an effective approach to cluster reads across samples with enhanced outputs, and maximization of SNP calling. We demonstrate use of the pipeline for the identification of several thousand high-confidence SNPs with high representation across samples in an F(3)-derived F(2) population in the allotetraploid finger millet. Robust high-density genetic maps were constructed using the time-tested mapping program MAPMAKER, which we upgraded to run efficiently and in a semi-automated manner in a Windows Command Prompt environment. We exploited comparative GBS with one of the diploid ancestors of finger millet to assign linkage groups to subgenomes and demonstrate the presence of chromosomal rearrangements. CONCLUSIONS: The paper combines GBS protocol modifications, a novel flexible GBS analysis pipeline, UGbS-Flex, recommendations to maximize SNP identification, updated genetic mapping software, and the first high-density maps of finger millet. The modules used in the UGbS-Flex pipeline and for genetic mapping were applied to finger millet, an allotetraploid selfing species without a reference genome, as a case study. The UGbS-Flex modules, which can be run independently, are easily transferable to species with other breeding systems or ploidy levels. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12870-018-1316-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6003085
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6003085 2018-07-06 BMC Plant Biol Methodology BioMed Central 2018-06-15 /pmc/articles/PMC6003085/ /pubmed/29902967 http://dx.doi.org/10.1186/s12870-018-1316-3 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title UGbS-Flex, a novel bioinformatics pipeline for imputation-free SNP discovery in polyploids without a reference genome: finger millet as a case study
topic Methodology