VCF2CAPS–A high-throughput CAPS marker design from VCF files and its test-use on a genotyping-by-sequencing (GBS) dataset

Bibliographic Details
Main authors: Wesołowski, Wojciech; Domnicz, Beata; Augustynowicz, Joanna; Szklarczyk, Marek
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8186816/
https://www.ncbi.nlm.nih.gov/pubmed/34014924
http://dx.doi.org/10.1371/journal.pcbi.1008980
author Wesołowski, Wojciech
Domnicz, Beata
Augustynowicz, Joanna
Szklarczyk, Marek
collection PubMed
description Next-generation sequencing (NGS) is a powerful tool for large-scale detection of DNA sequence variants such as single nucleotide polymorphisms (SNPs), multi-nucleotide polymorphisms (MNPs) and insertions/deletions (indels). For routine screening of numerous samples, these variants are often converted into cleaved amplified polymorphic sequence (CAPS) markers, which are based on the presence versus absence of restriction sites within PCR products. Current computational tools for SNP-to-CAPS conversion are limited and usually impractical for large datasets such as those generated with NGS. Moreover, no tool is available for large-scale conversion of MNPs and indels into CAPS markers. Here, we present VCF2CAPS, new software that identifies restriction endonucleases recognizing SNP/MNP/indel-containing sequences from NGS experiments. Additionally, the program contains filtering utilities not available in other SNP-to-CAPS converters: selection of markers with a single polymorphic cut site within a user-specified sequence length, and selection of markers that differentiate up to three user-defined groups of individuals from the analyzed population. The performance of VCF2CAPS was tested on a thoroughly analyzed dataset from a genotyping-by-sequencing (GBS) experiment. A selection of CAPS markers picked by the program was subjected to experimental verification. CAPS markers, also referred to as PCR-RFLPs, are among the basic tools used in plant, animal and human genetics. Our new software, VCF2CAPS, fills a gap in the current inventory of genetic software by enabling high-throughput CAPS marker design from next-generation sequencing (NGS) data. The program should be of interest to geneticists involved in molecular diagnostics. In this paper we show a successful example application of VCF2CAPS, and we believe that its usefulness is guaranteed by the growing availability of NGS services.
format Online
Article
Text
id pubmed-8186816
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8186816 2021-06-16 VCF2CAPS–A high-throughput CAPS marker design from VCF files and its test-use on a genotyping-by-sequencing (GBS) dataset. Wesołowski, Wojciech; Domnicz, Beata; Augustynowicz, Joanna; Szklarczyk, Marek. PLoS Comput Biol. Research Article. Public Library of Science 2021-05-20 /pmc/articles/PMC8186816/ /pubmed/34014924 http://dx.doi.org/10.1371/journal.pcbi.1008980 Text en © 2021 Wesołowski et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title VCF2CAPS–A high-throughput CAPS marker design from VCF files and its test-use on a genotyping-by-sequencing (GBS) dataset
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8186816/
https://www.ncbi.nlm.nih.gov/pubmed/34014924
http://dx.doi.org/10.1371/journal.pcbi.1008980