Variant site strain typer (VaST): efficient strain typing using a minimal number of variant genomic sites

BACKGROUND: Targeted PCR amplicon sequencing (TAS) techniques provide a sensitive, scalable, and cost-effective way to query and identify closely related bacterial species and strains. Typically, this is accomplished by targeting housekeeping genes that provide resolution down to the family, genus,...

Bibliographic Details
Main authors: Furstenau, Tara N., Cocking, Jill H., Sahl, Jason W., Fofanov, Viacheslav Y.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5996513/
https://www.ncbi.nlm.nih.gov/pubmed/29890941
http://dx.doi.org/10.1186/s12859-018-2225-z
author Furstenau, Tara N.
Cocking, Jill H.
Sahl, Jason W.
Fofanov, Viacheslav Y.
collection PubMed
description BACKGROUND: Targeted PCR amplicon sequencing (TAS) techniques provide a sensitive, scalable, and cost-effective way to query and identify closely related bacterial species and strains. Typically, this is accomplished by targeting housekeeping genes that provide resolution down to the family, genus, and sometimes species level. Unfortunately, this level of resolution is not sufficient in many applications where strain-level identification of bacteria is required (biodefense, forensics, clinical diagnostics, and outbreak investigations). Adding more genomic targets will increase the resolution, but the challenge is identifying the appropriate targets. VaST was developed to address this challenge by finding the minimum number of targets that, in combination, achieve maximum strain-level resolution for any strain complex. The final combination of target regions identified by the algorithm produces a unique haplotype for each strain, which can be used as a fingerprint for identifying unknown samples in a TAS assay. VaST ensures that the targets have conserved primer regions so that they can be amplified in all of the known strains, and it also favors the inclusion of targets with basal variants, which makes the set more robust when identifying previously unseen strains. RESULTS: We analyzed VaST’s performance using a number of different pathogenic species that are relevant to human disease outbreaks and biodefense. Full resolution required 20 to 88% fewer targets than would be needed in the worst case, and most of the resolution was achieved within the first 20 targets. We computationally and experimentally validated one of the VaST panels and found that the targets led to accurate phylogenetic placement of strains, even when the strains were not a part of the original panel design.
CONCLUSIONS: VaST is open source software that, when provided a set of variant sites, can find the minimum number of sites that will provide maximum resolution of a strain complex, and it has many different run-time options that can accommodate a wide range of applications. VaST can be an effective tool in the design of strain identification panels that, when combined with TAS technologies, offer an efficient and inexpensive strain typing protocol.
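The selection problem described in the abstract (pick few variant sites so that every strain gets a unique haplotype) can be illustrated with a small sketch. This is not VaST's code or its actual algorithm: it is a plain greedy heuristic, assumed here for illustration only, that repeatedly picks the site separating the most still-indistinguishable strain pairs. Greedy selection does not guarantee the true minimum, and the sketch ignores primer conservation and the preference for basal variants mentioned above. The function names and the toy strain matrix are hypothetical.

```python
from itertools import combinations


def greedy_site_selection(matrix):
    """Pick variant sites until every pair of strains differs at some chosen site.

    matrix maps strain name -> sequence of alleles, one per candidate site.
    Returns (chosen site indices, strain pairs that no site can separate).
    """
    strains = list(matrix)
    n_sites = len(next(iter(matrix.values())))
    unresolved = set(combinations(strains, 2))  # pairs not yet told apart
    chosen = []
    while unresolved:
        best_site, best_split = None, set()
        for site in range(n_sites):
            if site in chosen:
                continue
            # Pairs this site would separate among those still unresolved.
            split = {(a, b) for a, b in unresolved if matrix[a][site] != matrix[b][site]}
            if len(split) > len(best_split):
                best_site, best_split = site, split
        if best_site is None:
            break  # remaining pairs are identical at every candidate site
        chosen.append(best_site)
        unresolved -= best_split
    return chosen, unresolved


def haplotypes(matrix, chosen):
    """Fingerprint of each strain: its alleles at the chosen sites, in order."""
    return {name: "".join(seq[i] for i in chosen) for name, seq in matrix.items()}


# Toy data (hypothetical): four strains, four candidate variant sites.
example = {
    "strain_A": "AAAA",
    "strain_B": "AATA",
    "strain_C": "GATA",
    "strain_D": "GACA",
}

if __name__ == "__main__":
    sites, leftover = greedy_site_selection(example)
    print(sites, leftover)
    print(haplotypes(example, sites))
```

In the toy matrix, two of the four sites are enough to give each strain a distinct fingerprint, whereas a worst-case set of four strains could need three.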
format Online
Article
Text
id pubmed-5996513
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5996513 2018-06-25 BMC Bioinformatics Software BioMed Central 2018-06-11 /pmc/articles/PMC5996513/ /pubmed/29890941 http://dx.doi.org/10.1186/s12859-018-2225-z Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Variant site strain typer (VaST): efficient strain typing using a minimal number of variant genomic sites
topic Software