Cluster oligonucleotide signatures for rapid identification by sequencing
BACKGROUND: Oligonucleotide signatures (signatures) have been widely used for studying microbial diversity and function in wet-lab settings, but using them for accurate in silico identification of organisms from high-throughput sequencing (HTS) data is only a proof of concept. Existing signature des...
Main Authors: | Zahariev, Manuel; Chen, Wen; Visagie, Cobus M.; Lévesque, C. André |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6284311/ https://www.ncbi.nlm.nih.gov/pubmed/30522439 http://dx.doi.org/10.1186/s12859-018-2363-3 |
_version_ | 1783379314330304512 |
---|---|
author | Zahariev, Manuel Chen, Wen Visagie, Cobus M. Lévesque, C. André |
author_facet | Zahariev, Manuel Chen, Wen Visagie, Cobus M. Lévesque, C. André |
author_sort | Zahariev, Manuel |
collection | PubMed |
description | BACKGROUND: Oligonucleotide signatures (signatures) have been widely used for studying microbial diversity and function in wet-lab settings, but using them for accurate in silico identification of organisms from high-throughput sequencing (HTS) data is only a proof of concept. Existing signature design programs for sequence signatures (signatures matching exactly one sequence) or clade signatures (signatures matching every sequence in a phylogenetic clade) are not able to identify all possible polymorphic sites for sequences with high similarity and perform poorly when handling large genome sequencing datasets. RESULTS: We introduce cluster signatures: subsequences that match perfectly and exclusively any group of sequences in a data set. Cluster signatures provide complete recall for primer/probe design and increased discrimination between sequences beyond that of clade signatures. Using cluster signatures for in silico identification of HTS targets achieves good precision/recall and running time performance. This method has been implemented into an open source tool, the Automated Oligonucleotide Design Pipeline (aodp), included in supplementary material and available at: https://bitbucket.org/wenchen_aafc/aodp_v2.0_release. CONCLUSIONS: Cluster signatures provide a rapid and universal analysis tool to identify all possible short diagnostic DNA markers and variants from any DNA sequencing dataset. They are particularly useful in discriminating genetic material from closely related organisms and in detecting deleterious mutations in highly or perfectly conserved genomic sites. |
format | Online Article Text |
id | pubmed-6284311 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6284311 2018-12-14 Cluster oligonucleotide signatures for rapid identification by sequencing Zahariev, Manuel Chen, Wen Visagie, Cobus M. Lévesque, C. André BMC Bioinformatics Research Article BioMed Central 2018-10-29 /pmc/articles/PMC6284311/ /pubmed/30522439 http://dx.doi.org/10.1186/s12859-018-2363-3 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Zahariev, Manuel Chen, Wen Visagie, Cobus M. Lévesque, C. André Cluster oligonucleotide signatures for rapid identification by sequencing |
title | Cluster oligonucleotide signatures for rapid identification by sequencing |
title_full | Cluster oligonucleotide signatures for rapid identification by sequencing |
title_fullStr | Cluster oligonucleotide signatures for rapid identification by sequencing |
title_full_unstemmed | Cluster oligonucleotide signatures for rapid identification by sequencing |
title_short | Cluster oligonucleotide signatures for rapid identification by sequencing |
title_sort | cluster oligonucleotide signatures for rapid identification by sequencing |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6284311/ https://www.ncbi.nlm.nih.gov/pubmed/30522439 http://dx.doi.org/10.1186/s12859-018-2363-3 |
work_keys_str_mv | AT zaharievmanuel clusteroligonucleotidesignaturesforrapididentificationbysequencing AT chenwen clusteroligonucleotidesignaturesforrapididentificationbysequencing AT visagiecobusm clusteroligonucleotidesignaturesforrapididentificationbysequencing AT levesquecandre clusteroligonucleotidesignaturesforrapididentificationbysequencing |
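
The abstract defines a cluster signature as a subsequence that matches, perfectly and exclusively, some group of sequences in a data set. The following is a minimal sketch of that definition only, not the aodp implementation: it assumes fixed-length signatures (k-mers), a single strand with no reverse complements, and exact string matching. The function name `cluster_signatures` and the toy sequences are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def cluster_signatures(sequences, k):
    """Group k-mers by the exact set of sequences that contain them.

    Returns a dict mapping each cluster (a frozenset of sequence names)
    to the k-mers present in every member of that cluster and absent
    from every other sequence in the data set.
    Assumption: naive enumeration, suitable only for small inputs.
    """
    owners = defaultdict(set)  # k-mer -> names of sequences containing it
    for name, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)

    clusters = defaultdict(set)  # cluster -> its exclusive k-mers
    for kmer, names in owners.items():
        clusters[frozenset(names)].add(kmer)
    return dict(clusters)


# Hypothetical toy data set of three short sequences.
toy = {"s1": "ACGTAC", "s2": "ACGTTC", "s3": "GGGTTC"}
sigs = cluster_signatures(toy, 4)
for cluster, kmers in sigs.items():
    print(sorted(cluster), sorted(kmers))
```

In this toy example a cluster containing a single sequence corresponds to what the abstract calls a sequence signature, while a cluster such as `{s1, s2}` or `{s2, s3}` need not form a phylogenetic clade, which is the extra discrimination the abstract attributes to cluster signatures.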