Hidden Markov Model: a shortest unique representative approach to detect the protein toxins, virulence factors and antibiotic resistance genes
OBJECTIVE: Currently, next generation sequencing (NGS) is widely used to decode potential novel or variant pathogens both in emergent outbreaks and in routine clinical practice. However, the efficient identification of novel or diverged pathogenomic compositions remains a big challenge. It is especi...
Main authors: | Xie, Gary; Fair, Jeanne M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8011099/ https://www.ncbi.nlm.nih.gov/pubmed/33785071 http://dx.doi.org/10.1186/s13104-021-05531-w |
_version_ | 1783673179830484992 |
---|---|
author | Xie, Gary Fair, Jeanne M. |
author_facet | Xie, Gary Fair, Jeanne M. |
author_sort | Xie, Gary |
collection | PubMed |
description | OBJECTIVE: Next-generation sequencing (NGS) is now widely used to decode potentially novel or variant pathogens, both in emergent outbreaks and in routine clinical practice. However, efficiently identifying novel or diverged pathogenomic compositions remains a major challenge. This is especially true for short DNA sequence fragments from NGS, because sequence similarity searching is vulnerable to false negatives and false positives, such as mismatches or matches with unrelated proteins. This study therefore aimed to establish a bioinformatics approach that generates unique motif sequences for profile searching, resulting in high specificity and sensitivity. RESULTS: We introduce a Shortest Unique Representative Hidden Markov Model (HMM) approach to identify bacterial toxin, virulence factor (VF), and antimicrobial resistance (AR) genes in short sequence reads. We first construct unique representative domain sequences of toxin genes, VFs, and ARs to avoid potential false positives, and then use HMMs to accurately identify potential toxin, VF, and AR fragments. The benchmark shows that this approach can achieve relatively high specificity and sensitivity if an appropriate cutoff value is applied. The approach can be used to recognize the protein sequences of known toxins and pathogens, identify their common characteristics, and then search for similar sequences in other organisms. |
format | Online Article Text |
id | pubmed-8011099 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8011099 2021-03-31 Hidden Markov Model: a shortest unique representative approach to detect the protein toxins, virulence factors and antibiotic resistance genes Xie, Gary Fair, Jeanne M. BMC Res Notes Research Note OBJECTIVE: Next-generation sequencing (NGS) is now widely used to decode potentially novel or variant pathogens, both in emergent outbreaks and in routine clinical practice. However, efficiently identifying novel or diverged pathogenomic compositions remains a major challenge. This is especially true for short DNA sequence fragments from NGS, because sequence similarity searching is vulnerable to false negatives and false positives, such as mismatches or matches with unrelated proteins. This study therefore aimed to establish a bioinformatics approach that generates unique motif sequences for profile searching, resulting in high specificity and sensitivity. RESULTS: We introduce a Shortest Unique Representative Hidden Markov Model (HMM) approach to identify bacterial toxin, virulence factor (VF), and antimicrobial resistance (AR) genes in short sequence reads. We first construct unique representative domain sequences of toxin genes, VFs, and ARs to avoid potential false positives, and then use HMMs to accurately identify potential toxin, VF, and AR fragments. The benchmark shows that this approach can achieve relatively high specificity and sensitivity if an appropriate cutoff value is applied. The approach can be used to recognize the protein sequences of known toxins and pathogens, identify their common characteristics, and then search for similar sequences in other organisms.
BioMed Central 2021-03-30 /pmc/articles/PMC8011099/ /pubmed/33785071 http://dx.doi.org/10.1186/s13104-021-05531-w Text en © The Author(s) 2021. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Note Xie, Gary Fair, Jeanne M. Hidden Markov Model: a shortest unique representative approach to detect the protein toxins, virulence factors and antibiotic resistance genes |
title | Hidden Markov Model: a shortest unique representative approach to detect the protein toxins, virulence factors and antibiotic resistance genes |
title_sort | hidden markov model: a shortest unique representative approach to detect the protein toxins, virulence factors and antibiotic resistance genes |
topic | Research Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8011099/ https://www.ncbi.nlm.nih.gov/pubmed/33785071 http://dx.doi.org/10.1186/s13104-021-05531-w |
work_keys_str_mv | AT xiegary hiddenmarkovmodelashortestuniquerepresentativeapproachtodetecttheproteintoxinsvirulencefactorsandantibioticresistancegenes AT fairjeannem hiddenmarkovmodelashortestuniquerepresentativeapproachtodetecttheproteintoxinsvirulencefactorsandantibioticresistancegenes |
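
The description above notes that specificity and sensitivity depend on choosing an appropriate cutoff value. As a rough illustration of that trade-off only, here is a minimal Python sketch. It is not the authors' benchmark or pipeline: the function name `sensitivity_specificity`, the scores, and the labels are all hypothetical, and the sketch simply assumes each read has already been assigned a numeric match score by some profile search.

```python
from typing import List, Tuple


def sensitivity_specificity(
    scores: List[float], labels: List[bool], cutoff: float
) -> Tuple[float, float]:
    """Call a read positive when its score is >= cutoff, then compare the
    calls with the known labels (True = real toxin/VF/AR fragment)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s >= cutoff)       # true positives
    fn = sum(1 for s, y in pairs if y and s < cutoff)        # false negatives
    tn = sum(1 for s, y in pairs if not y and s < cutoff)    # true negatives
    fp = sum(1 for s, y in pairs if not y and s >= cutoff)   # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical match scores for six reads (made-up numbers, not from the paper).
scores = [42.0, 35.5, 12.0, 30.0, 8.5, 5.0]
labels = [True, True, True, False, False, False]

for cutoff in (10.0, 20.0, 32.0):
    sens, spec = sensitivity_specificity(scores, labels, cutoff)
    print(f"cutoff={cutoff:5.1f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

With these made-up numbers, the lowest cutoff recovers every true fragment but admits one false positive, while the highest cutoff removes the false positive at the cost of missing one true fragment.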