
Generating Minimal Models of H1N1 NS1 Gene Sequences Using Alignment-Based and Alignment-Free Algorithms

For virus classification and tracing, one idea is to generate minimal models from the gene sequences of each virus group for comparative analysis within and between classes, as well as classification and tracing of new sequences. The starting point of defining a minimal model for a group of gene seq...


Bibliographic Details
Main Authors: Fang, Meng, Xu, Jiawei, Sun, Nan, Yau, Stephen S.-T.
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9858667/
https://www.ncbi.nlm.nih.gov/pubmed/36672928
http://dx.doi.org/10.3390/genes14010186
author Fang, Meng
Xu, Jiawei
Sun, Nan
Yau, Stephen S.-T.
collection PubMed
description For virus classification and tracing, one idea is to generate minimal models from the gene sequences of each virus group for comparative analysis within and between classes, as well as classification and tracing of new sequences. The starting point of defining a minimal model for a group of gene sequences is to find their longest common sequence (LCS), but this is a non-deterministic polynomial-time hard (NP-hard) problem. Therefore, we applied some heuristic approaches of finding LCS, as well as some of the newer methods of treating gene sequences, including multiple sequence alignment (MSA) and k-mer natural vector (NV) encoding. To evaluate our algorithms, a five-fold cross validation classification scheme on a dataset of H1N1 virus non-structural protein 1 (NS1) gene was analyzed. The results indicate that the MSA-based algorithm has the best performance measured by classification accuracy, while the NV-based algorithm exhibits advantages in the time complexity of generating minimal models.
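The abstract notes that finding a longest common sequence for a whole group is NP-hard and that heuristics were used instead. The sketch below illustrates one generic heuristic of that kind and is not taken from the article: it computes an exact pairwise longest common subsequence by dynamic programming, then folds that operation across the group. The function names (`pairwise_lcs`, `is_subsequence`, `greedy_common_subsequence`) and the toy sequences are assumptions made for illustration; the result is guaranteed to be common to every input, but not to be the longest such sequence.

```python
from functools import reduce


def pairwise_lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (O(len(a) * len(b)) DP)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover one optimal answer.
    out = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


def is_subsequence(small: str, big: str) -> bool:
    """True if small can be obtained from big by deleting characters."""
    it = iter(big)
    return all(ch in it for ch in small)


def greedy_common_subsequence(seqs):
    """Heuristic (assumed, not the authors' method): fold pairwise LCS over the group.

    The output is a subsequence of every input, but it may be shorter than
    the true longest common subsequence of the whole group.
    """
    if not seqs:
        return ""
    return reduce(pairwise_lcs, seqs)


if __name__ == "__main__":
    group = ["ATGGCA", "ATGCCA", "TTGGCA"]  # toy sequences, not real NS1 data
    model = greedy_common_subsequence(group)
    print(model, all(is_subsequence(model, s) for s in group))
```

Because the fold depends on the order of the inputs, different orderings can yield different candidates; trying several orderings and keeping the longest result is one simple refinement.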
format Online
Article
Text
id pubmed-9858667
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-98586672023-01-21 Generating Minimal Models of H1N1 NS1 Gene Sequences Using Alignment-Based and Alignment-Free Algorithms Fang, Meng Xu, Jiawei Sun, Nan Yau, Stephen S.-T. Genes (Basel) Article For virus classification and tracing, one idea is to generate minimal models from the gene sequences of each virus group for comparative analysis within and between classes, as well as classification and tracing of new sequences. The starting point of defining a minimal model for a group of gene sequences is to find their longest common sequence (LCS), but this is a non-deterministic polynomial-time hard (NP-hard) problem. Therefore, we applied some heuristic approaches of finding LCS, as well as some of the newer methods of treating gene sequences, including multiple sequence alignment (MSA) and k-mer natural vector (NV) encoding. To evaluate our algorithms, a five-fold cross validation classification scheme on a dataset of H1N1 virus non-structural protein 1 (NS1) gene was analyzed. The results indicate that the MSA-based algorithm has the best performance measured by classification accuracy, while the NV-based algorithm exhibits advantages in the time complexity of generating minimal models. MDPI 2023-01-10 /pmc/articles/PMC9858667/ /pubmed/36672928 http://dx.doi.org/10.3390/genes14010186 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Fang, Meng
Xu, Jiawei
Sun, Nan
Yau, Stephen S.-T.
Generating Minimal Models of H1N1 NS1 Gene Sequences Using Alignment-Based and Alignment-Free Algorithms
title Generating Minimal Models of H1N1 NS1 Gene Sequences Using Alignment-Based and Alignment-Free Algorithms
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9858667/
https://www.ncbi.nlm.nih.gov/pubmed/36672928
http://dx.doi.org/10.3390/genes14010186