SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform
BACKGROUND: Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. RESULTS: A new alignment-free sequence similarity analysis method, called SSAW, is proposed. SSAW stands for Sequence Similarity Analysis using th...
Main authors: | Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2018 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5930706/ https://www.ncbi.nlm.nih.gov/pubmed/29720081 http://dx.doi.org/10.1186/s12859-018-2155-9 |
_version_ | 1783319523594600448 |
---|---|
author | Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue |
author_facet | Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue |
author_sort | Lin, Jie |
collection | PubMed |
description | BACKGROUND: Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. RESULTS: A new alignment-free sequence similarity analysis method, called SSAW, is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. The resulting series of complex numbers is then transformed into feature vectors using the stationary discrete wavelet transform. After these steps, the original sequence has been turned into a feature vector with numeric values, which can then be used for clustering and/or classification. CONCLUSIONS: Using two different types of applications, namely clustering and classification, we compared SSAW against state-of-the-art alignment-free sequence analysis methods. SSAW demonstrates competitive or superior performance in terms of standard indicators, such as accuracy, F-score, precision, and recall. The running time was significantly better in most cases. These properties make SSAW a suitable method for sequence analysis, especially given the rapidly increasing volumes of sequence data required by most modern applications. |
format | Online Article Text |
id | pubmed-5930706 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5930706 2018-05-09 SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue BMC Bioinformatics Research Article BACKGROUND: Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. RESULTS: A new alignment-free sequence similarity analysis method, called SSAW, is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. The resulting series of complex numbers is then transformed into feature vectors using the stationary discrete wavelet transform. After these steps, the original sequence has been turned into a feature vector with numeric values, which can then be used for clustering and/or classification. CONCLUSIONS: Using two different types of applications, namely clustering and classification, we compared SSAW against state-of-the-art alignment-free sequence analysis methods. SSAW demonstrates competitive or superior performance in terms of standard indicators, such as accuracy, F-score, precision, and recall. The running time was significantly better in most cases. These properties make SSAW a suitable method for sequence analysis, especially given the rapidly increasing volumes of sequence data required by most modern applications.
BioMed Central 2018-05-02 /pmc/articles/PMC5930706/ /pubmed/29720081 http://dx.doi.org/10.1186/s12859-018-2155-9 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform |
title | SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5930706/ https://www.ncbi.nlm.nih.gov/pubmed/29720081 http://dx.doi.org/10.1186/s12859-018-2155-9 |
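
The pipeline summarized in the description field (k-mers, then complex numbers, then a stationary wavelet transform, then a fixed-length feature vector) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation. The record does not specify the k-mer-to-complex mapping, the wavelet, or the feature summary, so all three are assumptions here: k-mers are placed on the unit circle by their base-4 index, the wavelet is an undecimated Haar transform with periodic boundaries, and each sub-band is reduced to its mean energy. The function names `kmers`, `kmer_to_complex`, `haar_swt`, `features` and `distance` are hypothetical.

```python
import cmath
import math

ALPHABET = "ACGT"


def kmers(seq, k):
    """Return the overlapping k-mers of seq, skipping any with non-ACGT symbols."""
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return [w for w in windows if all(c in ALPHABET for c in w)]


def kmer_to_complex(kmer):
    """ASSUMED mapping (not from the paper): base-4 index placed on the unit circle."""
    index = 0
    for c in kmer:
        index = index * 4 + ALPHABET.index(c)
    angle = 2 * math.pi * index / (4 ** len(kmer))
    return cmath.exp(1j * angle)


def haar_swt(signal, levels):
    """Undecimated (stationary) Haar transform with periodic boundary handling.

    Returns the final approximation band and one detail band per level;
    every band has the same length as the input signal.
    """
    n = len(signal)
    approx = list(signal)
    details = []
    for level in range(levels):
        step = 2 ** level  # filter taps spread further apart at each level
        pairs = [(approx[i], approx[(i + step) % n]) for i in range(n)]
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


def features(seq, k=3, levels=2):
    """Fixed-length feature vector: mean energy of each detail band, then the approximation band."""
    signal = [kmer_to_complex(m) for m in kmers(seq, k)]
    if not signal:
        raise ValueError("sequence is shorter than k or contains no valid k-mers")
    approx, details = haar_swt(signal, levels)
    return [sum(abs(c) ** 2 for c in band) / len(signal) for band in details + [approx]]


def distance(seq_a, seq_b, k=3, levels=2):
    """Euclidean distance between two feature vectors (smaller means more similar)."""
    return math.dist(features(seq_a, k, levels), features(seq_b, k, levels))
```

Because every sub-band is reduced to a single number, sequences of different lengths yield vectors of the same length (`levels + 1`), which is what makes them directly usable as input to a clustering or classification routine. Replacing the assumed mapping or energy summary with the choices made in the article would change the values but not the overall shape of the pipeline.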