Solenoid and non-solenoid protein recognition using stationary wavelet packet transform

Motivation: Solenoid proteins are emerging as a protein class with properties intermediate between structured and intrinsically unstructured proteins. Containing repeating structural units, solenoid proteins are expected to share sequence similarities. However, in many cases, the sequence similariti...

Bibliographic Details
Main authors: Vo, An; Nguyen, Nha; Huang, Heng
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935422/
https://www.ncbi.nlm.nih.gov/pubmed/20823309
http://dx.doi.org/10.1093/bioinformatics/btq371
author Vo, An
Nguyen, Nha
Huang, Heng
collection PubMed
description Motivation: Solenoid proteins are emerging as a protein class with properties intermediate between structured and intrinsically unstructured proteins. Containing repeating structural units, solenoid proteins are expected to share sequence similarities. However, in many cases, the sequence similarities are weak and undetectable. Moreover, solenoids can be degenerate and vary widely in the number of units, which makes them difficult to detect. Recently, several methods for detecting solenoid repeats have been proposed, such as self-alignment of the sequence, spectral analysis and discrete Fourier transform of the sequence. Although these methods perform well on certain data sets, they often fail to detect repeats with weak similarities. In this article, we propose a new approach to recognize solenoid repeats and non-solenoid proteins using the stationary wavelet packet transform (SWPT). Our method offers three advantages: (i) it naturally represents five main factors of protein structure and properties through wavelet analysis; (ii) it extracts novel wavelet features that capture hidden components of solenoid sequence similarities and distinguish them from global proteins; (iii) it obtains statistical features that capture repeating motifs of solenoid proteins. Results: Our method analyzes the characteristics of the amino acid sequence in both the spectral and temporal domains using SWPT. Both global and local information about proteins is captured by the SWPT coefficients. We obtain and integrate wavelet-based and statistics-based features of the amino acid sequence to improve classification. Our proposed method is evaluated against state-of-the-art methods such as HHrepID and REPETITA. The experimental results show that our algorithm consistently outperforms them in area under the ROC curve. At the same false positive rate, the sensitivity of our WAVELET method is higher than that of the other methods.
Availability: http://www.naaan.org/anvo/Software/Software.htm Contact: anphuocnhu.vo@mavs.uta.edu
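The general idea in the abstract (encode the amino acid sequence numerically, decompose it with an undecimated wavelet packet transform, then summarize the coefficients as features) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the Kyte-Doolittle hydropathy scale stands in for the five factors the abstract mentions, the Haar filter and circular boundary handling are arbitrary choices, the per-node mean energy is a hypothetical feature, and the function names and the toy sequence are invented for the example.

```python
# Assumption: Kyte-Doolittle hydropathy is used as a single numeric encoding.
# The abstract refers to five factors, which are not specified in this record.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(seq):
    """Map an amino acid string to a list of hydropathy values."""
    return [KD[aa] for aa in seq.upper()]


def haar_step(x, shift):
    """One undecimated Haar split with circular boundary handling.

    No downsampling is done, so both outputs keep the input length.
    The 1/2 scaling makes the two outputs together preserve the energy of x.
    """
    n = len(x)
    low = [(x[i] + x[(i + shift) % n]) / 2.0 for i in range(n)]
    high = [(x[i] - x[(i + shift) % n]) / 2.0 for i in range(n)]
    return low, high


def swpt(x, levels):
    """Stationary wavelet packet decomposition (Haar, hypothetical sketch).

    Every node is split at every level, giving 2**levels leaf nodes.
    The filter spacing doubles per level (the 'a trous' scheme).
    """
    nodes = [list(x)]
    for level in range(levels):
        shift = 2 ** level
        next_nodes = []
        for node in nodes:
            low, high = haar_step(node, shift)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes


def energy_features(nodes):
    """Mean squared coefficient per leaf node (illustrative feature only)."""
    return [sum(c * c for c in node) / len(node) for node in nodes]


# Toy sequence made up for the example; it is not taken from any dataset.
toy = "LRLLDLSNNQLTSLPLRLLDLSNNQLTSLP"
signal = encode(toy)
features = energy_features(swpt(signal, 3))
print(features)
```

Because the transform is undecimated, each node keeps one coefficient per residue, which is what lets the coefficients be read in both the frequency (node) and position (index) directions. A feature vector like this would then be passed to a classifier of the reader's choice; the record does not say which one the authors used.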
format Text
id pubmed-2935422
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2935422 2010-09-08 Solenoid and non-solenoid protein recognition using stationary wavelet packet transform Vo, An Nguyen, Nha Huang, Heng Bioinformatics Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium Oxford University Press 2010-09-15 2010-09-04 /pmc/articles/PMC2935422/ /pubmed/20823309 http://dx.doi.org/10.1093/bioinformatics/btq371 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Solenoid and non-solenoid protein recognition using stationary wavelet packet transform
topic Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935422/
https://www.ncbi.nlm.nih.gov/pubmed/20823309
http://dx.doi.org/10.1093/bioinformatics/btq371