Solenoid and non-solenoid protein recognition using stationary wavelet packet transform
Motivation: Solenoid proteins are emerging as a protein class with properties intermediate between structured and intrinsically unstructured proteins. Containing repeating structural units, solenoid proteins are expected to share sequence similarities. However, in many cases, the sequence similariti...
Main Authors: | Vo, An; Nguyen, Nha; Huang, Heng |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press 2010 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935422/ https://www.ncbi.nlm.nih.gov/pubmed/20823309 http://dx.doi.org/10.1093/bioinformatics/btq371 |
_version_ | 1782186399334662144 |
---|---|
author | Vo, An Nguyen, Nha Huang, Heng |
author_facet | Vo, An Nguyen, Nha Huang, Heng |
author_sort | Vo, An |
collection | PubMed |
description | Motivation: Solenoid proteins are emerging as a protein class with properties intermediate between structured and intrinsically unstructured proteins. Containing repeating structural units, solenoid proteins are expected to share sequence similarities. However, in many cases the sequence similarities are weak and undetectable. Moreover, solenoids can be degenerate and vary widely in the number of units, which makes them difficult to detect. Recently, several methods for detecting solenoid repeats have been proposed, such as self-alignment of the sequence, spectral analysis and the discrete Fourier transform of the sequence. Although these methods perform well on certain data sets, they often fail to detect repeats with weak similarities. In this article, we propose a new approach to recognize solenoid repeats and non-solenoid proteins using the stationary wavelet packet transform (SWPT). Our method offers three advantages: (i) it naturally represents five main factors of protein structure and properties through wavelet analysis; (ii) it extracts novel wavelet features that capture hidden components of solenoid sequence similarities and distinguish them from global proteins; (iii) it obtains statistical features that capture the repeating motifs of solenoid proteins. Results: Our method analyzes the characteristics of the amino acid sequence in both the spectral and temporal domains using SWPT. Both global and local information about proteins is captured by the SWPT coefficients. We obtain and integrate wavelet-based and statistics-based features of the amino acid sequence to improve classification. Our method is evaluated against state-of-the-art methods such as HHrepID and REPETITA. The experimental results show that our algorithm consistently outperforms them in area under the ROC curve. At the same false positive rate, the sensitivity of our WAVELET method is higher than that of the other methods. 
Availability: http://www.naaan.org/anvo/Software/Software.htm Contact: anphuocnhu.vo@mavs.uta.edu |
format | Text |
id | pubmed-2935422 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2935422 2010-09-08 Solenoid and non-solenoid protein recognition using stationary wavelet packet transform Vo, An Nguyen, Nha Huang, Heng Bioinformatics Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium Oxford University Press 2010-09-15 2010-09-04 /pmc/articles/PMC2935422/ /pubmed/20823309 http://dx.doi.org/10.1093/bioinformatics/btq371 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium Vo, An Nguyen, Nha Huang, Heng Solenoid and non-solenoid protein recognition using stationary wavelet packet transform |
title | Solenoid and non-solenoid protein recognition using stationary wavelet packet transform |
title_full | Solenoid and non-solenoid protein recognition using stationary wavelet packet transform |
title_fullStr | Solenoid and non-solenoid protein recognition using stationary wavelet packet transform |
title_full_unstemmed | Solenoid and non-solenoid protein recognition using stationary wavelet packet transform |
title_short | Solenoid and non-solenoid protein recognition using stationary wavelet packet transform |
title_sort | solenoid and non-solenoid protein recognition using stationary wavelet packet transform |
topic | Eccb 2010 Conference Proceedings September 26 to September 29, 2010, Ghent, Belgium |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935422/ https://www.ncbi.nlm.nih.gov/pubmed/20823309 http://dx.doi.org/10.1093/bioinformatics/btq371 |
work_keys_str_mv | AT voan solenoidandnonsolenoidproteinrecognitionusingstationarywaveletpackettransform AT nguyennha solenoidandnonsolenoidproteinrecognitionusingstationarywaveletpackettransform AT huangheng solenoidandnonsolenoidproteinrecognitionusingstationarywaveletpackettransform |
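
The description field in this record says the method analyzes an amino acid sequence with a stationary wavelet packet transform and derives features from the coefficients. The Python sketch below illustrates that general idea only; it is not the authors' implementation. Several choices are assumptions made for illustration: the Haar filter, circular boundary handling, the Kyte-Doolittle hydropathy scale as the numeric encoding (the record mentions five factors but does not list them), mean subband energy as the feature, and all function names.

```python
import math

# Kyte-Doolittle hydropathy scale. Assumption: used here only as a stand-in
# numeric encoding; the record does not specify the authors' encoding.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Map a one-letter amino acid string to a list of hydropathy values."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence]


def haar_stationary_step(signal, level):
    """One undecimated Haar filtering step with circular boundaries.

    The two filter taps are spaced 2**level samples apart, so the output
    keeps the input length instead of being downsampled.
    """
    n = len(signal)
    gap = 2 ** level
    scale = 1.0 / math.sqrt(2.0)
    approx = [scale * (signal[i] + signal[(i + gap) % n]) for i in range(n)]
    detail = [scale * (signal[i] - signal[(i + gap) % n]) for i in range(n)]
    return approx, detail


def stationary_wavelet_packets(signal, levels):
    """Split every subband at every level, giving 2**levels subbands."""
    bands = [list(signal)]
    for level in range(levels):
        next_bands = []
        for band in bands:
            approx, detail = haar_stationary_step(band, level)
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands


def band_energies(bands):
    """Mean squared coefficient per subband (a hypothetical feature vector)."""
    return [sum(c * c for c in band) / len(band) for band in bands]


# Hypothetical toy input: a four-residue unit repeated six times.
toy_sequence = "AGLV" * 6
toy_bands = stationary_wavelet_packets(encode(toy_sequence), levels=2)
toy_features = band_energies(toy_bands)
```

Because no downsampling takes place, every subband has the same length as the input sequence, so each coefficient stays aligned with a residue position while the subband index carries the spectral information. A classifier such as the one the description evaluates by area under the ROC curve would take a feature vector like `toy_features` as input; that step is not shown here.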