
An Empirical Study of Different Approaches for Protein Classification

Many domains would benefit from reliable and efficient systems for automatic protein classification. An area of particular interest in recent studies on automatic protein classification is the exploration of new methods for extracting features from a protein that work well for specific problems. The...


Bibliographic Details
Main authors: Nanni, Loris; Lumini, Alessandra; Brahnam, Sheryl
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4084589/
https://www.ncbi.nlm.nih.gov/pubmed/25028675
http://dx.doi.org/10.1155/2014/236717
_version_ 1782324555065327616
author Nanni, Loris
Lumini, Alessandra
Brahnam, Sheryl
author_sort Nanni, Loris
collection PubMed
description Many domains would benefit from reliable and efficient systems for automatic protein classification. An area of particular interest in recent studies on automatic protein classification is the exploration of new methods for extracting features from a protein that work well for specific problems. These methods, however, are not generalizable and have proven useful in only a few domains. Our goal is to evaluate several feature extraction approaches for representing proteins by testing them across multiple datasets. Different types of protein representations are evaluated: those starting from the position-specific scoring matrix (PSSM) of the proteins, those derived from the amino-acid sequence, two matrix representations, and features taken from the 3D tertiary structure of the protein. We also test new variants of protein descriptors. We develop our system experimentally by comparing and combining different descriptors taken from the protein representations. Each descriptor is used to train a separate support vector machine (SVM), and the results are combined by the sum rule. Some stand-alone descriptors work well on some datasets but not on others. Through fusion, the different descriptors deliver performance that holds up across all tested datasets, in some cases exceeding the state of the art.
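The fusion step mentioned in the description (one classifier per descriptor, outputs combined by the sum rule) can be illustrated with a minimal sketch. This is not the authors' code: the function name `sum_rule`, the descriptor names, and the score values are all hypothetical, and the sketch assumes each SVM has already produced per-class scores that are normalized to a comparable scale.

```python
from typing import Dict, List, Sequence


def sum_rule(scores_per_descriptor: Dict[str, Sequence[Sequence[float]]]) -> List[int]:
    """Fuse classifiers by summing their per-class scores for every sample.

    scores_per_descriptor maps a descriptor name to a list of score rows,
    one row per sample and one score per class. All classifiers are assumed
    to score the same samples in the same order.
    """
    score_sets = list(scores_per_descriptor.values())
    if not score_sets:
        raise ValueError("need at least one classifier")
    n_samples = len(score_sets[0])
    predictions = []
    for i in range(n_samples):
        # Collect the score row each classifier assigned to sample i.
        rows = [scores[i] for scores in score_sets]
        # Sum rule: add the scores class by class across classifiers.
        fused = [sum(column) for column in zip(*rows)]
        # Predict the class with the largest summed score (first on ties).
        predictions.append(max(range(len(fused)), key=fused.__getitem__))
    return predictions


# Hypothetical scores from two descriptor-specific classifiers
# (3 samples, 2 classes); these values are illustrative, not from the paper.
scores = {
    "pssm_descriptor": [[0.9, 0.1], [0.4, 0.6], [0.55, 0.45]],
    "sequence_descriptor": [[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]],
}
fused_labels = sum_rule(scores)
print(fused_labels)  # [0, 0, 1]
```

In the second sample the two classifiers disagree, and the summed scores (1.1 against 0.9) settle the decision in favor of class 0, which is the behavior the sum rule is meant to provide.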
format Online
Article
Text
id pubmed-4084589
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4084589 2014-07-15 An Empirical Study of Different Approaches for Protein Classification Nanni, Loris Lumini, Alessandra Brahnam, Sheryl ScientificWorldJournal Research Article Hindawi Publishing Corporation 2014 2014-06-15 /pmc/articles/PMC4084589/ /pubmed/25028675 http://dx.doi.org/10.1155/2014/236717 Text en Copyright © 2014 Loris Nanni et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An Empirical Study of Different Approaches for Protein Classification
title_sort empirical study of different approaches for protein classification
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4084589/
https://www.ncbi.nlm.nih.gov/pubmed/25028675
http://dx.doi.org/10.1155/2014/236717