
Some remarks on protein attribute prediction and pseudo amino acid composition

With the accomplishment of human genome sequencing, the number of sequence-known proteins has increased explosively. In contrast, the pace is much slower in determining their biological attributes. As a consequence, the gap between sequence-known proteins and attribute-known proteins has become incr...


Bibliographic Details
Main Author: Chou, Kuo-Chen
Format: Online Article Text
Language: English
Published: Elsevier Ltd. 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7125570/
https://www.ncbi.nlm.nih.gov/pubmed/21168420
http://dx.doi.org/10.1016/j.jtbi.2010.12.024
_version_ 1783515973335121920
author Chou, Kuo-Chen
author_facet Chou, Kuo-Chen
author_sort Chou, Kuo-Chen
collection PubMed
description With the accomplishment of human genome sequencing, the number of sequence-known proteins has increased explosively. In contrast, the pace is much slower in determining their biological attributes. As a consequence, the gap between sequence-known proteins and attribute-known proteins has become increasingly large. This imbalance, which has critically limited our ability to use newly discovered proteins in a timely manner for basic research and drug development, calls for computational methods or high-throughput automated tools that quickly and reliably identify various attributes of uncharacterized proteins from their sequence information alone. Over the last two decades or so, many such methods have been established in the hope of bridging this gap. Developing these methods often required considering the following: (1) benchmark dataset construction, (2) protein sample formulation, (3) operating algorithm (or engine), (4) anticipated accuracy, and (5) web-server establishment. This review discusses each of the five procedures, with a special focus on the introduction of pseudo amino acid composition (PseAAC), its different modes and applications, and its recent development, particularly how to use the general formulation of PseAAC to reflect the core and essential features that are deeply hidden in complicated protein sequences.
format Online
Article
Text
id pubmed-7125570
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-71255702020-04-06 Some remarks on protein attribute prediction and pseudo amino acid composition Chou, Kuo-Chen J Theor Biol Article With the accomplishment of human genome sequencing, the number of sequence-known proteins has increased explosively. In contrast, the pace is much slower in determining their biological attributes. As a consequence, the gap between sequence-known proteins and attribute-known proteins has become increasingly large. The unbalanced situation, which has critically limited our ability to timely utilize the newly discovered proteins for basic research and drug development, has called for developing computational methods or high-throughput automated tools for fast and reliably identifying various attributes of uncharacterized proteins based on their sequence information alone. Actually, during the last two decades or so, many methods in this regard have been established in hope to bridge such a gap. In the course of developing these methods, the following things were often needed to consider: (1) benchmark dataset construction, (2) protein sample formulation, (3) operating algorithm (or engine), (4) anticipated accuracy, and (5) web-server establishment. In this review, we are to discuss each of the five procedures, with a special focus on the introduction of pseudo amino acid composition (PseAAC), its different modes and applications as well as its recent development, particularly in how to use the general formulation of PseAAC to reflect the core and essential features that are deeply hidden in complicated protein sequences. Elsevier Ltd. 2011-03-21 2010-12-17 /pmc/articles/PMC7125570/ /pubmed/21168420 http://dx.doi.org/10.1016/j.jtbi.2010.12.024 Text en Copyright © 2010 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. 
The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
spellingShingle Article
Chou, Kuo-Chen
Some remarks on protein attribute prediction and pseudo amino acid composition
title Some remarks on protein attribute prediction and pseudo amino acid composition
title_full Some remarks on protein attribute prediction and pseudo amino acid composition
title_fullStr Some remarks on protein attribute prediction and pseudo amino acid composition
title_full_unstemmed Some remarks on protein attribute prediction and pseudo amino acid composition
title_short Some remarks on protein attribute prediction and pseudo amino acid composition
title_sort some remarks on protein attribute prediction and pseudo amino acid composition
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7125570/
https://www.ncbi.nlm.nih.gov/pubmed/21168420
http://dx.doi.org/10.1016/j.jtbi.2010.12.024
work_keys_str_mv AT choukuochen someremarksonproteinattributepredictionandpseudoaminoacidcomposition