A knowledge base for predicting protein localization sites in eukaryotic cells

To automate examination of massive amounts of sequence data for biological function, it is important to computerize interpretation based on empirical knowledge of sequence-function relationships. For this purpose, we have been constructing a knowledge base by organizing various experimental and comp...

Bibliographic Details
Main authors: Nakai, Kenta; Kanehisa, Minoru
Format: Online Article Text
Language: English
Published: Published by Elsevier Inc. 1992
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7134799/
https://www.ncbi.nlm.nih.gov/pubmed/1478671
http://dx.doi.org/10.1016/S0888-7543(05)80111-9
_version_ 1783517912953257984
author Nakai, Kenta
Kanehisa, Minoru
author_facet Nakai, Kenta
Kanehisa, Minoru
author_sort Nakai, Kenta
collection PubMed
description To automate examination of massive amounts of sequence data for biological function, it is important to computerize interpretation based on empirical knowledge of sequence-function relationships. For this purpose, we have been constructing a knowledge base by organizing various experimental and computational observations as a collection of if-then rules. Here we report an expert system, which utilizes this knowledge base, for predicting localization sites of proteins only from the information on the amino acid sequence and the source origin. We collected data for 401 eukaryotic proteins with known localization sites (subcellular and extracellular) and divided them into training data and testing data. Fourteen localization sites were distinguished for animal cells and 17 for plant cells. When sorting signals were not well characterized experimentally, various sequence features were computationally derived from the training data. It was found that 66% of the training data and 59% of the testing data were correctly predicted by our expert system. This artificial intelligence approach is powerful and flexible enough to be used in genome analyses.
format Online
Article
Text
id pubmed-7134799
institution National Center for Biotechnology Information
language English
publishDate 1992
publisher Published by Elsevier Inc.
record_format MEDLINE/PubMed
spelling pubmed-7134799 2020-04-08 A knowledge base for predicting protein localization sites in eukaryotic cells Nakai, Kenta Kanehisa, Minoru Genomics Article To automate examination of massive amounts of sequence data for biological function, it is important to computerize interpretation based on empirical knowledge of sequence-function relationships. For this purpose, we have been constructing a knowledge base by organizing various experimental and computational observations as a collection of if-then rules. Here we report an expert system, which utilizes this knowledge base, for predicting localization sites of proteins only from the information on the amino acid sequence and the source origin. We collected data for 401 eukaryotic proteins with known localization sites (subcellular and extracellular) and divided them into training data and testing data. Fourteen localization sites were distinguished for animal cells and 17 for plant cells. When sorting signals were not well characterized experimentally, various sequence features were computationally derived from the training data. It was found that 66% of the training data and 59% of the testing data were correctly predicted by our expert system. This artificial intelligence approach is powerful and flexible enough to be used in genome analyses. Published by Elsevier Inc. 1992-12 2005-07-25 /pmc/articles/PMC7134799/ /pubmed/1478671 http://dx.doi.org/10.1016/S0888-7543(05)80111-9 Text en Copyright © 1992 Published by Elsevier Inc. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website.
Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
spellingShingle Article
Nakai, Kenta
Kanehisa, Minoru
A knowledge base for predicting protein localization sites in eukaryotic cells
title A knowledge base for predicting protein localization sites in eukaryotic cells
title_full A knowledge base for predicting protein localization sites in eukaryotic cells
title_fullStr A knowledge base for predicting protein localization sites in eukaryotic cells
title_full_unstemmed A knowledge base for predicting protein localization sites in eukaryotic cells
title_short A knowledge base for predicting protein localization sites in eukaryotic cells
title_sort knowledge base for predicting protein localization sites in eukaryotic cells
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7134799/
https://www.ncbi.nlm.nih.gov/pubmed/1478671
http://dx.doi.org/10.1016/S0888-7543(05)80111-9