
METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text

The substrates of a transporter are useful not only for inferring the transporter's function but also for discovering compound-compound interactions and reconstructing metabolic pathways. Although plenty of data has accumulated with the development of new technologies such as in vitro transp...


Bibliographic Details
Main authors: Zhao, Min, Chen, Yanming, Qu, Dacheng, Qu, Hong
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4606149/
https://www.ncbi.nlm.nih.gov/pubmed/26495291
http://dx.doi.org/10.1155/2015/254838
author Zhao, Min
Chen, Yanming
Qu, Dacheng
Qu, Hong
collection PubMed
description The substrates of a transporter are useful not only for inferring the transporter's function but also for discovering compound-compound interactions and reconstructing metabolic pathways. Although plenty of data has accumulated with the development of new technologies such as in vitro transporter assays, the search for transporter substrates is far from complete. In this article, we introduce METSP, a maximum-entropy classifier designed to retrieve transporter-substrate pairs (TSPs) from semistructured text. Trained on high-quality annotation from UniProt, METSP achieves high precision and recall in cross-validation experiments. When applied to 182,829 human transporter annotation sentences in UniProt, METSP identifies 3942 sentences containing both transporter and compound information. From these, 1547 high-confidence human TSPs are identified for further manual curation, of which 58.37% involve novel substrates not annotated in public transporter databases. METSP is the first efficient tool for extracting TSPs from semistructured annotation text in UniProt. The tool can help determine the precise substrates and drugs of transporters, thus facilitating drug-target prediction, metabolic network reconstruction, and literature classification.
format Online
Article
Text
id pubmed-4606149
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4606149 2015-10-22 Biomed Res Int Research Article Hindawi Publishing Corporation 2015 2015-10-01 /pmc/articles/PMC4606149/ /pubmed/26495291 http://dx.doi.org/10.1155/2015/254838 Text en Copyright © 2015 Min Zhao et al. https://creativecommons.org/licenses/by/3.0/ This is an open access paper distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4606149/
https://www.ncbi.nlm.nih.gov/pubmed/26495291
http://dx.doi.org/10.1155/2015/254838