METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text
The substrates of a transporter are useful not only for inferring the transporter's function but also for discovering compound-compound interactions and reconstructing metabolic pathways. Although a large amount of data has accumulated with the development of new technologies such as in vitro transp...
Main authors: | Zhao, Min; Chen, Yanming; Qu, Dacheng; Qu, Hong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi Publishing Corporation, 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4606149/ https://www.ncbi.nlm.nih.gov/pubmed/26495291 http://dx.doi.org/10.1155/2015/254838 |
_version_ | 1782395323644116992 |
---|---|
author | Zhao, Min Chen, Yanming Qu, Dacheng Qu, Hong |
author_facet | Zhao, Min Chen, Yanming Qu, Dacheng Qu, Hong |
author_sort | Zhao, Min |
collection | PubMed |
description | The substrates of a transporter are useful not only for inferring the transporter's function but also for discovering compound-compound interactions and reconstructing metabolic pathways. Although a large amount of data has accumulated with the development of new technologies such as in vitro transporter assays, the search for transporter substrates is far from complete. In this article, we introduce METSP, a maximum-entropy classifier designed to retrieve transporter-substrate pairs (TSPs) from semistructured text. Based on high-quality annotation from UniProt, METSP achieves high precision and recall in cross-validation experiments. When METSP is applied to 182,829 human transporter annotation sentences in UniProt, it identifies 3942 sentences containing transporter and compound information. Finally, 1547 high-confidence human TSPs are identified for further manual curation, of which 58.37% involve novel substrates not annotated in public transporter databases. METSP is the first efficient tool to extract TSPs from semistructured annotation text in UniProt. This tool can help determine the precise substrates and drugs of transporters, thus facilitating drug-target prediction, metabolic network reconstruction, and literature classification. |
format | Online Article Text |
id | pubmed-4606149 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Hindawi Publishing Corporation |
record_format | MEDLINE/PubMed |
spelling | pubmed-4606149 2015-10-22 METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text Zhao, Min Chen, Yanming Qu, Dacheng Qu, Hong Biomed Res Int Research Article The substrates of a transporter are useful not only for inferring the transporter's function but also for discovering compound-compound interactions and reconstructing metabolic pathways. Although a large amount of data has accumulated with the development of new technologies such as in vitro transporter assays, the search for transporter substrates is far from complete. In this article, we introduce METSP, a maximum-entropy classifier designed to retrieve transporter-substrate pairs (TSPs) from semistructured text. Based on high-quality annotation from UniProt, METSP achieves high precision and recall in cross-validation experiments. When METSP is applied to 182,829 human transporter annotation sentences in UniProt, it identifies 3942 sentences containing transporter and compound information. Finally, 1547 high-confidence human TSPs are identified for further manual curation, of which 58.37% involve novel substrates not annotated in public transporter databases. METSP is the first efficient tool to extract TSPs from semistructured annotation text in UniProt. This tool can help determine the precise substrates and drugs of transporters, thus facilitating drug-target prediction, metabolic network reconstruction, and literature classification. Hindawi Publishing Corporation 2015 2015-10-01 /pmc/articles/PMC4606149/ /pubmed/26495291 http://dx.doi.org/10.1155/2015/254838 Text en Copyright © 2015 Min Zhao et al. https://creativecommons.org/licenses/by/3.0/ This is an open access paper distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | METSP: A Maximum-Entropy Classifier Based Text Mining Tool for Transporter-Substrate Identification with Semistructured Text |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4606149/ https://www.ncbi.nlm.nih.gov/pubmed/26495291 http://dx.doi.org/10.1155/2015/254838 |
work_keys_str_mv | AT zhaomin metspamaximumentropyclassifierbasedtextminingtoolfortransportersubstrateidentificationwithsemistructuredtext AT chenyanming metspamaximumentropyclassifierbasedtextminingtoolfortransportersubstrateidentificationwithsemistructuredtext AT qudacheng metspamaximumentropyclassifierbasedtextminingtoolfortransportersubstrateidentificationwithsemistructuredtext AT quhong metspamaximumentropyclassifierbasedtextminingtoolfortransportersubstrateidentificationwithsemistructuredtext |
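
The description above names a maximum-entropy classifier applied to annotation sentences. As a rough illustration of that general technique only, the sketch below trains a binary maximum-entropy model (equivalent to logistic regression) on bag-of-words features using plain Python. This record does not describe METSP's actual features, training data, or implementation, so everything here is an assumption: the example sentences are invented, and the names `features`, `train`, `predict_proba`, and `TOY_EXAMPLES` are hypothetical.

```python
import math
from collections import defaultdict

# Invented toy sentences (NOT taken from UniProt or from METSP's training set).
# Label 1: sentence mentions a transporter acting on a compound; label 0: it does not.
TOY_EXAMPLES = [
    ("Mediates the uptake of glucose across the plasma membrane.", 1),
    ("Transports glutamate and aspartate into synaptic vesicles.", 1),
    ("Mediates sodium-dependent transport of citrate.", 1),
    ("Catalyzes the efflux of bile acids from hepatocytes.", 1),
    ("Belongs to the major facilitator superfamily.", 0),
    ("Expressed in liver and kidney.", 0),
    ("Interacts with the scaffold protein.", 0),
    ("Phosphorylated upon DNA damage.", 0),
]


def features(sentence):
    """Binary bag-of-words features plus a bias term (an assumed feature set)."""
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    feats = {"__bias__": 1.0}
    for tok in tokens:
        feats["w=" + tok] = 1.0
    return feats


def predict_proba(weights, feats):
    """P(label = 1 | sentence) under a binary maximum-entropy (logistic) model."""
    score = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    score = max(-30.0, min(30.0, score))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-score))


def train(examples, epochs=200, lr=0.5, l2=0.01):
    """Fit weights by stochastic gradient ascent on the L2-penalized log-likelihood."""
    weights = defaultdict(float)
    data = [(features(sentence), label) for sentence, label in examples]
    for _ in range(epochs):
        for feats, label in data:
            p = predict_proba(weights, feats)
            for name, value in feats.items():
                # Gradient: (observed - expected) feature value, minus L2 shrinkage.
                weights[name] += lr * ((label - p) * value - l2 * weights[name])
    return dict(weights)


model = train(TOY_EXAMPLES)
for text in ["Mediates the transport of lactate.", "Expressed in brain."]:
    print(text, round(predict_proba(model, features(text)), 3))
```

A real system would need far more training sentences, richer features, and a step that links each positive sentence to specific transporter and compound names; none of that is shown here.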