DrugMetab: An Integrated Machine Learning and Lexicon Mapping Named Entity Recognition Method for Drug Metabolite

Drug metabolites (DMs) are critical in pharmacology research areas, such as drug metabolism pathways and drug‐drug interactions. However, there is no terminology dictionary containing comprehensive drug metabolite names, and there is no named entity recognition (NER) algorithm focusing on drug metab...

Bibliographic Details
Main authors: Wu, Heng‐Yi, Lu, Deshun, Hyder, Mustafa, Zhang, Shijun, Quinney, Sara K., Desta, Zeruesenay, Li, Lang
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6263660/
https://www.ncbi.nlm.nih.gov/pubmed/30033622
http://dx.doi.org/10.1002/psp4.12340
_version_ 1783375335690076160
author Wu, Heng‐Yi
Lu, Deshun
Hyder, Mustafa
Zhang, Shijun
Quinney, Sara K.
Desta, Zeruesenay
Li, Lang
collection PubMed
description Drug metabolites (DMs) are critical in pharmacology research areas, such as drug metabolism pathways and drug‐drug interactions. However, there is no terminology dictionary containing comprehensive drug metabolite names, and there is no named entity recognition (NER) algorithm focusing on drug metabolite identification. In this article, we developed a novel NER system, DrugMetab, to identify DMs from the PubMed abstracts. DrugMetab utilizes the features characterized from the Part‐of‐Speech, drug index, and pre/suffix, and determines DMs within context. To evaluate the performance, a gold‐standard corpus was manually constructed. In this task, DrugMetab with sequential minimal optimization (SMO) classifier achieves 0.89 precision, 0.77 recall, and 0.83 F‐measure in the internal testing set; and 0.86 precision, 0.85 recall, and 0.86 F‐measure in the external validation set. We further compared the performance between DrugMetab and whatizitChemical, which was designed for identifying small molecules or chemical entities. DrugMetab outperformed whatizitChemical, which had a lower recall rate of 0.65.
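The F‐measure values quoted in the description can be checked against the reported precision and recall. The sketch below assumes the F‐measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` is illustrative rather than taken from DrugMetab.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # no true positives at all
    return (1 + beta ** 2) * precision * recall / denominator


# Precision and recall as reported in the abstract (already rounded to 2 digits).
reported = {
    "internal testing set": (0.89, 0.77),
    "external validation set": (0.86, 0.85),
}

for name, (p, r) in reported.items():
    print(f"{name}: F1 = {f_measure(p, r):.3f}")
```

Under this assumption the internal testing set gives roughly 0.826, matching the reported 0.83. The external validation set gives roughly 0.855, which agrees with the reported 0.86 only to within the rounding of the published precision and recall.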
format Online
Article
Text
id pubmed-6263660
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-6263660 2018-12-05 CPT Pharmacometrics Syst Pharmacol Research John Wiley and Sons Inc. 2018-09-29 2018-11 /pmc/articles/PMC6263660/ /pubmed/30033622 http://dx.doi.org/10.1002/psp4.12340 Text en © 2018 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title DrugMetab: An Integrated Machine Learning and Lexicon Mapping Named Entity Recognition Method for Drug Metabolite
topic Research