DrugMetab: An Integrated Machine Learning and Lexicon Mapping Named Entity Recognition Method for Drug Metabolite
Drug metabolites (DMs) are critical in pharmacology research areas, such as drug metabolism pathways and drug‐drug interactions. However, there is no terminology dictionary containing comprehensive drug metabolite names, and there is no named entity recognition (NER) algorithm focusing on drug metab...
Main authors: | Wu, Heng‐Yi; Lu, Deshun; Hyder, Mustafa; Zhang, Shijun; Quinney, Sara K.; Desta, Zeruesenay; Li, Lang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc., 2018 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6263660/ https://www.ncbi.nlm.nih.gov/pubmed/30033622 http://dx.doi.org/10.1002/psp4.12340 |
_version_ | 1783375335690076160 |
---|---|
author | Wu, Heng‐Yi; Lu, Deshun; Hyder, Mustafa; Zhang, Shijun; Quinney, Sara K.; Desta, Zeruesenay; Li, Lang |
author_facet | Wu, Heng‐Yi; Lu, Deshun; Hyder, Mustafa; Zhang, Shijun; Quinney, Sara K.; Desta, Zeruesenay; Li, Lang |
author_sort | Wu, Heng‐Yi |
collection | PubMed |
description | Drug metabolites (DMs) are critical in pharmacology research areas, such as drug metabolism pathways and drug‐drug interactions. However, there is no terminology dictionary containing comprehensive drug metabolite names, and there is no named entity recognition (NER) algorithm focusing on drug metabolite identification. In this article, we developed a novel NER system, DrugMetab, to identify DMs from the PubMed abstracts. DrugMetab utilizes the features characterized from the Part‐of‐Speech, drug index, and pre/suffix, and determines DMs within context. To evaluate the performance, a gold‐standard corpus was manually constructed. In this task, DrugMetab with sequential minimal optimization (SMO) classifier achieves 0.89 precision, 0.77 recall, and 0.83 F‐measure in the internal testing set; and 0.86 precision, 0.85 recall, and 0.86 F‐measure in the external validation set. We further compared the performance between DrugMetab and whatizitChemical, which was designed for identifying small molecules or chemical entities. DrugMetab outperformed whatizitChemical, which had a lower recall rate of 0.65. |
format | Online Article Text |
id | pubmed-6263660 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6263660 2018-12-05 DrugMetab: An Integrated Machine Learning and Lexicon Mapping Named Entity Recognition Method for Drug Metabolite Wu, Heng‐Yi; Lu, Deshun; Hyder, Mustafa; Zhang, Shijun; Quinney, Sara K.; Desta, Zeruesenay; Li, Lang CPT Pharmacometrics Syst Pharmacol Research Drug metabolites (DMs) are critical in pharmacology research areas, such as drug metabolism pathways and drug‐drug interactions. However, there is no terminology dictionary containing comprehensive drug metabolite names, and there is no named entity recognition (NER) algorithm focusing on drug metabolite identification. In this article, we developed a novel NER system, DrugMetab, to identify DMs from the PubMed abstracts. DrugMetab utilizes the features characterized from the Part‐of‐Speech, drug index, and pre/suffix, and determines DMs within context. To evaluate the performance, a gold‐standard corpus was manually constructed. In this task, DrugMetab with sequential minimal optimization (SMO) classifier achieves 0.89 precision, 0.77 recall, and 0.83 F‐measure in the internal testing set; and 0.86 precision, 0.85 recall, and 0.86 F‐measure in the external validation set. We further compared the performance between DrugMetab and whatizitChemical, which was designed for identifying small molecules or chemical entities. DrugMetab outperformed whatizitChemical, which had a lower recall rate of 0.65. John Wiley and Sons Inc. 2018-09-29 2018-11 /pmc/articles/PMC6263660/ /pubmed/30033622 http://dx.doi.org/10.1002/psp4.12340 Text en © 2018 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes. |
spellingShingle | Research Wu, Heng‐Yi; Lu, Deshun; Hyder, Mustafa; Zhang, Shijun; Quinney, Sara K.; Desta, Zeruesenay; Li, Lang DrugMetab: An Integrated Machine Learning and Lexicon Mapping Named Entity Recognition Method for Drug Metabolite |
title | DrugMetab: An Integrated Machine Learning and Lexicon Mapping Named Entity Recognition Method for Drug Metabolite |
title_sort | drugmetab: an integrated machine learning and lexicon mapping named entity recognition method for drug metabolite |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6263660/ https://www.ncbi.nlm.nih.gov/pubmed/30033622 http://dx.doi.org/10.1002/psp4.12340 |