
DLI-IT: a deep learning approach to drug label identification through image and text embedding

BACKGROUND: Drug labels, or packaging inserts, play a significant role in all operations from production through drug distribution channels to the end consumer. An image of the label, also called the display panel, could be used to identify illegal, illicit, unapproved and potentially dangerous dru...


Bibliographic Details
Main Authors:	Liu, Xiangwen; Meehan, Joe; Tong, Weida; Wu, Leihong; Xu, Xiaowei; Xu, Joshua
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2020
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7158001/ https://www.ncbi.nlm.nih.gov/pubmed/32293428 http://dx.doi.org/10.1186/s12911-020-1078-3

author	Liu, Xiangwen Meehan, Joe Tong, Weida Wu, Leihong Xu, Xiaowei Xu, Joshua
author_sort	Liu, Xiangwen
collection	PubMed
description	BACKGROUND: Drug labels, or packaging inserts, play a significant role in all operations from production through drug distribution channels to the end consumer. An image of the label, also called the display panel, could be used to identify illegal, illicit, unapproved and potentially dangerous drugs. Because investigation is time-consuming and labor-intensive, an artificial intelligence-based deep learning model is needed for fast and accurate identification of drugs. METHODS: In addition to image-based identification technology, we take advantage of the rich text information on the pharmaceutical package insert in drug label images. In this study, we developed the Drug Label Identification through Image and Text embedding model (DLI-IT) to model text-based patterns in historical data for the detection of suspicious drugs. In DLI-IT, we first trained a Connectionist Text Proposal Network (CTPN) to crop the raw image into sub-images based on the text. The text in each cropped sub-image is recognized independently by the Tesseract OCR Engine, and the results are combined into one document per raw image. Finally, we applied universal sentence embedding to transform these documents into vectors and used cosine similarity to find the reference images most similar to the test image. RESULTS: We trained the DLI-IT model on 1749 opioid and 2365 non-opioid drug label images. The model was then tested on 300 external opioid drug label images. The results demonstrated that our model achieves up to 88% precision in drug label identification, outperforming previous image-based or text-based identification methods by up to 35%. CONCLUSION: By combining image and text embedding analysis under a deep learning framework, our DLI-IT approach achieved competitive performance in advancing drug label identification.
format	Online Article Text
id	pubmed-7158001
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-7158001 2020-04-20 BMC Med Inform Decis Mak Research Article BioMed Central 2020-04-15 /pmc/articles/PMC7158001/ /pubmed/32293428 http://dx.doi.org/10.1186/s12911-020-1078-3 Text en © The Author(s). 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	DLI-IT: a deep learning approach to drug label identification through image and text embedding
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7158001/ https://www.ncbi.nlm.nih.gov/pubmed/32293428 http://dx.doi.org/10.1186/s12911-020-1078-3
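
The final retrieval step described in the abstract (documents turned into vectors, then ranked by cosine similarity against reference labels) can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function below is a simple token-count stand-in for universal sentence embedding, and the reference names and label texts are hypothetical examples rather than data from the study. The CTPN cropping and Tesseract OCR steps are assumed to have already produced one text document per image.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding (assumption): lowercase token counts.

    The article uses universal sentence embedding; a bag-of-words
    vector is substituted here only to keep the sketch self-contained.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document matches nothing
    return dot / (norm_a * norm_b)


def most_similar(query_text, references, top_k=3):
    """Rank reference documents by similarity to the query document."""
    query_vec = embed(query_text)
    scored = [
        (name, cosine_similarity(query_vec, embed(text)))
        for name, text in references.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical OCR output, one combined document per reference image.
references = {
    "ref_a": "oxycodone hydrochloride tablets 10 mg",
    "ref_b": "ibuprofen tablets 200 mg",
    "ref_c": "morphine sulfate injection",
}

ranked = most_similar("oxycodone hydrochloride 10 mg tablets", references, top_k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Swapping `embed` for a real sentence encoder would leave the ranking logic unchanged, since cosine similarity depends only on the vectors it is given.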
