
Transferability Based on Drug Structure Similarity in the Automatic Classification of Noncompliant Drug Use on Social Media: Natural Language Processing Approach

BACKGROUND: Medication noncompliance is a critical issue because of the increased number of drugs sold on the web. Web-based drug distribution is difficult to control, causing problems such as drug noncompliance and abuse. The existing medication compliance surveys lack completeness because it is im...


Bibliographic Details
Main Authors: Nishiyama, Tomohiro; Yada, Shuntaro; Wakamiya, Shoko; Hori, Satoko; Aramaki, Eiji
Format: Online Article Text
Language: English
Published: JMIR Publications 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10193216/
https://www.ncbi.nlm.nih.gov/pubmed/37133915
http://dx.doi.org/10.2196/44870
_version_ 1785043793050861568
author Nishiyama, Tomohiro
Yada, Shuntaro
Wakamiya, Shoko
Hori, Satoko
Aramaki, Eiji
author_sort Nishiyama, Tomohiro
collection PubMed
description BACKGROUND: Medication noncompliance is a critical issue because of the increased number of drugs sold on the web. Web-based drug distribution is difficult to control, causing problems such as drug noncompliance and abuse. The existing medication compliance surveys lack completeness because it is impossible to cover patients who do not go to the hospital or provide accurate information to their doctors, so a social media–based approach is being explored to collect information about drug use. Social media data, which includes information on drug usage by users, can be used to detect drug abuse and medication compliance in patients.
OBJECTIVE: This study aimed to assess how the structural similarity of drugs affects the efficiency of machine learning models for text classification of drug noncompliance.
METHODS: This study analyzed 22,022 tweets about 20 different drugs. The tweets were labeled as either noncompliant use or mention, noncompliant sales, general use, or general mention. The study compares 2 methods for training machine learning models for text classification: single-sub-corpus transfer learning, in which a model is trained on tweets about a single drug and then tested on tweets about other drugs, and multi-sub-corpus incremental learning, in which models are trained on tweets about drugs in order of their structural similarity. The performance of a machine learning model trained on a single subcorpus (a data set of tweets about a specific category of drugs) was compared to the performance of a model trained on multiple subcorpora (data sets of tweets about multiple categories of drugs).
RESULTS: The results showed that the performance of the model trained on a single subcorpus varied depending on the specific drug used for training. The Tanimoto similarity (a measure of the structural similarity between compounds) was weakly correlated with the classification results. The model trained by transfer learning on a corpus of drugs with close structural similarity performed better than the model trained by randomly adding a subcorpus when the number of subcorpora was small.
CONCLUSIONS: The results suggest that structural similarity improves the classification performance of messages about unknown drugs if the drugs in the training corpus are few. On the other hand, this indicates that there is little need to consider the influence of the Tanimoto structural similarity if a sufficient variety of drugs is ensured.
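The abstract names the Tanimoto similarity and a training order based on structural similarity, but this record does not say how either was computed. The following is only a minimal sketch of the general idea, assuming each drug is represented by a binary fingerprint stored as a set of "on" bit indices. The drug names, the fingerprints, and the helper `order_by_similarity` are hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits.

    Defined as |A & B| / |A | B|. Returning 0.0 for two empty
    fingerprints is a convention chosen for this sketch.
    """
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


def order_by_similarity(
    target: str, fingerprints: Dict[str, FrozenSet[int]]
) -> List[str]:
    """Return all drugs except `target`, most similar to `target` first.

    Hypothetical helper: one possible way to pick the order in which
    sub-corpora are added during incremental training.
    """
    others = [name for name in fingerprints if name != target]
    return sorted(
        others,
        key=lambda name: tanimoto(fingerprints[target], fingerprints[name]),
        reverse=True,
    )


# Toy fingerprints (made up for illustration, not real compounds).
toy = {
    "drug_a": frozenset({1, 2, 3, 4}),
    "drug_b": frozenset({1, 2, 3, 5}),
    "drug_c": frozenset({1, 7, 8, 9}),
    "drug_d": frozenset({10, 11}),
}

# Sub-corpora for drugs earlier in this list would be added first.
order = order_by_similarity("drug_a", toy)
print(order)
```

In practice, fingerprints would come from a cheminformatics toolkit rather than hand-written sets. The set-based form is used here only to keep the sketch self-contained.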
format Online
Article
Text
id pubmed-10193216
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-10193216 2023-05-19 Transferability Based on Drug Structure Similarity in the Automatic Classification of Noncompliant Drug Use on Social Media: Natural Language Processing Approach Nishiyama, Tomohiro Yada, Shuntaro Wakamiya, Shoko Hori, Satoko Aramaki, Eiji J Med Internet Res Original Paper JMIR Publications 2023-05-03 /pmc/articles/PMC10193216/ /pubmed/37133915 http://dx.doi.org/10.2196/44870 Text en ©Tomohiro Nishiyama, Shuntaro Yada, Shoko Wakamiya, Satoko Hori, Eiji Aramaki. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 03.05.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
title Transferability Based on Drug Structure Similarity in the Automatic Classification of Noncompliant Drug Use on Social Media: Natural Language Processing Approach
topic Original Paper