Classification of Lapses in Smokers Attempting to Stop: A Supervised Machine Learning Approach Using Data From a Popular Smoking Cessation Smartphone App

INTRODUCTION: Smoking lapses after the quit date often lead to full relapse. To inform the development of real-time, tailored lapse prevention support, we used observational data from a popular smoking cessation app to develop supervised machine learning algorithms to distinguish lapse from non-laps...

Bibliographic Details
Main Authors: Perski, Olga; Li, Kezhi; Pontikos, Nikolas; Simons, David; Goldstein, Stephanie P; Naughton, Felix; Brown, Jamie
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Original Investigations
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10256890/
https://www.ncbi.nlm.nih.gov/pubmed/36971111
http://dx.doi.org/10.1093/ntr/ntad051
author Perski, Olga
Li, Kezhi
Pontikos, Nikolas
Simons, David
Goldstein, Stephanie P
Naughton, Felix
Brown, Jamie
author_sort Perski, Olga
collection PubMed
description INTRODUCTION: Smoking lapses after the quit date often lead to full relapse. To inform the development of real-time, tailored lapse prevention support, we used observational data from a popular smoking cessation app to develop supervised machine learning algorithms to distinguish lapse from non-lapse reports.
AIMS AND METHODS: We used data from app users with ≥20 unprompted data entries, which included information about craving severity, mood, activity, social context, and lapse incidence. A series of group-level supervised machine learning algorithms (eg, Random Forest, XGBoost) were trained and tested. Their ability to classify lapses for out-of-sample (1) observations and (2) individuals was evaluated. Next, a series of individual-level and hybrid algorithms were trained and tested.
RESULTS: Participants (N = 791) provided 37 002 data entries (7.6% lapses). The best-performing group-level algorithm had an area under the receiver operating characteristic curve (AUC) of 0.969 (95% confidence interval [CI] = 0.961 to 0.978). Its ability to classify lapses for out-of-sample individuals ranged from poor to excellent (AUC = 0.482–1.000). Individual-level algorithms could be constructed for 39/791 participants with sufficient data, with a median AUC of 0.938 (range: 0.518–1.000). Hybrid algorithms could be constructed for 184/791 participants and had a median AUC of 0.825 (range: 0.375–1.000).
CONCLUSIONS: Using unprompted app data appeared feasible for constructing a high-performing group-level lapse classification algorithm, but its performance was variable when applied to unseen individuals. Algorithms trained on each individual’s dataset, in addition to hybrid algorithms trained on the group plus a proportion of each individual’s data, had improved performance but could only be constructed for a minority of participants.
IMPLICATIONS: This study used routinely collected data from a popular smartphone app to train and test a series of supervised machine learning algorithms to distinguish lapse from non-lapse events. Although a high-performing group-level algorithm was developed, it had variable performance when applied to new, unseen individuals. Individual-level and hybrid algorithms had somewhat better performance but could not be constructed for all participants because of the lack of variability in the outcome measure. Triangulation of results with those from a prompted study design is recommended prior to intervention development, with real-world lapse prediction likely requiring a balance between unprompted and prompted app data.
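The evaluation ideas in the abstract (AUC per individual, holding out whole individuals, and the need for outcome variability) can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc`, `split_by_individual`, and `per_individual_auc` are hypothetical, the toy data are invented, and the AUC is computed with the standard rank-based (Mann-Whitney) identity rather than whatever library the study used.

```python
import random
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve via the rank-based (Mann-Whitney) identity.

    Returns None when only one class is present, because the AUC is
    undefined without both lapse (1) and non-lapse (0) entries.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    # Count (lapse, non-lapse) pairs where the lapse entry scores higher;
    # ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def split_by_individual(user_ids, test_fraction=0.2, seed=0):
    """Hold out whole individuals, so test users are unseen during training."""
    users = sorted(set(user_ids))
    random.Random(seed).shuffle(users)
    n_test = max(1, round(test_fraction * len(users)))
    test_users = set(users[:n_test])
    train_idx = [i for i, u in enumerate(user_ids) if u not in test_users]
    test_idx = [i for i, u in enumerate(user_ids) if u in test_users]
    return train_idx, test_idx


def per_individual_auc(user_ids, labels, scores):
    """AUC computed separately for each individual (None if undefined)."""
    by_user = defaultdict(lambda: ([], []))
    for u, y, s in zip(user_ids, labels, scores):
        by_user[u][0].append(y)
        by_user[u][1].append(s)
    return {u: auc(ys, ss) for u, (ys, ss) in by_user.items()}


# Toy data, invented for illustration only (not from the study).
# labels: 1 = lapse entry, 0 = non-lapse entry; scores: model outputs.
user_ids = ["a", "a", "a", "a", "b", "b", "b", "c", "c", "c"]
labels = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.1, 0.9, 0.2, 0.6]

pooled = auc(labels, scores)  # one AUC over all entries
results = per_individual_auc(user_ids, labels, scores)
# User "b" reported no lapses, so no per-individual AUC can be computed.
print(pooled, results)
```

In this sketch a pooled AUC can look strong while per-individual values vary or are undefined, which mirrors the abstract's point that individual-level algorithms could not be constructed for participants whose outcome showed no variability.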
format Online
Article
Text
id pubmed-10256890
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10256890 2023-06-11 Classification of Lapses in Smokers Attempting to Stop: A Supervised Machine Learning Approach Using Data From a Popular Smoking Cessation Smartphone App Perski, Olga; Li, Kezhi; Pontikos, Nikolas; Simons, David; Goldstein, Stephanie P; Naughton, Felix; Brown, Jamie Nicotine Tob Res Original Investigations Oxford University Press 2023-03-27 /pmc/articles/PMC10256890/ /pubmed/36971111 http://dx.doi.org/10.1093/ntr/ntad051 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Classification of Lapses in Smokers Attempting to Stop: A Supervised Machine Learning Approach Using Data From a Popular Smoking Cessation Smartphone App
topic Original Investigations
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10256890/
https://www.ncbi.nlm.nih.gov/pubmed/36971111
http://dx.doi.org/10.1093/ntr/ntad051