
Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews


Bibliographic Details
Main Authors: Popoff, E., Besada, M., Jansen, J. P., Cope, S., Kanters, S.
Format: Online Article Text
Language: English
Publicado: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7734810/
https://www.ncbi.nlm.nih.gov/pubmed/33308292
http://dx.doi.org/10.1186/s13643-020-01520-5
_version_ 1783622537100394496
author Popoff, E.
Besada, M.
Jansen, J. P.
Cope, S.
Kanters, S.
author_facet Popoff, E.
Besada, M.
Jansen, J. P.
Cope, S.
Kanters, S.
author_sort Popoff, E.
collection PubMed
description BACKGROUND: Despite existing research on text mining and machine learning (ML) for title and abstract screening, the role of ML within systematic literature reviews (SLRs) for health technology assessment (HTA) remains unclear, given the lack of extensive testing and of guidance from HTA agencies. We sought to address two knowledge gaps: to extend ML algorithms to provide a reason for exclusion, aligning with current practices, and to determine optimal parameter settings for feature-set generation and ML algorithms. METHODS: We used abstract and full-text selection data from five large SLRs (n = 3089 to 12,769 abstracts) across a variety of disease areas. Each SLR was split into training and test sets. We developed a multi-step algorithm to assign each citation to one of the following categories: included; excluded for each PICOS criterion; or unclassified. We used a bag-of-words approach for feature-set generation and compared support vector machines (SVMs), naïve Bayes (NB), and bagged classification and regression trees (CART) for classification. We also compared alternative training-set strategies: using the full data versus downsampling (i.e., reducing excludes to balance includes and excludes, because ML algorithms perform better with balanced data), and using inclusion/exclusion decisions from abstract versus full-text screening. Performance was compared in terms of specificity, sensitivity, accuracy, and matching the reason for exclusion. RESULTS: The best-fitting model (optimizing sensitivity and specificity) used the SVM algorithm with training data based on full-text decisions, downsampling, and exclusion of words occurring fewer than five times. Its sensitivity and specificity ranged from 94 to 100% and from 54 to 89%, respectively, across the five SLRs. On average, 75% of excluded citations were excluded with a reason, and 83% of these matched the reviewers’ original reason for exclusion. Sensitivity improved significantly when both downsampling and abstract decisions were used. CONCLUSIONS: ML algorithms can improve the efficiency of the SLR process, and the proposed algorithms could reduce the workload of a second reviewer by identifying exclusions with a relevant PICOS reason, thus aligning with HTA guidance. Downsampling can be used to improve study selection, and the improvements seen with full-text exclusions have implications for a learn-as-you-go approach. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13643-020-01520-5.
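The screening pipeline summarized in the abstract (bag-of-words features with words occurring fewer than five times excluded, downsampling of excluded citations to balance the classes, and an SVM classifier) can be sketched as follows. This is a minimal illustration using scikit-learn with synthetic placeholder abstracts, not the authors' implementation or datasets:

```python
# Hedged sketch of the abstract's pipeline: bag-of-words features
# (min_df=5 mirrors the cutoff of excluding words occurring fewer than
# five times), downsampling of excludes to balance classes, and a
# linear SVM. Texts and labels below are synthetic placeholders.
import random

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

random.seed(0)

# Synthetic citations: imbalanced, as in real screening (far more excludes).
includes = ["randomized trial of drug efficacy in patients"] * 20
excludes = ["case report of a single patient"] * 80
texts = includes + excludes
labels = ["include"] * len(includes) + ["exclude"] * len(excludes)

# Downsample excludes so the training set is balanced (20 vs 20).
exclude_idx = [i for i, y in enumerate(labels) if y == "exclude"]
keep = set(random.sample(exclude_idx, len(includes)))
train_idx = [i for i, y in enumerate(labels) if labels[i] == "include" or i in keep]

# Bag-of-words feature set, dropping rare words (document frequency < 5).
vectorizer = CountVectorizer(min_df=5)
X_train = vectorizer.fit_transform([texts[i] for i in train_idx])
y_train = [labels[i] for i in train_idx]

# Linear SVM classifier, as in the paper's best-fitting model.
clf = LinearSVC()
clf.fit(X_train, y_train)

pred = clf.predict(vectorizer.transform(["randomized controlled trial of drug"]))
print(pred[0])
```

The paper's full algorithm additionally assigns a PICOS-based reason for each exclusion (a multi-step, multi-class setup); this sketch shows only the binary include/exclude step with the downsampling and feature-cutoff choices the abstract names.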
format Online
Article
Text
id pubmed-7734810
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7734810 2020-12-15 Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews Popoff, E. Besada, M. Jansen, J. P. Cope, S. Kanters, S. Syst Rev Methodology BACKGROUND: Despite existing research on text mining and machine learning (ML) for title and abstract screening, the role of ML within systematic literature reviews (SLRs) for health technology assessment (HTA) remains unclear, given the lack of extensive testing and of guidance from HTA agencies. We sought to address two knowledge gaps: to extend ML algorithms to provide a reason for exclusion, aligning with current practices, and to determine optimal parameter settings for feature-set generation and ML algorithms. METHODS: We used abstract and full-text selection data from five large SLRs (n = 3089 to 12,769 abstracts) across a variety of disease areas. Each SLR was split into training and test sets. We developed a multi-step algorithm to assign each citation to one of the following categories: included; excluded for each PICOS criterion; or unclassified. We used a bag-of-words approach for feature-set generation and compared support vector machines (SVMs), naïve Bayes (NB), and bagged classification and regression trees (CART) for classification. We also compared alternative training-set strategies: using the full data versus downsampling (i.e., reducing excludes to balance includes and excludes, because ML algorithms perform better with balanced data), and using inclusion/exclusion decisions from abstract versus full-text screening. Performance was compared in terms of specificity, sensitivity, accuracy, and matching the reason for exclusion. RESULTS: The best-fitting model (optimizing sensitivity and specificity) used the SVM algorithm with training data based on full-text decisions, downsampling, and exclusion of words occurring fewer than five times.
Its sensitivity and specificity ranged from 94 to 100% and from 54 to 89%, respectively, across the five SLRs. On average, 75% of excluded citations were excluded with a reason, and 83% of these matched the reviewers’ original reason for exclusion. Sensitivity improved significantly when both downsampling and abstract decisions were used. CONCLUSIONS: ML algorithms can improve the efficiency of the SLR process, and the proposed algorithms could reduce the workload of a second reviewer by identifying exclusions with a relevant PICOS reason, thus aligning with HTA guidance. Downsampling can be used to improve study selection, and the improvements seen with full-text exclusions have implications for a learn-as-you-go approach. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13643-020-01520-5. BioMed Central 2020-12-13 /pmc/articles/PMC7734810/ /pubmed/33308292 http://dx.doi.org/10.1186/s13643-020-01520-5 Text en © The Author(s) 2020 Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology
Popoff, E.
Besada, M.
Jansen, J. P.
Cope, S.
Kanters, S.
Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews
title Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews
title_full Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews
title_fullStr Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews
title_full_unstemmed Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews
title_short Aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews
title_sort aligning text mining and machine learning algorithms with best practices for study selection in systematic literature reviews
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7734810/
https://www.ncbi.nlm.nih.gov/pubmed/33308292
http://dx.doi.org/10.1186/s13643-020-01520-5
work_keys_str_mv AT popoffe aligningtextminingandmachinelearningalgorithmswithbestpracticesforstudyselectioninsystematicliteraturereviews
AT besadam aligningtextminingandmachinelearningalgorithmswithbestpracticesforstudyselectioninsystematicliteraturereviews
AT jansenjp aligningtextminingandmachinelearningalgorithmswithbestpracticesforstudyselectioninsystematicliteraturereviews
AT copes aligningtextminingandmachinelearningalgorithmswithbestpracticesforstudyselectioninsystematicliteraturereviews
AT kanterss aligningtextminingandmachinelearningalgorithmswithbestpracticesforstudyselectioninsystematicliteraturereviews