Text mining for identifying topics in the literatures about adolescent substance use and depression

BACKGROUND: Both adolescent substance use and adolescent depression are major public health problems, and have the tendency to co-occur. Thousands of articles on adolescent substance use or depression have been published. It is labor intensive and time consuming to extract huge amounts of informatio...

Bibliographic Details
Main Authors: Wang, Shi-Heng; Ding, Yijun; Zhao, Weizhong; Huang, Yung-Hsiang; Perkins, Roger; Zou, Wen; Chen, James J.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4799597/
https://www.ncbi.nlm.nih.gov/pubmed/26993983
http://dx.doi.org/10.1186/s12889-016-2932-1
_version_ 1782422380167036928
author Wang, Shi-Heng
Ding, Yijun
Zhao, Weizhong
Huang, Yung-Hsiang
Perkins, Roger
Zou, Wen
Chen, James J.
author_facet Wang, Shi-Heng
Ding, Yijun
Zhao, Weizhong
Huang, Yung-Hsiang
Perkins, Roger
Zou, Wen
Chen, James J.
author_sort Wang, Shi-Heng
collection PubMed
description BACKGROUND: Both adolescent substance use and adolescent depression are major public health problems, and have the tendency to co-occur. Thousands of articles on adolescent substance use or depression have been published. It is labor intensive and time consuming to extract huge amounts of information from the cumulated collections. Topic modeling offers a computational tool to find relevant topics by capturing meaningful structure among collections of documents. METHODS: In this study, a total of 17,723 abstracts from PubMed published from 2000 to 2014 on adolescent substance use and depression were downloaded as objects, and Latent Dirichlet allocation (LDA) was applied to perform text mining on the dataset. Word clouds were used to visually display the content of topics and demonstrate the distribution of vocabularies over each topic. RESULTS: The LDA topics recaptured the search keywords in PubMed, and further discovered relevant issues, such as intervention program, association links between adolescent substance use and adolescent depression, such as sexual experience and violence, and risk factors of adolescent substance use, such as family factors and peer networks. Using trend analysis to explore the dynamics of proportion of topics, we found that brain research was assessed as a hot issue by the coefficient of the trend test. CONCLUSIONS: Topic modeling has the ability to segregate a large collection of articles into distinct themes, and it could be used as a tool to understand the literature, not only by recapturing known facts but also by discovering other relevant topics. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12889-016-2932-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4799597
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4799597 2016-03-20 Text mining for identifying topics in the literatures about adolescent substance use and depression Wang, Shi-Heng Ding, Yijun Zhao, Weizhong Huang, Yung-Hsiang Perkins, Roger Zou, Wen Chen, James J. BMC Public Health Research Article BACKGROUND: Both adolescent substance use and adolescent depression are major public health problems, and have the tendency to co-occur. Thousands of articles on adolescent substance use or depression have been published. It is labor intensive and time consuming to extract huge amounts of information from the cumulated collections. Topic modeling offers a computational tool to find relevant topics by capturing meaningful structure among collections of documents. METHODS: In this study, a total of 17,723 abstracts from PubMed published from 2000 to 2014 on adolescent substance use and depression were downloaded as objects, and Latent Dirichlet allocation (LDA) was applied to perform text mining on the dataset. Word clouds were used to visually display the content of topics and demonstrate the distribution of vocabularies over each topic. RESULTS: The LDA topics recaptured the search keywords in PubMed, and further discovered relevant issues, such as intervention program, association links between adolescent substance use and adolescent depression, such as sexual experience and violence, and risk factors of adolescent substance use, such as family factors and peer networks. Using trend analysis to explore the dynamics of proportion of topics, we found that brain research was assessed as a hot issue by the coefficient of the trend test. CONCLUSIONS: Topic modeling has the ability to segregate a large collection of articles into distinct themes, and it could be used as a tool to understand the literature, not only by recapturing known facts but also by discovering other relevant topics.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12889-016-2932-1) contains supplementary material, which is available to authorized users. BioMed Central 2016-03-19 /pmc/articles/PMC4799597/ /pubmed/26993983 http://dx.doi.org/10.1186/s12889-016-2932-1 Text en © Wang et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Wang, Shi-Heng
Ding, Yijun
Zhao, Weizhong
Huang, Yung-Hsiang
Perkins, Roger
Zou, Wen
Chen, James J.
Text mining for identifying topics in the literatures about adolescent substance use and depression
title Text mining for identifying topics in the literatures about adolescent substance use and depression
title_full Text mining for identifying topics in the literatures about adolescent substance use and depression
title_fullStr Text mining for identifying topics in the literatures about adolescent substance use and depression
title_full_unstemmed Text mining for identifying topics in the literatures about adolescent substance use and depression
title_short Text mining for identifying topics in the literatures about adolescent substance use and depression
title_sort text mining for identifying topics in the literatures about adolescent substance use and depression
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4799597/
https://www.ncbi.nlm.nih.gov/pubmed/26993983
http://dx.doi.org/10.1186/s12889-016-2932-1