An overview of topic modeling and its current applications in bioinformatics

BACKGROUND: With the rapid accumulation of biological datasets, machine learning methods designed to automate data analysis are urgently needed. In recent years, so-called topic models that originated from the field of natural language processing have been receiving much attention in bioinformatics...

Bibliographic Details
Main Authors: Liu, Lin; Tang, Lin; Dong, Wen; Yao, Shaowen; Zhou, Wei
Format: Online Article Text
Language: English
Published: Springer International Publishing 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5028368/
https://www.ncbi.nlm.nih.gov/pubmed/27652181
http://dx.doi.org/10.1186/s40064-016-3252-8
author Liu, Lin
Tang, Lin
Dong, Wen
Yao, Shaowen
Zhou, Wei
collection PubMed
description BACKGROUND: With the rapid accumulation of biological datasets, machine learning methods designed to automate data analysis are urgently needed. In recent years, so-called topic models that originated from the field of natural language processing have been receiving much attention in bioinformatics because of their interpretability. Our aim was to review the application and development of topic models for bioinformatics.
DESCRIPTION: This paper starts with the description of a topic model, with a focus on the understanding of topic modeling. A general outline is provided on how to build an application in a topic model and how to develop a topic model. Meanwhile, the literature on application of topic models to biological data was searched and analyzed in depth. According to the types of models and the analogy between the concept of document-topic-word and a biological object (as well as the tasks of a topic model), we categorized the related studies and provided an outlook on the use of topic models for the development of bioinformatics applications.
CONCLUSION: Topic modeling is a useful method (in contrast to the traditional means of data reduction in bioinformatics) and enhances researchers’ ability to interpret biological information. Nevertheless, due to the lack of topic models optimized for specific biological data, the studies on topic modeling in biological data still have a long and challenging road ahead. We believe that topic models are a promising method for various applications in bioinformatics research.
format Online
Article
Text
id pubmed-5028368
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-5028368 2016-09-20 An overview of topic modeling and its current applications in bioinformatics Liu, Lin; Tang, Lin; Dong, Wen; Yao, Shaowen; Zhou, Wei Springerplus Review
Springer International Publishing 2016-09-20 /pmc/articles/PMC5028368/ /pubmed/27652181 http://dx.doi.org/10.1186/s40064-016-3252-8 Text en © The Author(s) 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title An overview of topic modeling and its current applications in bioinformatics
topic Review