Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning

Disease classification based on machine learning has become a crucial research topic in the fields of genetics and molecular biology. Generally, disease classification involves a supervised learning style; i.e., it requires a large number of labelled samples to achieve good classification performanc...


Bibliographic Details
Main Authors:	Yin, Chunwu; Chen, Zhanbo
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7551840/ https://www.ncbi.nlm.nih.gov/pubmed/32846941 http://dx.doi.org/10.3390/healthcare8030291

_version_	1783593267869253632
author	Yin, Chunwu Chen, Zhanbo
author_facet	Yin, Chunwu Chen, Zhanbo
author_sort	Yin, Chunwu
collection	PubMed
description	Disease classification based on machine learning has become a crucial research topic in genetics and molecular biology. Disease classification is generally treated as a supervised learning problem; i.e., it requires a large number of labelled samples to achieve good classification performance. In most cases, however, labelled samples are hard to obtain, so the amount of training data is limited. At the same time, many unclassified (unlabelled) sequences have been deposited in public databases, and these may help the training procedure. Training on labelled and unlabelled samples together is called semi-supervised learning and is useful in many applications. Self-training can be implemented using samples in order from high to low confidence, which prevents noisy samples from affecting the robustness of semi-supervised learning during training. The deep forest method with the hyperparameter settings used in this paper can achieve excellent performance. Therefore, in this work, we propose a novel approach that combines a deep learning model with semi-supervised self-training to improve disease classification performance; it uses unlabelled samples in an update mechanism designed to increase the number of high-confidence pseudo-labelled samples. The experimental results show that the proposed model can achieve good performance in disease classification and disease-causing gene identification.
format	Online Article Text
id	pubmed-7551840
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7551840 2020-10-14 Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning Yin, Chunwu Chen, Zhanbo Healthcare (Basel) Article Disease classification based on machine learning has become a crucial research topic in genetics and molecular biology. Disease classification is generally treated as a supervised learning problem; i.e., it requires a large number of labelled samples to achieve good classification performance. In most cases, however, labelled samples are hard to obtain, so the amount of training data is limited. At the same time, many unclassified (unlabelled) sequences have been deposited in public databases, and these may help the training procedure. Training on labelled and unlabelled samples together is called semi-supervised learning and is useful in many applications. Self-training can be implemented using samples in order from high to low confidence, which prevents noisy samples from affecting the robustness of semi-supervised learning during training. The deep forest method with the hyperparameter settings used in this paper can achieve excellent performance. Therefore, in this work, we propose a novel approach that combines a deep learning model with semi-supervised self-training to improve disease classification performance; it uses unlabelled samples in an update mechanism designed to increase the number of high-confidence pseudo-labelled samples. The experimental results show that the proposed model can achieve good performance in disease classification and disease-causing gene identification. MDPI 2020-08-24 /pmc/articles/PMC7551840/ /pubmed/32846941 http://dx.doi.org/10.3390/healthcare8030291 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Yin, Chunwu Chen, Zhanbo Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning
title	Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning
title_full	Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning
title_fullStr	Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning
title_full_unstemmed	Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning
title_short	Developing Sustainable Classification of Diseases via Deep Learning and Semi-Supervised Learning
title_sort	developing sustainable classification of diseases via deep learning and semi-supervised learning
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7551840/ https://www.ncbi.nlm.nih.gov/pubmed/32846941 http://dx.doi.org/10.3390/healthcare8030291
work_keys_str_mv	AT yinchunwu developingsustainableclassificationofdiseasesviadeeplearningandsemisupervisedlearning AT chenzhanbo developingsustainableclassificationofdiseasesviadeeplearningandsemisupervisedlearning
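
The self-training idea summarized in the description (train on the labelled samples, pseudo-label the unlabelled ones, and keep only high-confidence pseudo-labels) can be sketched as follows. This is a minimal illustration and not the authors' implementation: a simple nearest-centroid classifier stands in for the deep forest model, and the function names, the 0.9 confidence threshold, the round limit, and the synthetic two-cluster data are all assumptions made for the example.

```python
import math
import random


def fit_centroids(X, y):
    """Return {label: centroid} computed from the labelled points."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_proba(centroids, x):
    """Softmax over negative Euclidean distances to each centroid."""
    scores = {c: -math.dist(x, m) for c, m in centroids.items()}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exp.values())
    return {c: e / total for c, e in exp.items()}


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=10):
    """Grow the labelled set using high-confidence pseudo-labels only."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(X_lab, y_lab)
        remaining, added = [], 0
        for x in pool:
            proba = predict_proba(centroids, x)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                X_lab.append(x)       # accept the pseudo-label
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)   # low confidence: defer to a later round
        pool = remaining
        if added == 0 or not pool:
            break
    return fit_centroids(X_lab, y_lab), len(X_lab), len(pool)


# Synthetic toy data (hypothetical): two well-separated 2-D clusters.
rng = random.Random(0)


def sample(cx, cy, n):
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for _ in range(n)]


X_lab = sample(0, 0, 3) + sample(5, 5, 3)
y_lab = [0] * 3 + [1] * 3
X_unlab = sample(0, 0, 40) + sample(5, 5, 40)

model, n_labelled, n_left = self_train(X_lab, y_lab, X_unlab)
print(f"labelled after self-training: {n_labelled}, still unlabelled: {n_left}")
```

Samples that fall below the threshold stay in the pool and are reconsidered once the model has been refit on the enlarged labelled set, which is one simple way to process samples from high to low confidence.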
