An Ensemble Multilabel Classification for Disease Risk Prediction

It is important to identify and prevent disease risk as early as possible through regular physical examinations. We formulate disease risk prediction as a multilabel classification problem. A novel Ensemble Label Power-set Pruned datasets Joint Decomposition (ELPPJD) method is proposed in this...

Bibliographic Details
Main authors: Li, Runzhi; Liu, Wei; Lin, Yusong; Zhao, Hongling; Zhang, Chaoyang
Format: Online Article Text
Language: English
Published: Hindawi 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5494772/
https://www.ncbi.nlm.nih.gov/pubmed/29065647
http://dx.doi.org/10.1155/2017/8051673
_version_ 1783247718531989504
author Li, Runzhi
Liu, Wei
Lin, Yusong
Zhao, Hongling
Zhang, Chaoyang
collection PubMed
description It is important to identify and prevent disease risk as early as possible through regular physical examinations. We formulate disease risk prediction as a multilabel classification problem. A novel Ensemble Label Power-set Pruned datasets Joint Decomposition (ELPPJD) method is proposed in this work. First, we transform the multilabel classification into a multiclass classification. Then, we propose the pruned datasets and joint decomposition methods to deal with the imbalanced learning problem. Two strategies, size balanced (SB) and label similarity (LS), are designed to decompose the training dataset. In the experiments, the dataset comes from real physical examination records. We compare the performance of the ELPPJD method under the two decomposition strategies. Moreover, ELPPJD is compared with the classic multilabel classification methods RAkEL and HOMER. The experimental results show that the ELPPJD method with the label similarity strategy has outstanding performance.
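The first step named in the description, turning a multilabel problem into a multiclass one, can be illustrated with a minimal sketch of the generic label power-set transformation. This is not the authors' ELPPJD implementation: the function names, the `min_count` pruning threshold, and the toy disease labels are all hypothetical, and the frequency-based pruning shown here only stands in for whatever pruning rule the paper actually uses.

```python
from collections import Counter


def label_powerset(label_rows, min_count=1):
    """Map each distinct label set to one class id (label power-set).

    label_rows: list of iterables of label names, one per record.
    min_count:  hypothetical pruning threshold; label sets seen fewer
                times than this get no class id (None in the output).
    """
    keys = [frozenset(row) for row in label_rows]
    counts = Counter(keys)
    class_ids = {}
    for key in keys:  # assign ids in order of first appearance
        if counts[key] >= min_count and key not in class_ids:
            class_ids[key] = len(class_ids)
    y = [class_ids.get(key) for key in keys]
    return y, class_ids


def decode(class_id, class_ids):
    """Turn a predicted class id back into its label set."""
    for key, cid in class_ids.items():
        if cid == class_id:
            return set(key)
    raise KeyError(class_id)


# Toy records (invented for illustration, not from the paper's dataset).
records = [
    {"hypertension"},
    {"hypertension", "diabetes"},
    {"hypertension"},
    {"fatty_liver"},
    {"diabetes", "hypertension"},
]
y, class_ids = label_powerset(records, min_count=2)
print(y)
print(decode(1, class_ids))
```

Each surviving label combination becomes a single class, so any ordinary multiclass classifier can be trained on `y`; rare combinations are pruned because they would otherwise produce classes with too few examples to learn from.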
format Online
Article
Text
id pubmed-5494772
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-54947722017-07-13 An Ensemble Multilabel Classification for Disease Risk Prediction Li, Runzhi Liu, Wei Lin, Yusong Zhao, Hongling Zhang, Chaoyang J Healthc Eng Research Article It is important to identify and prevent disease risk as early as possible through regular physical examinations. We formulate disease risk prediction as a multilabel classification problem. A novel Ensemble Label Power-set Pruned datasets Joint Decomposition (ELPPJD) method is proposed in this work. First, we transform the multilabel classification into a multiclass classification. Then, we propose the pruned datasets and joint decomposition methods to deal with the imbalanced learning problem. Two strategies, size balanced (SB) and label similarity (LS), are designed to decompose the training dataset. In the experiments, the dataset comes from real physical examination records. We compare the performance of the ELPPJD method under the two decomposition strategies. Moreover, ELPPJD is compared with the classic multilabel classification methods RAkEL and HOMER. The experimental results show that the ELPPJD method with the label similarity strategy has outstanding performance. Hindawi 2017 2017-06-15 /pmc/articles/PMC5494772/ /pubmed/29065647 http://dx.doi.org/10.1155/2017/8051673 Text en Copyright © 2017 Runzhi Li et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An Ensemble Multilabel Classification for Disease Risk Prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5494772/
https://www.ncbi.nlm.nih.gov/pubmed/29065647
http://dx.doi.org/10.1155/2017/8051673