
Nearest Consensus Clustering Classification to Identify Subclasses and Predict Disease

Disease subtyping, which helps to develop personalized treatments, remains a challenge in data analysis because of the many different ways to group patients based upon their data. However, if we can identify subclasses of disease, then it will help to develop better models that are more specific to...


Bibliographic Details
Main Authors: Alyousef, Awad A., Nihtyanova, Svetlana, Denton, Chris, Bosoni, Pietro, Bellazzi, Riccardo, Tucker, Allan
Format: Online Article Text
Language: English
Published: Springer International Publishing 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6245235/
https://www.ncbi.nlm.nih.gov/pubmed/30533598
http://dx.doi.org/10.1007/s41666-018-0029-6
collection PubMed
description Disease subtyping, which helps to develop personalized treatments, remains a challenge in data analysis because there are many different ways to group patients based on their data. However, identifying subclasses of disease helps to develop models that are more specific to individuals and should therefore improve prediction and understanding of the underlying characteristics of the disease in question. This paper proposes a new algorithm that integrates consensus clustering methods with classification in order to overcome issues with sample bias. The new algorithm combines K-means with consensus clustering in order to build cohort-specific decision trees that improve classification and aid the understanding of the underlying differences between the discovered groups. The methods are tested on a real-world, freely available breast cancer dataset and on data from a London hospital on systemic sclerosis, a rare, potentially fatal condition. Results show that “nearest consensus clustering classification” significantly improves accuracy and prediction compared with similar competing methods. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1007/s41666-018-0029-6) contains supplementary material, which is available to authorized users.
id pubmed-6245235
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6245235 2018-12-06 Nearest Consensus Clustering Classification to Identify Subclasses and Predict Disease J Healthc Inform Res Research Article Springer International Publishing 2018-07-30 /pmc/articles/PMC6245235/ /pubmed/30533598 http://dx.doi.org/10.1007/s41666-018-0029-6 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
topic Research Article