Random forests algorithm boosts genetic risk prediction of systemic lupus erythematosus

Patients with systemic lupus erythematosus (SLE) present varied clinical manifestations, posing a diagnostic challenge for physicians. Genetic factors substantially contribute to SLE development. A polygenic risk scoring (PRS) model has been used to estimate the genetic risk of SLE in individuals. H...

Bibliographic Details
Main Authors: Ma, Wen; Lau, Yu-Lung; Yang, Wanling; Wang, Yong-Fei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9421562/
https://www.ncbi.nlm.nih.gov/pubmed/36046232
http://dx.doi.org/10.3389/fgene.2022.902793
_version_ 1784777621549088768
author Ma, Wen
Lau, Yu-Lung
Yang, Wanling
Wang, Yong-Fei
collection PubMed
description Patients with systemic lupus erythematosus (SLE) present varied clinical manifestations, posing a diagnostic challenge for physicians. Genetic factors substantially contribute to SLE development. A polygenic risk scoring (PRS) model has been used to estimate the genetic risk of SLE in individuals. However, this approach assumes independent and additive contribution of genetic variants to disease development. We aimed to improve the accuracy of SLE prediction using machine-learning algorithms. We applied random forest (RF), support vector machine (SVM), and artificial neural network (ANN) to classify SLE cases and controls using the data from our previous genome-wide association studies (GWAS) conducted in either Chinese or European populations, including a total of 19,208 participants. The overall performances of these predictors were assessed by the value of area under the receiver-operator curve (AUC). The analyses in the Chinese GWAS showed that the RF model significantly outperformed other predictors, achieving a mean AUC value of 0.84, a 13% improvement upon the PRS model (AUC = 0.74). At the optimal cut-off, the RF predictor reached a sensitivity of 84% with a specificity of 68% in SLE classification. To validate these results, similar analyses were repeated in the European GWAS, and the RF model consistently outperformed other algorithms. Our study suggests that the RF model could be an additional and powerful predictor for SLE early diagnosis.
format Online
Article
Text
id pubmed-9421562
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9421562 2022-08-30 Front Genet Genetics Frontiers Media S.A. 2022-08-15 /pmc/articles/PMC9421562/ /pubmed/36046232 http://dx.doi.org/10.3389/fgene.2022.902793 Text en Copyright © 2022 Ma, Lau, Yang and Wang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Random forests algorithm boosts genetic risk prediction of systemic lupus erythematosus
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9421562/
https://www.ncbi.nlm.nih.gov/pubmed/36046232
http://dx.doi.org/10.3389/fgene.2022.902793
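The abstract describes the PRS model as assuming independent, additive contributions of variants, and reports AUC together with sensitivity and specificity at an optimal cut-off. The following is a minimal sketch of those quantities, not the authors' pipeline. The function names, the synthetic genotypes and weights, and the use of Youden's J to pick the cut-off are all assumptions made for illustration; the abstract does not say how the optimal cut-off was chosen.

```python
import math
import random


def prs(genotype, weights):
    """Additive polygenic risk score: weighted sum of risk-allele counts (0, 1 or 2)."""
    return sum(w * g for w, g in zip(weights, genotype))


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J (an assumed criterion)."""
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens, spec = tp / n_case, tn / n_ctrl
        if best is None or sens + spec > best[1] + best[2]:
            best = (t, sens, spec)
    return best


def relative_improvement(new_auc, old_auc):
    """Relative gain of one AUC over another, as a fraction of the old value."""
    return (new_auc - old_auc) / old_auc


def demo(seed=0, n_people=200, n_snps=20):
    """Score synthetic (entirely made-up) genotypes and summarize classification quality."""
    rng = random.Random(seed)
    weights = [rng.uniform(0.0, 0.3) for _ in range(n_snps)]  # hypothetical effect sizes
    genotypes = [[rng.choice([0, 1, 2]) for _ in range(n_snps)] for _ in range(n_people)]
    scores = [prs(g, weights) for g in genotypes]
    mean = sum(scores) / len(scores)
    # Synthetic case/control labels drawn from a logistic function of the centred score.
    labels = [1 if rng.random() < 1 / (1 + math.exp(-(s - mean))) else 0 for s in scores]
    return auc(scores, labels), best_cutoff(scores, labels)


if __name__ == "__main__":
    print(demo())
    # AUC values taken from the abstract: RF 0.84 versus PRS 0.74.
    print(relative_improvement(0.84, 0.74))
```

Applied to the two AUC values in the abstract, `relative_improvement(0.84, 0.74)` gives roughly 0.135, in line with the reported improvement of about 13%. A random forest would replace the fixed weighted sum in `prs` with an ensemble of decision trees, which is one way to capture non-additive effects among variants.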