Functional coding haplotypes and machine-learning feature elimination identifies predictors of Methotrexate Response in Rheumatoid Arthritis patients

BACKGROUND: Major challenges in large scale genetic association studies include not only the identification of causative single nucleotide polymorphisms (SNPs), but also accounting for SNP-SNP interactions. This study thus proposes a novel feature engineering approach integrating potentially functio...

Bibliographic Details
Main Authors: Lim, Ashley J.W., Lim, Lee Jin, Ooi, Brandon N.S., Koh, Ee Tzun, Tan, Justina Wei Lynn, Chong, Samuel S., Khor, Chiea Chuen, Tucker-Kellogg, Lisa, Leong, Khai Pang, Lee, Caroline G.
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8808170/
https://www.ncbi.nlm.nih.gov/pubmed/35022146
http://dx.doi.org/10.1016/j.ebiom.2021.103800
author Lim, Ashley J.W.
Lim, Lee Jin
Ooi, Brandon N.S.
Koh, Ee Tzun
Tan, Justina Wei Lynn
Chong, Samuel S.
Khor, Chiea Chuen
Tucker-Kellogg, Lisa
Leong, Khai Pang
Lee, Caroline G.
collection PubMed
description BACKGROUND: Major challenges in large-scale genetic association studies include not only the identification of causative single nucleotide polymorphisms (SNPs), but also accounting for SNP-SNP interactions. This study therefore proposes a novel feature engineering approach that integrates potentially functional coding haplotypes (pfcHap) with machine-learning (ML) feature selection to identify biologically meaningful, possibly causative genetic factors that take into consideration potential SNP-SNP interactions within the pfcHap, in order to best predict methotrexate (MTX) response in rheumatoid arthritis (RA) patients. METHODS: Exome sequencing data from 349 RA patients were analysed, and the patients were split into a training set and an unseen test set. Inferred pfcHaps were combined with 30 non-genetic features and subjected to ML recursive feature elimination with cross-validation using the training set. The predictive capacity and robustness of the selected features were assessed using six popular machine learning models through cross-validation on the training set and evaluated in the unseen test set. FINDINGS: Significantly, 100 features (95 pfcHaps, 5 non-genetic factors) were identified as having good predictive performance (AUC: 0.776-0.828; Sensitivity: 0.656-0.813; Specificity: 0.684-0.868) across all six ML models in an unseen test dataset for the prediction of MTX response in RA patients. INTERPRETATION: The majority of the predictive pfcHap SNPs were predicted to be potentially functional, and some of the genes in which the pfcHaps reside were identified as being associated with previously reported MTX/RA pathways. FUNDING: Singapore Ministry of Health's National Medical Research Council (NMRC) [NMRC/CBRG/0095/2015; CG12Aug17; CGAug16M012; NMRC/CG/017/2013]; National Cancer Center Research Fund and block funding Duke-NUS Medical School; Singapore Ministry of Education Academic Research Fund Tier 2 grant MOE2019-T2-1-138.
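The abstract reports test-set performance as AUC, sensitivity and specificity. As a rough illustration of how these three metrics are defined, the following minimal Python sketch computes them on a small invented dataset. The labels, scores, the 0.5 decision threshold and the function names are all assumptions made for this example; none of them come from the study, and this is not the authors' evaluation code.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (pairwise definition)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data: 4 positives and 4 negatives with model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

# Hypothetical decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises ranking quality over all thresholds, which is one reason the three are usually reported together.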
format Online
Article
Text
id pubmed-8808170
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8808170 2022-02-04 EBioMedicine Articles Elsevier 2022-01-10 /pmc/articles/PMC8808170/ /pubmed/35022146 http://dx.doi.org/10.1016/j.ebiom.2021.103800 Text en © 2022 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Functional coding haplotypes and machine-learning feature elimination identifies predictors of Methotrexate Response in Rheumatoid Arthritis patients
topic Articles