Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS)

SUMMARY: Technologies identifying single nucleotide polymorphisms (SNPs) in DNA sequencing yield an avalanche of data requiring analysis and interpretation. Standard methods may require many weeks of processing time. The use of statistical methods requiring data sorting, matrix inversions of a high-...

Bibliographic Details
Main Authors: Gebski, Val, Silva, S Sandun M, Byth, Karen, Jenkins, Alicia, Keech, Anthony
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10625458/
https://www.ncbi.nlm.nih.gov/pubmed/37928342
http://dx.doi.org/10.1093/bioadv/vbad148
_version_ 1785131137566244864
author Gebski, Val
Silva, S Sandun M
Byth, Karen
Jenkins, Alicia
Keech, Anthony
author_facet Gebski, Val
Silva, S Sandun M
Byth, Karen
Jenkins, Alicia
Keech, Anthony
author_sort Gebski, Val
collection PubMed
description SUMMARY: Technologies identifying single nucleotide polymorphisms (SNPs) in DNA sequencing yield an avalanche of data requiring analysis and interpretation. Standard methods may require many weeks of processing time. The use of statistical methods requiring data sorting, high-dimensional matrix inversions and replication in subsets of the data on multiple outcomes exacerbates these times. A method is proposed which reduces the computational time in problems with time-to-event outcomes and hundreds of thousands or millions of SNPs by using Cox–Snell residuals after fitting the Cox proportional hazards (PH) model to a fixed set of concomitant variables. This yields coefficients for the SNP effect from a Cox–Snell adjusted Poisson model and shows a high concordance with the adjusted PH model. The method is illustrated with a sample of 10 000 SNPs from a genome-wide association study in a diabetic population. The gain in processing efficiency using the proposed method based on Poisson modelling can be as high as 62%. This could result in a saving of over three weeks of processing time if 5 million SNPs require analysis. The method involves only a single predictor variable (SNP), offering a simpler, computationally more stable approach to examining and identifying SNP patterns associated with the outcome(s), allowing for faster development of genetic signatures. Use of deviance residuals from the PH model to screen SNPs demonstrates a large discordance rate at a 0.2% threshold of concordance. This rate is 15 times larger than that based on the Cox–Snell residuals from the Cox–Snell adjusted Poisson model.
AVAILABILITY AND IMPLEMENTATION: The method is simple to implement as the procedures are available in most statistical packages. The approach involves obtaining Cox–Snell residuals from a PH model fitted to a binary time-to-event outcome, using the factors which need to be common when assessing each SNP. Each SNP is then fitted as a predictor of the outcome of interest using a Poisson model with the Cox–Snell residual as the exposure variable.
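The second step described above (a Poisson model with the Cox–Snell residual as the exposure) can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the function name `fit_poisson_exposure` is hypothetical, the Cox–Snell residuals are assumed to have been computed already from a PH model on the common covariates, and including an intercept is an assumption that mirrors the default of most statistical packages.

```python
import math

def fit_poisson_exposure(events, exposure, x, n_iter=25, tol=1e-10):
    """Fit events_i ~ Poisson(exposure_i * exp(a + b * x_i)) by Newton-Raphson.

    events   : 0/1 event indicators (the binary time-to-event outcome)
    exposure : positive Cox-Snell residuals (assumed precomputed)
    x        : SNP coding for each subject (must vary across subjects)
    Returns (a, b, se_b), where b is the log rate ratio for the SNP.
    """
    # Start from the intercept-only fit with no SNP effect.
    a, b = math.log(sum(events) / sum(exposure)), 0.0
    for _ in range(n_iter):
        # Score vector (s0, s1) and 2x2 information matrix at (a, b).
        s0 = s1 = i00 = i01 = i11 = 0.0
        for d, e, g in zip(events, exposure, x):
            mu = e * math.exp(a + b * g)
            s0 += d - mu
            s1 += (d - mu) * g
            i00 += mu
            i01 += mu * g
            i11 += mu * g * g
        # Newton step: solve the 2x2 system explicitly.
        det = i00 * i11 - i01 * i01
        da = (i11 * s0 - i01 * s1) / det
        db = (i00 * s1 - i01 * s0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    # Standard error of b from the inverse information matrix,
    # evaluated at the last iterate before the final (negligible) step.
    se_b = math.sqrt(i00 / det)
    return a, b, se_b

# Toy data (made up for illustration): three carriers, three non-carriers.
a, b, se_b = fit_poisson_exposure(
    events=[1, 0, 1, 1, 0, 1],
    exposure=[0.5, 1.0, 1.5, 0.5, 0.5, 1.0],
    x=[0, 0, 0, 1, 1, 1],
)
print(b, se_b)
```

Because each fit involves only two parameters and no sorting of event times, the loop over SNPs reduces to repeated calls of a routine like this one with the same `events` and `exposure` vectors.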
format Online
Article
Text
id pubmed-10625458
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10625458 2023-11-05 Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS) Gebski, Val Silva, S Sandun M Byth, Karen Jenkins, Alicia Keech, Anthony Bioinform Adv Original Article SUMMARY: Technologies identifying single nucleotide polymorphisms (SNPs) in DNA sequencing yield an avalanche of data requiring analysis and interpretation. Standard methods may require many weeks of processing time. The use of statistical methods requiring data sorting, high-dimensional matrix inversions and replication in subsets of the data on multiple outcomes exacerbates these times. A method is proposed which reduces the computational time in problems with time-to-event outcomes and hundreds of thousands or millions of SNPs by using Cox–Snell residuals after fitting the Cox proportional hazards (PH) model to a fixed set of concomitant variables. This yields coefficients for the SNP effect from a Cox–Snell adjusted Poisson model and shows a high concordance with the adjusted PH model. The method is illustrated with a sample of 10 000 SNPs from a genome-wide association study in a diabetic population. The gain in processing efficiency using the proposed method based on Poisson modelling can be as high as 62%. This could result in a saving of over three weeks of processing time if 5 million SNPs require analysis. The method involves only a single predictor variable (SNP), offering a simpler, computationally more stable approach to examining and identifying SNP patterns associated with the outcome(s), allowing for faster development of genetic signatures. Use of deviance residuals from the PH model to screen SNPs demonstrates a large discordance rate at a 0.2% threshold of concordance. This rate is 15 times larger than that based on the Cox–Snell residuals from the Cox–Snell adjusted Poisson model.
AVAILABILITY AND IMPLEMENTATION: The method is simple to implement as the procedures are available in most statistical packages. The approach involves obtaining Cox–Snell residuals from a PH model fitted to a binary time-to-event outcome, using the factors which need to be common when assessing each SNP. Each SNP is then fitted as a predictor of the outcome of interest using a Poisson model with the Cox–Snell residual as the exposure variable. Oxford University Press 2023-10-13 /pmc/articles/PMC10625458/ /pubmed/37928342 http://dx.doi.org/10.1093/bioadv/vbad148 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Article
Gebski, Val
Silva, S Sandun M
Byth, Karen
Jenkins, Alicia
Keech, Anthony
Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS)
title Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS)
title_full Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS)
title_fullStr Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS)
title_full_unstemmed Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS)
title_short Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS)
title_sort improving efficiency of fitting cox proportional hazards models for time-to-event outcomes in genome-wide association studies (gwas)
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10625458/
https://www.ncbi.nlm.nih.gov/pubmed/37928342
http://dx.doi.org/10.1093/bioadv/vbad148
work_keys_str_mv AT gebskival improvingefficiencyoffittingcoxproportionalhazardsmodelsfortimetoeventoutcomesingenomewideassociationstudiesgwas
AT silvassandunm improvingefficiencyoffittingcoxproportionalhazardsmodelsfortimetoeventoutcomesingenomewideassociationstudiesgwas
AT bythkaren improvingefficiencyoffittingcoxproportionalhazardsmodelsfortimetoeventoutcomesingenomewideassociationstudiesgwas
AT jenkinsalicia improvingefficiencyoffittingcoxproportionalhazardsmodelsfortimetoeventoutcomesingenomewideassociationstudiesgwas
AT keechanthony improvingefficiencyoffittingcoxproportionalhazardsmodelsfortimetoeventoutcomesingenomewideassociationstudiesgwas