Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS)
SUMMARY: Technologies identifying single nucleotide polymorphisms (SNPs) in DNA sequencing yield an avalanche of data requiring analysis and interpretation. Standard methods may require many weeks of processing time. The use of statistical methods requiring data sorting, matrix inversions of a high-...
Main authors: | Gebski, Val; Silva, S Sandun M; Byth, Karen; Jenkins, Alicia; Keech, Anthony |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10625458/ https://www.ncbi.nlm.nih.gov/pubmed/37928342 http://dx.doi.org/10.1093/bioadv/vbad148 |
_version_ | 1785131137566244864 |
---|---|
author | Gebski, Val Silva, S Sandun M Byth, Karen Jenkins, Alicia Keech, Anthony |
author_sort | Gebski, Val |
collection | PubMed |
description | SUMMARY: Technologies identifying single nucleotide polymorphisms (SNPs) in DNA sequencing yield an avalanche of data requiring analysis and interpretation. Standard methods may require many weeks of processing time. The use of statistical methods requiring data sorting, high-dimensional matrix inversions and replication in subsets of the data on multiple outcomes exacerbates these times. A method is proposed which reduces the computational time in problems with time-to-event outcomes and hundreds of thousands or millions of SNPs by using Cox–Snell residuals obtained after fitting the Cox proportional hazards (PH) model to a fixed set of concomitant variables. This yields coefficients for the SNP effect from a Cox–Snell adjusted Poisson model, which show high concordance with those from the adjusted PH model. The method is illustrated with a sample of 10 000 SNPs from a genome-wide association study in a diabetic population. The gain in processing efficiency using the proposed method based on Poisson modelling can be as high as 62%. This could save over three weeks of processing time if 5 million SNPs require analysis. The method involves only a single predictor variable (SNP), offering a simpler, computationally more stable approach to examining and identifying SNP patterns associated with the outcome(s) and allowing faster development of genetic signatures. Use of deviance residuals from the PH model to screen SNPs demonstrates a large discordance rate at a 0.2% threshold of concordance. This rate is 15 times larger than that based on the Cox–Snell residuals from the Cox–Snell adjusted Poisson model. AVAILABILITY AND IMPLEMENTATION: The method is simple to implement as the procedures are available in most statistical packages. The approach involves obtaining Cox–Snell residuals from a PH model, fitted to a binary time-to-event outcome, for the factors which need to be common when assessing each SNP. Each SNP is then fitted as a predictor of the outcome of interest using a Poisson model with the Cox–Snell residual as the exposure variable. |
format | Online Article Text |
id | pubmed-10625458 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10625458 2023-11-05 Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS) Gebski, Val Silva, S Sandun M Byth, Karen Jenkins, Alicia Keech, Anthony Bioinform Adv Original Article Oxford University Press 2023-10-13 /pmc/articles/PMC10625458/ /pubmed/37928342 http://dx.doi.org/10.1093/bioadv/vbad148 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Article Gebski, Val Silva, S Sandun M Byth, Karen Jenkins, Alicia Keech, Anthony Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS) |
title | Improving efficiency of fitting Cox proportional hazards models for time-to-event outcomes in genome-wide association studies (GWAS) |
topic | Original Article |
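
The second stage of the approach outlined in the description field (one Poisson model per SNP, with the Cox–Snell residual as the exposure) might be sketched in Python roughly as follows. This is an illustrative sketch, not the authors' implementation. It assumes the Cox–Snell residuals have already been obtained from a PH model fitted to the common covariates. The function name `fit_snp_poisson`, the Newton-Raphson fitting routine, the SNP labels and all data values are hypothetical and chosen here only for illustration.

```python
import math


def fit_snp_poisson(events, cox_snell, genotypes, max_iter=50, tol=1e-10):
    """Fit the Poisson model mu_i = r_i * exp(a + b * g_i) by Newton-Raphson.

    events    : 0/1 event indicators d_i (the Poisson outcome)
    cox_snell : Cox-Snell residuals r_i from a previously fitted PH model
                (assumed given; they play the role of the exposure)
    genotypes : SNP coding g_i, e.g. 0/1/2 copies of the minor allele
    Returns (a, b, se_b): intercept, SNP coefficient, and its standard error.
    """
    total_events = sum(events)
    total_exposure = sum(cox_snell)
    if total_events == 0 or total_exposure <= 0:
        raise ValueError("need at least one event and positive exposure")

    # Start from the intercept-only estimate: exp(a) = events / exposure.
    a = math.log(total_events / total_exposure)
    b = 0.0
    for _ in range(max_iter):
        u_a = u_b = i_aa = i_ab = i_bb = 0.0
        for d, r, g in zip(events, cox_snell, genotypes):
            mu = r * math.exp(a + b * g)
            u_a += d - mu          # score for the intercept
            u_b += g * (d - mu)    # score for the SNP coefficient
            i_aa += mu             # entries of the 2x2 information matrix
            i_ab += g * mu
            i_bb += g * g * mu
        det = i_aa * i_bb - i_ab * i_ab
        if det <= 0:
            raise ValueError("SNP has no variation among exposed subjects")
        # Solve the 2x2 Newton system explicitly.
        step_a = (i_bb * u_a - i_ab * u_b) / det
        step_b = (i_aa * u_b - i_ab * u_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    # Standard error from the information matrix of the final iteration.
    se_b = math.sqrt(i_aa / det)
    return a, b, se_b


# Toy data (made up): the same outcome and residuals are reused for every SNP.
events = [1, 0, 1, 1, 0, 1]
cox_snell = [0.5, 1.0, 1.5, 0.4, 0.6, 1.0]
snps = {
    "snp_A": [0, 0, 0, 1, 1, 1],
    "snp_B": [0, 1, 2, 0, 1, 2],
}
for name, genotypes in snps.items():
    a, b, se_b = fit_snp_poisson(events, cox_snell, genotypes)
    print(name, "log rate ratio:", round(b, 4), "SE:", round(se_b, 4))
```

Because the residuals are computed once and each per-SNP fit involves only an intercept and a single coefficient, the information matrix is 2x2 and can be inverted in closed form. This is one way to read the description's point that the method avoids repeated high-dimensional matrix inversions.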