An empirical workflow for genome-wide single nucleotide polymorphism-based predictive modeling

Bibliographic Details
Main authors: Floudas, Charalampos S., Balasubramanian, Jeya Balaji, Romkes, Marjorie, Gopalakrishnan, Vanathi
Format: Online Article Text
Language: English
Published: American Medical Informatics Association 2013
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3814469/
https://www.ncbi.nlm.nih.gov/pubmed/24303297
Description
Summary: Technology is constantly evolving, necessitating the development of workflows for efficient use of high-dimensional data. We develop and test an empirical workflow for predictive modeling based on single nucleotide polymorphisms (SNPs) from genome-wide association study (GWAS) datasets. To this end, we use SNP-based prediction of survival in non-small cell lung cancer (NSCLC) with a Bayesian rule learner system (BRL+) as a case study. Lung cancer is a leading cause of mortality. Standard treatment for early stages of NSCLC is surgery. Adjuvant chemotherapy would be beneficial for patients with early recurrence; consequently, we need models capable of such prediction. The workflow spans the steps from the results files of the hybridization experiment to model construction, and outlines the challenges involved in processing GWAS datasets from one popular platform (Affymetrix®). Our results show that the workflow is feasible and efficient for processing such data while also yielding SNP-based models with high predictive accuracy over cross-validation.