An adaptive threshold determination method of feature screening for genomic selection

Bibliographic Details
Main authors: Fu, Guifang; Wang, Gang; Dai, Xiaotian
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5389084/
https://www.ncbi.nlm.nih.gov/pubmed/28403836
http://dx.doi.org/10.1186/s12859-017-1617-9
Description
Summary: BACKGROUND: Although the dimension of the entire genome can be extremely large, only a parsimonious set of influential SNPs is correlated with a particular complex trait and is important to the prediction of the trait. Efficiently and accurately selecting these influential SNPs from millions of candidates is in high demand, but poses challenges. We propose a backward elimination iterative distance correlation (BE-IDC) procedure to select the smallest subset of SNPs that guarantees sufficient prediction accuracy, while also resolving the unclear-threshold issue of traditional feature screening approaches. RESULTS: Verified through six simulations, the adaptive threshold estimated by BE-IDC performed uniformly better than the fixed-threshold methods used in the current literature. We also applied BE-IDC to an Arabidopsis thaliana genome-wide dataset. Out of 216,130 SNPs, BE-IDC selected four influential SNPs, and confirmed the same FRIGIDA gene that was reported by two other traditional methods. CONCLUSIONS: BE-IDC delivers both the prediction accuracy and the computational speed that genomic selection demands. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-017-1617-9) contains supplementary material, which is available to authorized users.
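
For readers unfamiliar with the statistic behind BE-IDC, the following is a minimal sketch of marginal distance-correlation screening, the kind of feature ranking the abstract builds on. It is not the authors' BE-IDC procedure: the backward-elimination and adaptive-threshold steps are not described in this record and are not reproduced here. The function names (`distance_correlation`, `rank_snps`) and the array layout (rows are individuals, columns are SNPs) are assumptions made for illustration.

```python
import numpy as np


def _centered_distances(a):
    """Double-centered pairwise Euclidean distance matrix of the samples in a."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat a 1-D vector as n samples of one feature
    diff = a[:, None, :] - a[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation between x and y (value in [0, 1])."""
    A = _centered_distances(x)
    B = _centered_distances(y)
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0  # a constant variable carries no dependence information
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def rank_snps(genotypes, trait):
    """Rank SNP columns by distance correlation with the trait, strongest first.

    Hypothetical helper: 'genotypes' is an (n_individuals, n_snps) array.
    """
    scores = np.array(
        [distance_correlation(genotypes[:, j], trait) for j in range(genotypes.shape[1])]
    )
    order = np.argsort(-scores)
    return order, scores


if __name__ == "__main__":
    # Synthetic illustration only: SNP index 3 drives the trait.
    rng = np.random.default_rng(0)
    g = rng.integers(0, 3, size=(200, 20)).astype(float)
    y = 2.0 * g[:, 3] + rng.normal(0, 0.5, size=200)
    order, scores = rank_snps(g, y)
    print("top-ranked SNP indices:", order[:5])
```

A fixed-threshold screen would simply keep the first k entries of `order` for some preset k. The abstract's point is that BE-IDC estimates that cutoff adaptively instead of fixing it in advance.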