
Combining Random Forests and a Signal Detection Method Leads to the Robust Detection of Genotype-Phenotype Associations

Genome-wide association studies (GWAS) are a well-established methodology for identifying genomic variants and genes that are responsible for traits of interest in all branches of the life sciences. Despite the long time this methodology has had to mature, the reliable detection of genotype–phenotype ass...


Bibliographic Details
Main Authors: Ramzan, Faisal; Gültas, Mehmet; Bertram, Hendrik; Cavero, David; Schmitt, Armin Otto
Format: Online Article Text
Language: English
Published: MDPI 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7465705/
https://www.ncbi.nlm.nih.gov/pubmed/32764260
http://dx.doi.org/10.3390/genes11080892
_version_ 1783577647991750656
author Ramzan, Faisal
Gültas, Mehmet
Bertram, Hendrik
Cavero, David
Schmitt, Armin Otto
author_sort Ramzan, Faisal
collection PubMed
description Genome-wide association studies (GWAS) are a well-established methodology for identifying genomic variants and genes that are responsible for traits of interest in all branches of the life sciences. Despite the long time this methodology has had to mature, the reliable detection of genotype–phenotype associations is still a challenge for many quantitative traits, mainly because of the large number of genomic loci with weak individual effects on the trait under investigation. Thus, it can be hypothesized that many genomic variants that have a small but real effect remain unnoticed in many GWAS approaches. Here, we propose a two-step procedure to address this problem. In the first step, cubic splines are fitted to the test statistic values, and genomic regions with spline peaks that are higher than expected by chance are considered quantitative trait loci (QTL). In the second step, the SNPs in these QTLs are prioritized with respect to the strength of their association with the phenotype using a Random Forests approach. As a case study, we apply our procedure to real data sets and find trustworthy numbers of partially novel genomic variants and genes involved in various egg quality traits.
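The two-step procedure summarized in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the fixed-knot least-squares spline, the permutation threshold on the spline maximum, the impurity-based importance ranking, and all parameter values are assumptions made for illustration, and the data are simulated.

```python
import numpy as np
from scipy.interpolate import LSQUnivariateSpline
from sklearn.ensemble import RandomForestRegressor


def fit_cubic_spline(positions, stats, n_knots):
    """Least-squares cubic spline with interior knots at position quantiles."""
    knots = np.quantile(positions, np.linspace(0, 1, n_knots + 2)[1:-1])
    return LSQUnivariateSpline(positions, stats, knots, k=3)(positions)


def spline_peak_mask(positions, stats, n_knots=30, n_perm=200, level=0.95, seed=0):
    """Step 1 (assumed form): flag SNPs whose fitted spline value exceeds
    a threshold estimated by permutation."""
    rng = np.random.default_rng(seed)
    fitted = fit_cubic_spline(positions, stats, n_knots)
    # Null distribution: shuffle the statistics along the chromosome,
    # refit the spline, and record the highest value of each refit.
    null_max = [
        fit_cubic_spline(positions, rng.permutation(stats), n_knots).max()
        for _ in range(n_perm)
    ]
    threshold = np.quantile(null_max, level)
    return fitted > threshold, threshold


def rank_snps(genotypes, phenotype, snp_ids, seed=0):
    """Step 2 (assumed form): order candidate SNPs by Random Forest importance."""
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(genotypes, phenotype)
    order = np.argsort(forest.feature_importances_)[::-1]
    return [snp_ids[i] for i in order]


# Simulated demo data (hypothetical, not taken from the paper).
rng = np.random.default_rng(1)
positions = np.arange(600, dtype=float)
# Noisy test statistics with one broad signal centred on SNP 300.
stats = rng.normal(size=600) + 8.0 * np.exp(-0.5 * ((positions - 300) / 15) ** 2)
mask, threshold = spline_peak_mask(positions, stats)

# 200 individuals, 20 candidate SNPs coded as 0/1/2 allele counts;
# only SNP 3 influences the simulated phenotype.
genotypes = rng.integers(0, 3, size=(200, 20))
phenotype = 2.0 * genotypes[:, 3] + rng.normal(size=200)
ranking = rank_snps(genotypes, phenotype, list(range(20)))
```

In a real analysis, only the SNPs flagged by `mask` would be passed on to `rank_snps`. The two demo data sets are kept separate here so that each step can be read on its own.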
format Online
Article
Text
id pubmed-7465705
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-74657052020-09-04 Combining Random Forests and a Signal Detection Method Leads to the Robust Detection of Genotype-Phenotype Associations Ramzan, Faisal Gültas, Mehmet Bertram, Hendrik Cavero, David Schmitt, Armin Otto Genes (Basel) Article Genome wide association studies (GWAS) are a well established methodology to identify genomic variants and genes that are responsible for traits of interest in all branches of the life sciences. Despite the long time this methodology has had to mature the reliable detection of genotype–phenotype associations is still a challenge for many quantitative traits mainly because of the large number of genomic loci with weak individual effects on the trait under investigation. Thus, it can be hypothesized that many genomic variants that have a small, however real, effect remain unnoticed in many GWAS approaches. Here, we propose a two-step procedure to address this problem. In a first step, cubic splines are fitted to the test statistic values and genomic regions with spline-peaks that are higher than expected by chance are considered as quantitative trait loci (QTL). Then the SNPs in these QTLs are prioritized with respect to the strength of their association with the phenotype using a Random Forests approach. As a case study, we apply our procedure to real data sets and find trustworthy numbers of, partially novel, genomic variants and genes involved in various egg quality traits. MDPI 2020-08-05 /pmc/articles/PMC7465705/ /pubmed/32764260 http://dx.doi.org/10.3390/genes11080892 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Combining Random Forests and a Signal Detection Method Leads to the Robust Detection of Genotype-Phenotype Associations
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7465705/
https://www.ncbi.nlm.nih.gov/pubmed/32764260
http://dx.doi.org/10.3390/genes11080892