Combining Random Forests and a Signal Detection Method Leads to the Robust Detection of Genotype-Phenotype Associations
Genome-wide association studies (GWAS) are a well-established methodology to identify genomic variants and genes that are responsible for traits of interest in all branches of the life sciences. Despite the long time this methodology has had to mature, the reliable detection of genotype–phenotype ass...
Main Authors: | Ramzan, Faisal; Gültas, Mehmet; Bertram, Hendrik; Cavero, David; Schmitt, Armin Otto |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7465705/ https://www.ncbi.nlm.nih.gov/pubmed/32764260 http://dx.doi.org/10.3390/genes11080892 |
_version_ | 1783577647991750656 |
---|---|
author | Ramzan, Faisal Gültas, Mehmet Bertram, Hendrik Cavero, David Schmitt, Armin Otto |
collection | PubMed |
description | Genome-wide association studies (GWAS) are a well-established methodology to identify genomic variants and genes that are responsible for traits of interest in all branches of the life sciences. Despite the long time this methodology has had to mature, the reliable detection of genotype–phenotype associations is still a challenge for many quantitative traits, mainly because of the large number of genomic loci with weak individual effects on the trait under investigation. Thus, it can be hypothesized that many genomic variants that have a small but real effect remain unnoticed in many GWAS approaches. Here, we propose a two-step procedure to address this problem. In the first step, cubic splines are fitted to the test statistic values, and genomic regions with spline peaks that are higher than expected by chance are considered quantitative trait loci (QTL). Then, the SNPs in these QTLs are prioritized with respect to the strength of their association with the phenotype using a Random Forests approach. As a case study, we apply our procedure to real data sets and find trustworthy numbers of genomic variants and genes, some of them novel, involved in various egg quality traits. |
format | Online Article Text |
id | pubmed-7465705 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7465705 2020-09-04 Genes (Basel) Article MDPI 2020-08-05 /pmc/articles/PMC7465705/ /pubmed/32764260 http://dx.doi.org/10.3390/genes11080892 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | Combining Random Forests and a Signal Detection Method Leads to the Robust Detection of Genotype-Phenotype Associations |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7465705/ https://www.ncbi.nlm.nih.gov/pubmed/32764260 http://dx.doi.org/10.3390/genes11080892 |