Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach
BACKGROUND: Genome wide association studies are now widely used in the livestock sector to estimate the association between single nucleotide polymorphisms (SNPs) distributed across the whole genome and one or more traits. As computational power increases, the use of machine learning techniques to anal...
Main authors: | Minozzi, Giulietta; Pedretti, Andrea; Biffani, Stefano; Nicolazzi, Ezequiel Luis; Stella, Alessandra |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4195406/ https://www.ncbi.nlm.nih.gov/pubmed/25519518 http://dx.doi.org/10.1186/1753-6561-8-S5-S4 |
_version_ | 1782339309219610624 |
---|---|
author | Minozzi, Giulietta; Pedretti, Andrea; Biffani, Stefano; Nicolazzi, Ezequiel Luis; Stella, Alessandra |
author_facet | Minozzi, Giulietta; Pedretti, Andrea; Biffani, Stefano; Nicolazzi, Ezequiel Luis; Stella, Alessandra |
author_sort | Minozzi, Giulietta |
collection | PubMed |
description | BACKGROUND: Genome wide association studies are now widely used in the livestock sector to estimate the association between single nucleotide polymorphisms (SNPs) distributed across the whole genome and one or more traits. As computational power increases, the use of machine learning techniques to analyze large genome wide datasets becomes possible. METHODS: The objective of this study was to identify SNPs associated with the three traits simulated in the 16th QTL-MAS workshop dataset using the Random Forest (RF) approach. The approach was applied to single- and multiple-trait estimated breeding values and to yield deviations, and its results were compared with those of the GRAMMAR-CG method. RESULTS: Both QTL mapping methods, GRAMMAR-CG and RF, successfully identified the main QTLs for trait 1 on chromosomes 1 and 4, for trait 2 on chromosomes 1, 4 and 5, and for trait 3 on chromosomes 1, 2 and 3. CONCLUSIONS: The results of the RF approach were confirmed by the GRAMMAR-CG method and validated against the actual QTL positions, even though the two methods handle cryptic genetic structure differently. Furthermore, the two methods showed complementary findings. However, when the variance explained by a QTL was low, both failed to detect significant associations. |
format | Online Article Text |
id | pubmed-4195406 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4195406 2014-11-05 Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach Minozzi, Giulietta; Pedretti, Andrea; Biffani, Stefano; Nicolazzi, Ezequiel Luis; Stella, Alessandra BMC Proc Proceedings BACKGROUND: Genome wide association studies are now widely used in the livestock sector to estimate the association between single nucleotide polymorphisms (SNPs) distributed across the whole genome and one or more traits. As computational power increases, the use of machine learning techniques to analyze large genome wide datasets becomes possible. METHODS: The objective of this study was to identify SNPs associated with the three traits simulated in the 16th QTL-MAS workshop dataset using the Random Forest (RF) approach. The approach was applied to single- and multiple-trait estimated breeding values and to yield deviations, and its results were compared with those of the GRAMMAR-CG method. RESULTS: Both QTL mapping methods, GRAMMAR-CG and RF, successfully identified the main QTLs for trait 1 on chromosomes 1 and 4, for trait 2 on chromosomes 1, 4 and 5, and for trait 3 on chromosomes 1, 2 and 3. CONCLUSIONS: The results of the RF approach were confirmed by the GRAMMAR-CG method and validated against the actual QTL positions, even though the two methods handle cryptic genetic structure differently. Furthermore, the two methods showed complementary findings. However, when the variance explained by a QTL was low, both failed to detect significant associations. BioMed Central 2014-10-07 /pmc/articles/PMC4195406/ /pubmed/25519518 http://dx.doi.org/10.1186/1753-6561-8-S5-S4 Text en Copyright © 2014 Minozzi et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Proceedings Minozzi, Giulietta; Pedretti, Andrea; Biffani, Stefano; Nicolazzi, Ezequiel Luis; Stella, Alessandra Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach |
title | Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach |
title_full | Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach |
title_fullStr | Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach |
title_full_unstemmed | Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach |
title_short | Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach |
title_sort | genome wide association analysis of the 16th qtl-mas workshop dataset using the random forest machine learning approach |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4195406/ https://www.ncbi.nlm.nih.gov/pubmed/25519518 http://dx.doi.org/10.1186/1753-6561-8-S5-S4 |