
Comparing Different Statistical Models and Multiple Testing Corrections for Association Mapping in Soybean and Maize

Association mapping (AM) is a powerful tool for fine mapping complex trait variation down to nucleotide sequences by exploiting historical recombination events. A major problem in AM is controlling false positives that can arise from population structure and family relatedness. False positives are o...


Bibliographic Details
Main Authors: Kaler, Avjinder S., Gillman, Jason D., Beissinger, Timothy, Purcell, Larry C.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7052329/
https://www.ncbi.nlm.nih.gov/pubmed/32158452
http://dx.doi.org/10.3389/fpls.2019.01794
author Kaler, Avjinder S.
Gillman, Jason D.
Beissinger, Timothy
Purcell, Larry C.
collection PubMed
description Association mapping (AM) is a powerful tool for fine mapping complex trait variation down to nucleotide sequences by exploiting historical recombination events. A major problem in AM is controlling false positives that can arise from population structure and family relatedness. False positives are often controlled by incorporating covariates for structure and kinship in mixed linear models (MLM). These MLM-based methods are single-locus models and can introduce false negatives due to overfitting of the model. In this study, eight statistical models, ranging from single-locus to multilocus, were compared for AM for three traits differing in heritability in two crop species: soybean (Glycine max L.) and maize (Zea mays L.). Soybean and maize were chosen, in part, because of their markedly different rates of linkage disequilibrium (LD) decay, which can influence false positive and false negative rates. The fixed and random model circulating probability unification (FarmCPU) model performed better than the other models based on an analysis of Q-Q plots and on the identification of the known number of quantitative trait loci (QTLs) in a simulated data set. These results indicate that FarmCPU controls both false positives and false negatives. Six qualitative traits in soybean with known published genomic positions were also used to compare these models, and the results indicated that FarmCPU consistently identified a single highly significant SNP closest to these known published genes. Multiple comparison adjustments (Bonferroni, false discovery rate, and positive false discovery rate) were compared for these models using a simulated trait having 60% heritability and 20 QTLs. Multiple comparison adjustments were overly conservative for MLM, CMLM, ECMLM, and MLMM and did not find any significant markers; in contrast, the ANOVA, GLM, and SUPER models found an excessive number of markers, far more than 20 QTLs. The FarmCPU model, using the less conservative methods (false discovery rate and positive false discovery rate), identified 10 QTLs, which was closer to the simulated number of QTLs than the number found by the other models.
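To make the contrast between the Bonferroni and false discovery rate adjustments mentioned in the description concrete, here is a minimal sketch in Python. It is not the authors' code, and this record does not state which software they used. The false discovery rate is implemented here with the standard Benjamini-Hochberg step-up procedure, which is an assumption about the variant meant; the positive false discovery rate is not shown. The function names, the `alpha` value of 0.05, and the p-values are hypothetical and chosen only for illustration.

```python
def bonferroni(pvalues, alpha=0.05):
    """Return indices of tests that pass the Bonferroni threshold alpha / m."""
    m = len(pvalues)
    return [i for i, p in enumerate(pvalues) if p <= alpha / m]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return indices of tests declared significant by the
    Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(pvalues)
    # Rank the tests from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is declared significant.
    return sorted(order[:cutoff_rank])


# Hypothetical marker p-values, for illustration only (not data from the study).
pvals = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]

bonf_hits = bonferroni(pvals)
bh_hits = benjamini_hochberg(pvals)
print("Bonferroni:", bonf_hits)
print("Benjamini-Hochberg:", bh_hits)
```

With these made-up values, Bonferroni keeps only the smallest p-value while Benjamini-Hochberg keeps the three smallest, which mirrors the description's point that the false discovery rate methods are the less conservative adjustments.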
format Online
Article
Text
id pubmed-7052329
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7052329 2020-03-10 Comparing Different Statistical Models and Multiple Testing Corrections for Association Mapping in Soybean and Maize Kaler, Avjinder S. Gillman, Jason D. Beissinger, Timothy Purcell, Larry C. Front Plant Sci Plant Science Frontiers Media S.A. 2020-02-25 /pmc/articles/PMC7052329/ /pubmed/32158452 http://dx.doi.org/10.3389/fpls.2019.01794 Text en Copyright © 2020 Kaler, Gillman, Beissinger and Purcell http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Comparing Different Statistical Models and Multiple Testing Corrections for Association Mapping in Soybean and Maize
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7052329/
https://www.ncbi.nlm.nih.gov/pubmed/32158452
http://dx.doi.org/10.3389/fpls.2019.01794