Association studies using family pools of outcrossing crops based on allele-frequency estimates from DNA sequencing

KEY MESSAGE: We propose a method in which GBS data can be conveniently analyzed without calling genotypes. ABSTRACT: F2 families are frequently used in breeding of outcrossing species, for instance to obtain trait measurements on plots. We propose to perform association studies by obtaining a matchi...

Bibliographic Details
Main authors: Ashraf, Bilal H., Jensen, Just, Asp, Torben, Janss, Luc L.
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4035547/
https://www.ncbi.nlm.nih.gov/pubmed/24668443
http://dx.doi.org/10.1007/s00122-014-2300-4
_version_ 1782318060847235072
author Ashraf, Bilal H.
Jensen, Just
Asp, Torben
Janss, Luc L.
collection PubMed
description KEY MESSAGE: We propose a method in which GBS data can be conveniently analyzed without calling genotypes. ABSTRACT: F2 families are frequently used in breeding of outcrossing species, for instance to obtain trait measurements on plots. We propose to perform association studies by obtaining a matching “family genotype” from sequencing a pooled sample of the family, and to directly use allele frequencies computed from sequence read-counts for mapping. We show that, under additivity assumptions, there is a linear relationship between the family phenotype and family allele frequency, and that a regression of family phenotype on family allele frequency will estimate twice the allele substitution effect at a locus. However, medium-to-low sequencing depth causes underestimation of the true allele substitution effect. An expression for this underestimation is derived for the case that parents are diploid, such that F2 families have up to four dosages of every allele. Using simulation studies, estimation of the allele effect from F2-family pools was verified and it was shown that the underestimation of the allele effect is correctly described. The optimal design for an association study when sequencing budget would be fixed is obtained using large sample size and lower sequence depth, and using higher SNP density (resulting in higher LD with causative mutations) and lower sequencing depth. Therefore, association studies using genotyping by sequencing are optimal and use low sequencing depth per sample. The developed framework for association studies using allele frequencies from sequencing can be modified for other types of family pools and is also directly applicable for association studies in polyploids. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00122-014-2300-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4035547
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-4035547 2014-05-29 Association studies using family pools of outcrossing crops based on allele-frequency estimates from DNA sequencing. Ashraf, Bilal H.; Jensen, Just; Asp, Torben; Janss, Luc L. Theor Appl Genet. Original Paper. Springer Berlin Heidelberg 2014-03-26 2014 /pmc/articles/PMC4035547/ /pubmed/24668443 http://dx.doi.org/10.1007/s00122-014-2300-4 Text en © The Author(s) 2014 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
title Association studies using family pools of outcrossing crops based on allele-frequency estimates from DNA sequencing
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4035547/
https://www.ncbi.nlm.nih.gov/pubmed/24668443
http://dx.doi.org/10.1007/s00122-014-2300-4
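The regression described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' code: the function name `simulate_slopes`, all parameter values, and the binomial read-sampling model (reads drawn independently from a large, evenly mixed family pool) are hypothetical choices made for illustration. The attenuation factor computed below is the generic errors-in-variables result for a binomially sampled regressor, not the expression derived in the paper.

```python
import numpy as np


def simulate_slopes(n_families=20000, depth=4, alpha=1.0,
                    parent_freq=0.5, noise_sd=0.5, seed=0):
    """Regress family phenotype on true and read-based allele frequency.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Two diploid parents carry four alleles in total, so the family
    # allele frequency takes values 0, 0.25, 0.5, 0.75 or 1.
    dosage = rng.binomial(4, parent_freq, size=n_families)
    p_true = dosage / 4.0

    # Additive model: an individual carries on average 2 * p copies of
    # the allele, so the family mean phenotype is 2 * alpha * p + noise.
    y = 2.0 * alpha * p_true + rng.normal(0.0, noise_sd, size=n_families)

    # Assumed read model: each of `depth` reads samples the allele with
    # probability p_true (large pool, equal contribution per plant).
    reads = rng.binomial(depth, p_true)
    p_hat = reads / depth

    # np.polyfit returns the highest-degree coefficient first,
    # so index 0 is the regression slope.
    slope_true = np.polyfit(p_true, y, 1)[0]
    slope_obs = np.polyfit(p_hat, y, 1)[0]

    # Generic attenuation for a noisy regressor:
    # var(p) / (var(p) + E[p(1 - p)] / depth).
    var_p = p_true.var()
    sampling_var = np.mean(p_true * (1.0 - p_true)) / depth
    slope_expected = 2.0 * alpha * var_p / (var_p + sampling_var)

    return slope_true, slope_obs, slope_expected


slope_true, slope_obs, slope_expected = simulate_slopes()
print("slope on true frequency:      ", round(slope_true, 3))
print("slope on read-based frequency:", round(slope_obs, 3))
print("expected attenuated slope:    ", round(slope_expected, 3))
```

In this sketch the slope on the true family frequency should land near twice the simulated allele effect, while the slope on the read-based frequency is shrunk toward zero. Raising `depth` reduces the shrinkage, which is the trade-off between depth and sample size that the abstract discusses.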