
Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass

Genotyping by sequencing allows for large-scale genetic analyses in plant species with no reference genome, but poses the challenge of sound inference in the presence of uncertain genotypes. We report an imputation-based genome-wide association study (GWAS) in reed canarygrass (Phalaris arundinacea L., P...


Bibliographic Details
Main authors: Ramstein, Guillaume P., Lipka, Alexander E., Lu, Fei, Costich, Denise E., Cherney, Jerome H., Buckler, Edward S., Casler, Michael D.
Format: Online Article Text
Language: English
Published: Genetics Society of America 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4426374/
https://www.ncbi.nlm.nih.gov/pubmed/25770100
http://dx.doi.org/10.1534/g3.115.017533
author Ramstein, Guillaume P.
Lipka, Alexander E.
Lu, Fei
Costich, Denise E.
Cherney, Jerome H.
Buckler, Edward S.
Casler, Michael D.
author_sort Ramstein, Guillaume P.
collection PubMed
description Genotyping by sequencing allows for large-scale genetic analyses in plant species with no reference genome, but poses the challenge of sound inference in the presence of uncertain genotypes. We report an imputation-based genome-wide association study (GWAS) in reed canarygrass (Phalaris arundinacea L., Phalaris caesia Nees), a cool-season grass species with potential as a biofuel crop. Our study involved two linkage populations and an association panel of 590 reed canarygrass genotypes. Plants were assayed for up to 5228 single nucleotide polymorphism markers and 35 traits. The genotypic markers were derived from low-depth sequencing with 78% missing data on average. To soundly infer marker-trait associations, multiple imputation (MI) was used: several imputes of the marker data were generated to reflect imputation uncertainty, and association tests were performed on marker effects across imputes. A total of nine significant markers were identified, three of which showed significant homology with the Brachypodium distachyon genome. Because no physical map of the reed canarygrass genome was available, imputation was conducted using classification trees. In general, MI showed good consistency with the complete-case analysis and adequate control over imputation uncertainty. A gain in significance of marker effects was achieved through MI, but only in rare cases when missing data were <45%. In addition to providing insight into the genetic basis of important traits in reed canarygrass, this study presents one of the first applications of MI to genome-wide analyses and provides useful guidelines for conducting GWAS based on genotyping-by-sequencing data.
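The abstract describes testing marker effects across several imputed copies of the marker data. A minimal sketch of how per-impute results can be combined is given below, using Rubin's rules, the standard pooling method for multiple imputation. The abstract does not say which pooling rule the authors used, so this is an assumption, and the function name `pool_rubin` and its inputs are hypothetical illustrations rather than the study's code.

```python
from math import inf
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one marker's effect across m imputed data sets (Rubin's rules).

    estimates: effect estimate from each imputed data set
    variances: squared standard error of each estimate
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputes and matching lengths")

    q_bar = mean(estimates)   # pooled effect estimate
    u_bar = mean(variances)   # within-imputation variance
    b = variance(estimates)   # between-imputation variance (m - 1 denominator)

    # Total variance: within-imputation part plus inflated between-imputation part.
    total = u_bar + (1 + 1 / m) * b

    # Rubin's degrees of freedom; infinite when the imputes agree exactly.
    if b == 0:
        df = inf
    else:
        df = (m - 1) * (1 + u_bar / ((1 + 1 / m) * b)) ** 2
    return q_bar, total, df
```

The pooled test statistic is then the pooled estimate divided by the square root of the total variance, compared against a t distribution with the returned degrees of freedom. When imputes disagree strongly, the between-imputation term grows and the test becomes more conservative, which is how imputation uncertainty enters the inference.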
format Online
Article
Text
id pubmed-4426374
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-4426374 2015-05-13 G3 (Bethesda) Investigations Genetics Society of America 2015-03-12 /pmc/articles/PMC4426374/ /pubmed/25770100 http://dx.doi.org/10.1534/g3.115.017533 Text en Copyright © 2015 Ramstein et al. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass
topic Investigations