Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass
Genotyping by sequencing allows for large-scale genetic analyses in plant species with no reference genome, but sets the challenge of sound inference in the presence of uncertain genotypes. We report an imputation-based genome-wide association study (GWAS) in reed canarygrass (Phalaris arundinacea L., P...
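Main Authors: | Ramstein, Guillaume P.; Lipka, Alexander E.; Lu, Fei; Costich, Denise E.; Cherney, Jerome H.; Buckler, Edward S.; Casler, Michael D. |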
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Genetics Society of America 2015 |
Subjects: | Investigations |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4426374/ https://www.ncbi.nlm.nih.gov/pubmed/25770100 http://dx.doi.org/10.1534/g3.115.017533 |
_version_ | 1782370585737691136 |
---|---|
author | Ramstein, Guillaume P.; Lipka, Alexander E.; Lu, Fei; Costich, Denise E.; Cherney, Jerome H.; Buckler, Edward S.; Casler, Michael D. |
author_facet | Ramstein, Guillaume P.; Lipka, Alexander E.; Lu, Fei; Costich, Denise E.; Cherney, Jerome H.; Buckler, Edward S.; Casler, Michael D. |
author_sort | Ramstein, Guillaume P. |
collection | PubMed |
description | Genotyping by sequencing allows for large-scale genetic analyses in plant species with no reference genome, but sets the challenge of sound inference in the presence of uncertain genotypes. We report an imputation-based genome-wide association study (GWAS) in reed canarygrass (Phalaris arundinacea L., Phalaris caesia Nees), a cool-season grass species with potential as a biofuel crop. Our study involved two linkage populations and an association panel of 590 reed canarygrass genotypes. Plants were assayed for up to 5228 single nucleotide polymorphism markers and 35 traits. The genotypic markers were derived from low-depth sequencing with 78% missing data on average. To soundly infer marker-trait associations, multiple imputation (MI) was used: several imputes of the marker data were generated to reflect imputation uncertainty and association tests were performed on marker effects across imputes. A total of nine significant markers were identified, three of which showed significant homology with the Brachypodium distachyon genome. Because no physical map of the reed canarygrass genome was available, imputation was conducted using classification trees. In general, MI showed good consistency with the complete-case analysis and adequate control over imputation uncertainty. A gain in significance of marker effects was achieved through MI, but only for rare cases when missing data were <45%. In addition to providing insight into the genetic basis of important traits in reed canarygrass, this study presents one of the first applications of MI to genome-wide analyses and provides useful guidelines for conducting GWAS based on genotyping-by-sequencing data. |
format | Online Article Text |
id | pubmed-4426374 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-4426374 2015-05-13 Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass Ramstein, Guillaume P. Lipka, Alexander E. Lu, Fei Costich, Denise E. Cherney, Jerome H. Buckler, Edward S. Casler, Michael D. G3 (Bethesda) Investigations Genotyping by sequencing allows for large-scale genetic analyses in plant species with no reference genome, but sets the challenge of sound inference in the presence of uncertain genotypes. We report an imputation-based genome-wide association study (GWAS) in reed canarygrass (Phalaris arundinacea L., Phalaris caesia Nees), a cool-season grass species with potential as a biofuel crop. Our study involved two linkage populations and an association panel of 590 reed canarygrass genotypes. Plants were assayed for up to 5228 single nucleotide polymorphism markers and 35 traits. The genotypic markers were derived from low-depth sequencing with 78% missing data on average. To soundly infer marker-trait associations, multiple imputation (MI) was used: several imputes of the marker data were generated to reflect imputation uncertainty and association tests were performed on marker effects across imputes. A total of nine significant markers were identified, three of which showed significant homology with the Brachypodium distachyon genome. Because no physical map of the reed canarygrass genome was available, imputation was conducted using classification trees. In general, MI showed good consistency with the complete-case analysis and adequate control over imputation uncertainty. A gain in significance of marker effects was achieved through MI, but only for rare cases when missing data were <45%. In addition to providing insight into the genetic basis of important traits in reed canarygrass, this study presents one of the first applications of MI to genome-wide analyses and provides useful guidelines for conducting GWAS based on genotyping-by-sequencing data. Genetics Society of America 2015-03-12 /pmc/articles/PMC4426374/ /pubmed/25770100 http://dx.doi.org/10.1534/g3.115.017533 Text en Copyright © 2015 Ramstein et al. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Investigations Ramstein, Guillaume P. Lipka, Alexander E. Lu, Fei Costich, Denise E. Cherney, Jerome H. Buckler, Edward S. Casler, Michael D. Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass |
title | Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass |
title_full | Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass |
title_fullStr | Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass |
title_full_unstemmed | Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass |
title_short | Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass |
title_sort | genome-wide association study based on multiple imputation with low-depth sequencing data: application to biofuel traits in reed canarygrass |
topic | Investigations |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4426374/ https://www.ncbi.nlm.nih.gov/pubmed/25770100 http://dx.doi.org/10.1534/g3.115.017533 |
work_keys_str_mv | AT ramsteinguillaumep genomewideassociationstudybasedonmultipleimputationwithlowdepthsequencingdataapplicationtobiofueltraitsinreedcanarygrass AT lipkaalexandere genomewideassociationstudybasedonmultipleimputationwithlowdepthsequencingdataapplicationtobiofueltraitsinreedcanarygrass AT lufei genomewideassociationstudybasedonmultipleimputationwithlowdepthsequencingdataapplicationtobiofueltraitsinreedcanarygrass AT costichdenisee genomewideassociationstudybasedonmultipleimputationwithlowdepthsequencingdataapplicationtobiofueltraitsinreedcanarygrass AT cherneyjeromeh genomewideassociationstudybasedonmultipleimputationwithlowdepthsequencingdataapplicationtobiofueltraitsinreedcanarygrass AT buckleredwards genomewideassociationstudybasedonmultipleimputationwithlowdepthsequencingdataapplicationtobiofueltraitsinreedcanarygrass AT caslermichaeld genomewideassociationstudybasedonmultipleimputationwithlowdepthsequencingdataapplicationtobiofueltraitsinreedcanarygrass |
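
The description field says that association tests were performed on marker effects across several imputes of the marker data. This record does not state which pooling rule the authors used. As an illustration only, the sketch below uses Rubin's rules, a common way to combine per-imputation estimates. The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the article.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one marker effect across m imputed data sets (Rubin's rules).

    estimates: per-imputation effect estimates (hypothetical inputs)
    variances: per-imputation squared standard errors
    Returns (pooled estimate, total variance, degrees of freedom).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)       # pooled point estimate
    within = mean(variances)      # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between

    if between == 0:
        # All imputations agree, so no extra uncertainty from imputation.
        df = math.inf
    else:
        # Relative increase in variance due to missing data.
        r = (1 + 1 / m) * between / within
        df = (m - 1) * (1 + 1 / r) ** 2
    return q_bar, total, df


if __name__ == "__main__":
    # Made-up numbers for a single marker across three imputes.
    estimate, total_var, df = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    t_stat = estimate / math.sqrt(total_var)
    print(estimate, total_var, df, t_stat)
```

The between-imputation term is what lets the test reflect imputation uncertainty: when the imputes disagree about a marker's effect, the total variance grows and the pooled test becomes more conservative.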