Imputation of 3 million SNPs in the Arabidopsis regional mapping population
Natural variation has become a prime resource to identify genetic variants that contribute to phenotypic variation. The regional mapping (RegMap) population is one of the most important populations for studying natural variation in Arabidopsis thaliana, and has been used in a large number of associa...
Main authors: | Arouisse, Bader, Korte, Arthur, van Eeuwijk, Fred, Kruijer, Willem |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2020 |
Subjects: | Resource |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7318218/ https://www.ncbi.nlm.nih.gov/pubmed/31856318 http://dx.doi.org/10.1111/tpj.14659 |
_version_ | 1783550796469633024 |
---|---|
author | Arouisse, Bader Korte, Arthur van Eeuwijk, Fred Kruijer, Willem |
author_sort | Arouisse, Bader |
collection | PubMed |
description | Natural variation has become a prime resource to identify genetic variants that contribute to phenotypic variation. The regional mapping (RegMap) population is one of the most important populations for studying natural variation in Arabidopsis thaliana, and has been used in a large number of association studies and in studies on climatic adaptation. However, only 413 RegMap accessions have been completely sequenced, as part of the 1001 Genomes (1001G) Project, while the remaining 894 accessions have only been genotyped with the Affymetrix 250k chip. As a consequence, most association studies involving the RegMap are either restricted to the sequenced accessions, reducing power, or rely on a limited set of SNPs. Here we impute millions of SNPs to the 894 accessions that are exclusive to the RegMap, using the 1135 accessions of the 1001G Project as the reference panel. We assess imputation accuracy using a novel cross‐validation scheme, which we show provides a more reliable measure of accuracy than existing methods. After filtering out low accuracy SNPs, we obtain high‐quality genotypic information for 2029 accessions and 3 million markers. To illustrate the benefits of these imputed data, we reconducted genome‐wide association studies on five stress‐related traits and could identify novel candidate genes. |
format | Online Article Text |
id | pubmed-7318218 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-73182182020-06-29 Imputation of 3 million SNPs in the Arabidopsis regional mapping population Arouisse, Bader Korte, Arthur van Eeuwijk, Fred Kruijer, Willem Plant J Resource Natural variation has become a prime resource to identify genetic variants that contribute to phenotypic variation. The regional mapping (RegMap) population is one of the most important populations for studying natural variation in Arabidopsis thaliana, and has been used in a large number of association studies and in studies on climatic adaptation. However, only 413 RegMap accessions have been completely sequenced, as part of the 1001 Genomes (1001G) Project, while the remaining 894 accessions have only been genotyped with the Affymetrix 250k chip. As a consequence, most association studies involving the RegMap are either restricted to the sequenced accessions, reducing power, or rely on a limited set of SNPs. Here we impute millions of SNPs to the 894 accessions that are exclusive to the RegMap, using the 1135 accessions of the 1001G Project as the reference panel. We assess imputation accuracy using a novel cross‐validation scheme, which we show provides a more reliable measure of accuracy than existing methods. After filtering out low accuracy SNPs, we obtain high‐quality genotypic information for 2029 accessions and 3 million markers. To illustrate the benefits of these imputed data, we reconducted genome‐wide association studies on five stress‐related traits and could identify novel candidate genes. John Wiley and Sons Inc. 2020-02-11 2020-05 /pmc/articles/PMC7318218/ /pubmed/31856318 http://dx.doi.org/10.1111/tpj.14659 Text en © 2019 The Authors The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
title | Imputation of 3 million SNPs in the Arabidopsis regional mapping population |
topic | Resource |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7318218/ https://www.ncbi.nlm.nih.gov/pubmed/31856318 http://dx.doi.org/10.1111/tpj.14659 |