Imputation of 3 million SNPs in the Arabidopsis regional mapping population

Natural variation has become a prime resource to identify genetic variants that contribute to phenotypic variation. The regional mapping (RegMap) population is one of the most important populations for studying natural variation in Arabidopsis thaliana, and has been used in a large number of associa...

Bibliographic Details
Main Authors: Arouisse, Bader, Korte, Arthur, van Eeuwijk, Fred, Kruijer, Willem
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7318218/
https://www.ncbi.nlm.nih.gov/pubmed/31856318
http://dx.doi.org/10.1111/tpj.14659
author Arouisse, Bader
Korte, Arthur
van Eeuwijk, Fred
Kruijer, Willem
collection PubMed
description Natural variation has become a prime resource to identify genetic variants that contribute to phenotypic variation. The regional mapping (RegMap) population is one of the most important populations for studying natural variation in Arabidopsis thaliana, and has been used in a large number of association studies and in studies on climatic adaptation. However, only 413 RegMap accessions have been completely sequenced, as part of the 1001 Genomes (1001G) Project, while the remaining 894 accessions have only been genotyped with the Affymetrix 250k chip. As a consequence, most association studies involving the RegMap are either restricted to the sequenced accessions, reducing power, or rely on a limited set of SNPs. Here we impute millions of SNPs to the 894 accessions that are exclusive to the RegMap, using the 1135 accessions of the 1001G Project as the reference panel. We assess imputation accuracy using a novel cross‐validation scheme, which we show provides a more reliable measure of accuracy than existing methods. After filtering out low accuracy SNPs, we obtain high‐quality genotypic information for 2029 accessions and 3 million markers. To illustrate the benefits of these imputed data, we reconducted genome‐wide association studies on five stress‐related traits and could identify novel candidate genes.
format Online
Article
Text
id pubmed-7318218
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7318218 2020-06-29 Imputation of 3 million SNPs in the Arabidopsis regional mapping population Arouisse, Bader; Korte, Arthur; van Eeuwijk, Fred; Kruijer, Willem Plant J Resource John Wiley and Sons Inc. 2020-02-11 2020-05 /pmc/articles/PMC7318218/ /pubmed/31856318 http://dx.doi.org/10.1111/tpj.14659 Text en © 2019 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Imputation of 3 million SNPs in the Arabidopsis regional mapping population
topic Resource
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7318218/
https://www.ncbi.nlm.nih.gov/pubmed/31856318
http://dx.doi.org/10.1111/tpj.14659