
A linkage disequilibrium-based approach to position unmapped SNPs in crop species

BACKGROUND: High-density SNP arrays are now available for a wide range of crop species. Despite the development of many tools for generating genetic maps, the genome position of many SNPs from these arrays is unknown. Here we propose a linkage disequilibrium (LD)-based algorithm to allocate unassign...


Bibliographic Details
Main authors: Yadav, Seema, Ross, Elizabeth M., Aitken, Karen S., Hickey, Lee T., Powell, Owen, Wei, Xianming, Voss-Fels, Kai P., Hayes, Ben J.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8555328/
https://www.ncbi.nlm.nih.gov/pubmed/34715779
http://dx.doi.org/10.1186/s12864-021-08116-w
_version_ 1784591956115980288
author Yadav, Seema
Ross, Elizabeth M.
Aitken, Karen S.
Hickey, Lee T.
Powell, Owen
Wei, Xianming
Voss-Fels, Kai P.
Hayes, Ben J.
author_sort Yadav, Seema
collection PubMed
description BACKGROUND: High-density SNP arrays are now available for a wide range of crop species. Despite the development of many tools for generating genetic maps, the genome position of many SNPs from these arrays is unknown. Here we propose a linkage disequilibrium (LD)-based algorithm to allocate unassigned SNPs to chromosome regions from sparse genetic maps. This algorithm was tested on sugarcane, wheat, and barley data sets. We calculated the algorithm’s efficiency by masking SNPs with known locations, then assigning their position to the map with the algorithm, and finally comparing the assigned and true positions. RESULTS: In the 20-fold cross-validation, the mean proportion of masked mapped SNPs that were placed by the algorithm to a chromosome was 89.53, 94.25, and 97.23% for sugarcane, wheat, and barley, respectively. Of the markers that were placed in the genome, 98.73, 96.45 and 98.53% of the SNPs were positioned on the correct chromosome. The mean correlations between known and new estimated SNP positions were 0.97, 0.98, and 0.97 for sugarcane, wheat, and barley. The LD-based algorithm was used to assign 5920 out of 21,251 unpositioned markers to the current Q208 sugarcane genetic map, representing the highest density genetic map for this species to date. CONCLUSIONS: Our LD-based approach can be used to accurately assign unpositioned SNPs to existing genetic maps, improving genome-wide association studies and genomic prediction in crop species with fragmented and incomplete genome assemblies. This approach will facilitate genomic-assisted breeding for many orphan crops that lack genetic and genomic resources.
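The abstract does not spell out the placement rule, so the following is only a minimal sketch of one plausible reading: an unpositioned SNP is given the chromosome and position of the mapped SNP with which it shows the highest LD (measured here as r², the squared correlation of genotype dosages), provided that r² clears a threshold. The function names, the `min_r2` value, the toy genotypes, and the use of leave-one-out masking (the paper reports 20-fold cross-validation) are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker carries no LD information
    return sxy * sxy / (sxx * syy)


def place_snp(genotypes, mapped, min_r2=0.5):
    """Return (chromosome, position, r2) of the mapped SNP in highest LD
    with `genotypes`, or None if the best r2 is below `min_r2`.

    `mapped` is a list of (chromosome, position, genotype_vector) tuples.
    """
    best = None
    for chrom, pos, g in mapped:
        r2 = r_squared(genotypes, g)
        if best is None or r2 > best[2]:
            best = (chrom, pos, r2)
    if best is None or best[2] < min_r2:
        return None
    return best


def masked_accuracy(mapped, min_r2=0.5):
    """Mask each mapped SNP in turn, re-place it from the remaining SNPs,
    and return (proportion placed, proportion of placed SNPs on the
    correct chromosome). Leave-one-out is a simplification of k-fold."""
    placed = correct = 0
    for i, (chrom, _pos, g) in enumerate(mapped):
        rest = mapped[:i] + mapped[i + 1:]
        hit = place_snp(g, rest, min_r2)
        if hit is not None:
            placed += 1
            correct += hit[0] == chrom
    return placed / len(mapped), (correct / placed if placed else 0.0)


# Hypothetical toy data: four mapped SNPs genotyped in six individuals.
toy_map = [
    ("chr1", 10.0, [0, 1, 2, 0, 1, 2]),
    ("chr1", 12.0, [0, 1, 2, 0, 1, 1]),
    ("chr2", 5.0, [2, 0, 1, 1, 0, 2]),
    ("chr2", 7.0, [2, 0, 1, 1, 0, 1]),
]
unmapped = [0, 1, 2, 0, 2, 2]

print(place_snp(unmapped, toy_map))
print(masked_accuracy(toy_map))
```

Raising `min_r2` trades coverage for accuracy: fewer SNPs are placed, but those that are placed rest on stronger LD evidence. That is the same trade-off the reported "proportion placed" and "proportion on the correct chromosome" figures capture.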
format Online
Article
Text
id pubmed-8555328
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8555328 2021-10-29 A linkage disequilibrium-based approach to position unmapped SNPs in crop species Yadav, Seema Ross, Elizabeth M. Aitken, Karen S. Hickey, Lee T. Powell, Owen Wei, Xianming Voss-Fels, Kai P. Hayes, Ben J. BMC Genomics Research
BioMed Central 2021-10-29 /pmc/articles/PMC8555328/ /pubmed/34715779 http://dx.doi.org/10.1186/s12864-021-08116-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A linkage disequilibrium-based approach to position unmapped SNPs in crop species
topic Research