Evaluation of the Minimum Sampling Design for Population Genomic and Microsatellite Studies: An Analysis Based on Wild Maize

Massive parallel sequencing (MPS) is revolutionizing the field of molecular ecology by allowing us to better understand the evolutionary history of populations and species, and to detect genomic regions that could be under selection. However, the economic and computational resources needed generate...

Bibliographic Details
Main authors: Aguirre-Liguori, Jonás A.; Luna-Sánchez, Javier A.; Gasca-Pineda, Jaime; Eguiarte, Luis E.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7531271/
https://www.ncbi.nlm.nih.gov/pubmed/33193568
http://dx.doi.org/10.3389/fgene.2020.00870
author Aguirre-Liguori, Jonás A.
Luna-Sánchez, Javier A.
Gasca-Pineda, Jaime
Eguiarte, Luis E.
collection PubMed
description Massive parallel sequencing (MPS) is revolutionizing the field of molecular ecology by allowing us to better understand the evolutionary history of populations and species, and to detect genomic regions that could be under selection. However, the economic and computational resources required create a tradeoff between the number of loci that can be obtained and the number of populations or individuals that can be sequenced. In this work, we analyzed and compared two simulated genomic datasets fitting a hierarchical structure, two extensive empirical genomic datasets, and a microsatellite dataset. For all datasets, we generated different subsampling designs by changing the number of loci, individuals, populations, and individuals per population to test for deviations in classic population genetics parameters (H_S, F_IS, F_ST). For the empirical datasets, we also analyzed the effect of sampling design on landscape genetic tests (isolation by distance and environment, central abundance hypothesis). We also tested the effect of sampling different numbers of populations on the detection of outlier SNPs. We found that summary statistics from the microsatellite dataset are very sensitive to the number of individuals sampled. F_IS was particularly sensitive to low sampling of individuals in the simulated, genomic, and microsatellite datasets. For the empirical and simulated genomic datasets, we found that as long as many populations are sampled, few individuals and loci are needed. For the empirical datasets, we found that increasing the number of populations sampled was important for obtaining precise landscape genetic estimates. Finally, we corroborated that outlier tests are sensitive to the number of populations sampled. We conclude by proposing different sampling designs depending on the objectives.
format Online
Article
Text
id pubmed-7531271
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7531271 2020-11-13 Front Genet Genetics Frontiers Media S.A. 2020-09-18 /pmc/articles/PMC7531271/ /pubmed/33193568 http://dx.doi.org/10.3389/fgene.2020.00870 Text en Copyright © 2020 Aguirre-Liguori, Luna-Sánchez, Gasca-Pineda and Eguiarte. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Evaluation of the Minimum Sampling Design for Population Genomic and Microsatellite Studies: An Analysis Based on Wild Maize
topic Genetics