
A simulation study of sample size for DNA barcoding

For some groups of organisms, DNA barcoding can provide a useful tool in taxonomy, evolutionary biology, and biodiversity assessment. However, the efficacy of DNA barcoding depends on the degree of sampling per species, because a large enough sample size is needed to provide a reliable estimate of g...


Bibliographic Details
Main Authors: Luo, Arong; Lan, Haiqiang; Ling, Cheng; Zhang, Aibing; Shi, Lei; Ho, Simon Y. W.; Zhu, Chaodong
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4717336/
https://www.ncbi.nlm.nih.gov/pubmed/26811761
http://dx.doi.org/10.1002/ece3.1846
_version_ 1782410635658657792
author Luo, Arong
Lan, Haiqiang
Ling, Cheng
Zhang, Aibing
Shi, Lei
Ho, Simon Y. W.
Zhu, Chaodong
collection PubMed
description For some groups of organisms, DNA barcoding can provide a useful tool in taxonomy, evolutionary biology, and biodiversity assessment. However, the efficacy of DNA barcoding depends on the degree of sampling per species, because a large enough sample size is needed to provide a reliable estimate of genetic polymorphism and for delimiting species. We used a simulation approach to examine the effects of sample size on four estimators of genetic polymorphism related to DNA barcoding: mismatch distribution, nucleotide diversity, the number of haplotypes, and maximum pairwise distance. Our results showed that mismatch distributions derived from subsamples of ≥20 individuals usually bore a close resemblance to that of the full dataset. Estimates of nucleotide diversity from subsamples of ≥20 individuals tended to be bell‐shaped around that of the full dataset, whereas estimates from smaller subsamples were not. As expected, greater sampling generally led to an increase in the number of haplotypes. We also found that subsamples of ≥20 individuals allowed a good estimate of the maximum pairwise distance of the full dataset, while smaller ones were associated with a high probability of underestimation. Overall, our study confirms the expectation that larger samples are beneficial for the efficacy of DNA barcoding and suggests that a minimum sample size of 20 individuals is needed in practice for each population.
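The four estimators named in the description can be illustrated with a minimal Python sketch. This is not the authors' simulation code: the function names, the toy sequences, the use of uncorrected p-distances, and subsampling without replacement are all assumptions made for illustration, since the record does not describe the actual procedure.

```python
import random
from collections import Counter
from itertools import combinations


def estimators(seqs):
    """Compute four polymorphism summaries for aligned, equal-length sequences.

    Distances are uncorrected (raw site differences); this is an assumption,
    not necessarily the distance model used in the study.
    """
    length = len(seqs[0])
    # Number of differing sites for every unordered pair of individuals.
    diffs = [sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)]
    return {
        # Mismatch distribution: how many pairs differ at k sites.
        "mismatch": Counter(diffs),
        # Nucleotide diversity: mean pairwise differences per site.
        "nucleotide_diversity": sum(diffs) / (len(diffs) * length),
        # Number of distinct haplotypes in the sample.
        "haplotypes": len(set(seqs)),
        # Maximum pairwise distance, as a proportion of sites.
        "max_pairwise_distance": max(diffs) / length,
    }


def subsample_estimates(seqs, size, replicates, seed=0):
    """Draw random subsamples (without replacement) and summarize each one."""
    rng = random.Random(seed)
    return [estimators(rng.sample(seqs, size)) for _ in range(replicates)]


# Hypothetical toy "full dataset": 10 individuals, 3 haplotypes.
toy = ["AAAA"] * 6 + ["AAAT"] * 3 + ["AATT"]

full = estimators(toy)
small = subsample_estimates(toy, size=3, replicates=100)

# Fraction of small subsamples that underestimate the full maximum distance.
under = sum(
    s["max_pairwise_distance"] < full["max_pairwise_distance"] for s in small
) / len(small)
print(full)
print("share of size-3 subsamples underestimating the maximum distance:", under)
```

A subsample can never contain more haplotypes or a larger maximum pairwise distance than the dataset it is drawn from, which is why small subsamples tend to underestimate both quantities.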
format Online
Article
Text
id pubmed-4717336
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-4717336 2016-01-25 A simulation study of sample size for DNA barcoding Luo, Arong Lan, Haiqiang Ling, Cheng Zhang, Aibing Shi, Lei Ho, Simon Y. W. Zhu, Chaodong Ecol Evol Original Research John Wiley and Sons Inc. 2015-12-01 /pmc/articles/PMC4717336/ /pubmed/26811761 http://dx.doi.org/10.1002/ece3.1846 Text en © 2015 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd.
This is an open access article under the terms of the Creative Commons Attribution (http://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title A simulation study of sample size for DNA barcoding
topic Original Research