Introducing an rbcL and a trnL reference library to aid in the metabarcoding analysis of foraged plants from two semi-arid eastern South African savanna bioregions

Bibliographic Details
Main Authors: Botha, Danielle; du Plessis, Mornè; Siebert, Frances; Barnard, Sandra
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10198553/
https://www.ncbi.nlm.nih.gov/pubmed/37205700
http://dx.doi.org/10.1371/journal.pone.0286144
Description
Summary: The success of a metabarcoding study is determined by the extent of taxonomic coverage and the quality of records available in the DNA barcode reference database used. This study aimed to create an rbcL and a trnL (UAA) DNA barcode sequence reference database of plant species that are potential herbivore foraging targets and commonly found in semi-arid savannas of eastern South Africa. An area-specific list of 765 species was compiled according to available plant collection records and areas comparable to an eastern semi-arid South African savanna. Thereafter, rbcL and trnL sequences of species from this list were mined from the GenBank and BOLD sequence databases according to specific quality criteria to ensure accurate taxonomic coverage and resolution. These were supplemented with sequences of 24 species sequenced for this study. A phylogenetic approach, employing Neighbor-Joining, was used to verify the topology of the reference libraries against known angiosperm phylogeny. The taxonomic reliability of these reference libraries was evaluated by testing for the presence of a barcode gap, identifying a data-appropriate identification threshold, and determining the identification accuracy of reference sequences via primary distance-based criteria. The final rbcL reference dataset consisted of 1238 sequences representing 318 genera and 562 species. The final trnL dataset consisted of 921 sequences representing 270 genera and 461 species. Barcode gaps were found for 76% of the taxa in the rbcL barcode reference dataset and 68% of the taxa in the trnL barcode reference dataset. The identification success rate, calculated with the k-nn criterion, was 85.86% for the rbcL dataset and 73.72% for the trnL dataset. The datasets for rbcL and trnL combined during this study are not presented as complete DNA reference libraries, but rather as two datasets that should be used in unison to identify plants present in the semi-arid eastern savannas of South Africa.
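
The two distance-based checks named in the summary (presence of a barcode gap, and identification success under a nearest-neighbour criterion) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the record does not state which software or distance model was used, so the uncorrected p-distance, the function names, the tie-handling rule, and the toy sequences below are all assumptions made for illustration.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """For each record, compare its furthest conspecific with its nearest non-conspecific.

    records: list of (species_name, aligned_sequence) pairs.
    Returns a list of (species_name, max_intra, min_inter, has_gap) tuples.
    A gap is reported when min_inter is strictly greater than max_intra.
    """
    results = []
    for i, (sp_i, seq_i) in enumerate(records):
        intra = [p_distance(seq_i, s)
                 for j, (sp, s) in enumerate(records) if j != i and sp == sp_i]
        inter = [p_distance(seq_i, s) for sp, s in records if sp != sp_i]
        if not intra or not inter:
            continue  # singletons cannot be assessed for a gap
        results.append((sp_i, max(intra), min(inter), min(inter) > max(intra)))
    return results


def nearest_neighbour_success(records):
    """Leave-one-out identification rate.

    Assumption of this sketch: a record counts as correctly identified only if
    every other record at the minimum distance belongs to the same species
    (ties with another species count as failures).
    """
    correct = 0
    for i, (sp_i, seq_i) in enumerate(records):
        others = [(p_distance(seq_i, s), sp)
                  for j, (sp, s) in enumerate(records) if j != i]
        nearest = min(d for d, _ in others)
        nearest_species = {sp for d, sp in others if d == nearest}
        if nearest_species == {sp_i}:
            correct += 1
    return correct / len(records)


# Hypothetical toy data: "Species C" shares a haplotype with "Species A".
records = [
    ("Species A", "ACGTACGTAC"),
    ("Species A", "ACGTACGTAT"),
    ("Species B", "ACGAACGAAC"),
    ("Species B", "ACGAACGAAT"),
    ("Species C", "ACGTACGTAT"),
    ("Species C", "TCGTACGTAT"),
]

gaps = barcode_gap(records)
gap_rate = sum(has_gap for *_, has_gap in gaps) / len(gaps)
success_rate = nearest_neighbour_success(records)
print(f"records with a barcode gap: {gap_rate:.0%}")
print(f"nearest-neighbour identification success: {success_rate:.0%}")
```

In this toy set, only the species whose sequences are not shared with, or equidistant to, another species show a gap and are identified correctly, which mirrors why the reported success rates fall short of 100% for markers with limited species-level resolution.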