Categorization of species as native or nonnative using DNA sequence signatures without a complete reference library
New genetic diagnostic approaches have greatly aided efforts to document global biodiversity and improve biosecurity. This is especially true for organismal groups in which species diversity has been underestimated historically due to difficulties associated with sampling, the lack of clear morpholo...
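Main authors: | Andersen, Jeremy C.; Oboyski, Peter; Davies, Neil; Charlat, Sylvain; Ewing, Curtis; Meyer, Christopher; Krehenwinkel, Henrik; Lim, Jun Ying; Noriyuki, Suzuki; Ramage, Thibault; Gillespie, Rosemary G.; Roderick, George K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7079013/ https://www.ncbi.nlm.nih.gov/pubmed/31050090 http://dx.doi.org/10.1002/eap.1914 |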
_version_ | 1783507740004450304 |
---|---|
author | Andersen, Jeremy C. Oboyski, Peter Davies, Neil Charlat, Sylvain Ewing, Curtis Meyer, Christopher Krehenwinkel, Henrik Lim, Jun Ying Noriyuki, Suzuki Ramage, Thibault Gillespie, Rosemary G. Roderick, George K. |
author_sort | Andersen, Jeremy C. |
collection | PubMed |
description | New genetic diagnostic approaches have greatly aided efforts to document global biodiversity and improve biosecurity. This is especially true for organismal groups in which species diversity has been underestimated historically due to difficulties associated with sampling, the lack of clear morphological characteristics, and/or limited availability of taxonomic expertise. Among these methods, DNA sequence barcoding (also known as “DNA barcoding”) and by extension, meta‐barcoding for biological communities, has emerged as one of the most frequently utilized methods for DNA‐based species identifications. Unfortunately, the use of DNA barcoding is limited by the availability of complete reference libraries (i.e., a collection of DNA sequences from morphologically identified species), and by the fact that the vast majority of species do not have sequences present in reference databases. Such conditions are critical especially in tropical locations that are simultaneously biodiversity rich and suffer from a lack of exploration and DNA characterization by trained taxonomic specialists. To facilitate efforts to document biodiversity in regions lacking complete reference libraries, we developed a novel statistical approach that categorizes unidentified species as being either likely native or likely nonnative based solely on measures of nucleotide diversity. We demonstrate the utility of this approach by categorizing a large sample of specimens of terrestrial insects and spiders (collected as part of the Moorea BioCode project) using a generalized linear mixed model (GLMM). Using a training data set of known endemic (n = 45) and known introduced species (n = 102), we then estimated the likely native/nonnative status for 4,663 specimens representing an estimated 1,288 species (412 identified species), including both those specimens that were either unidentified or whose endemic/introduced status was uncertain. 
Using this approach, we were able to increase the number of categorized specimens by a factor of 4.4 (from 794 to 3,497), and the number of categorized species by a factor of 4.8 (from 147 to 707) at a rate much greater than chance (77.6% accuracy). The study identifies phylogenetic signatures of both native and nonnative species and suggests several practical applications for this approach including monitoring biodiversity and facilitating biosecurity. |
format | Online Article Text |
id | pubmed-7079013 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-70790132020-03-19 Categorization of species as native or nonnative using DNA sequence signatures without a complete reference library Andersen, Jeremy C. Oboyski, Peter Davies, Neil Charlat, Sylvain Ewing, Curtis Meyer, Christopher Krehenwinkel, Henrik Lim, Jun Ying Noriyuki, Suzuki Ramage, Thibault Gillespie, Rosemary G. Roderick, George K. Ecol Appl Articles New genetic diagnostic approaches have greatly aided efforts to document global biodiversity and improve biosecurity. This is especially true for organismal groups in which species diversity has been underestimated historically due to difficulties associated with sampling, the lack of clear morphological characteristics, and/or limited availability of taxonomic expertise. Among these methods, DNA sequence barcoding (also known as “DNA barcoding”) and by extension, meta‐barcoding for biological communities, has emerged as one of the most frequently utilized methods for DNA‐based species identifications. Unfortunately, the use of DNA barcoding is limited by the availability of complete reference libraries (i.e., a collection of DNA sequences from morphologically identified species), and by the fact that the vast majority of species do not have sequences present in reference databases. Such conditions are critical especially in tropical locations that are simultaneously biodiversity rich and suffer from a lack of exploration and DNA characterization by trained taxonomic specialists. To facilitate efforts to document biodiversity in regions lacking complete reference libraries, we developed a novel statistical approach that categorizes unidentified species as being either likely native or likely nonnative based solely on measures of nucleotide diversity. We demonstrate the utility of this approach by categorizing a large sample of specimens of terrestrial insects and spiders (collected as part of the Moorea BioCode project) using a generalized linear mixed model (GLMM). 
Using a training data set of known endemic (n = 45) and known introduced species (n = 102), we then estimated the likely native/nonnative status for 4,663 specimens representing an estimated 1,288 species (412 identified species), including both those specimens that were either unidentified or whose endemic/introduced status was uncertain. Using this approach, we were able to increase the number of categorized specimens by a factor of 4.4 (from 794 to 3,497), and the number of categorized species by a factor of 4.8 (from 147 to 707) at a rate much greater than chance (77.6% accuracy). The study identifies phylogenetic signatures of both native and nonnative species and suggests several practical applications for this approach including monitoring biodiversity and facilitating biosecurity. John Wiley and Sons Inc. 2019-06-12 2019-07 /pmc/articles/PMC7079013/ /pubmed/31050090 http://dx.doi.org/10.1002/eap.1914 Text en © 2019 The Authors. Ecological Applications published by Wiley Periodicals, Inc. on behalf of Ecological Society of America This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
title | Categorization of species as native or nonnative using DNA sequence signatures without a complete reference library |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7079013/ https://www.ncbi.nlm.nih.gov/pubmed/31050090 http://dx.doi.org/10.1002/eap.1914 |
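
The abstract in this record says species were categorized from measures of nucleotide diversity, but it does not state which measures were used or how they were computed. As a rough illustration only, the sketch below computes one common measure, average pairwise nucleotide diversity (π), for a set of aligned sequences, and re-checks the fold increases quoted in the abstract. The function names and the toy sequences are hypothetical and are not taken from the paper, and the GLMM step is not reproduced here.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def pairwise_difference(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    (Hypothetical helper for illustration; not from the paper.)
    """
    compared = 0
    differing = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in VALID_BASES and base_b in VALID_BASES:
            compared += 1
            if base_a != base_b:
                differing += 1
    return differing / compared if compared else 0.0


def nucleotide_diversity(sequences):
    """Average pairwise difference per site (pi) across all sequence pairs."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (made-up data): three sequences, one variable site.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")

# Fold increases reported in the abstract, recomputed from its own counts.
specimen_fold = 3497 / 794  # categorized specimens: 794 -> 3,497
species_fold = 707 / 147    # categorized species: 147 -> 707
print(f"specimens: x{specimen_fold:.1f}, species: x{species_fold:.1f}")
```

A low π within a sampled population is the kind of signal one might expect from a recently introduced species, but how such measures enter the authors' model is an assumption here and should be checked against the full article.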