Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem
High-throughput DNA sequencing is becoming an increasingly important tool to monitor and better understand biodiversity responses to environmental changes in a standardized and reproducible way. Environmental DNA (eDNA) from organisms can be captured in ecosystem samples and sequenced using metabarc...
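Main authors: | Flück, Benjamin; Mathon, Laëtitia; Manel, Stéphanie; Valentini, Alice; Dejean, Tony; Albouy, Camille; Mouillot, David; Thuiller, Wilfried; Murienne, Jérôme; Brosse, Sébastien; Pellissier, Loïc |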
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9205931/ https://www.ncbi.nlm.nih.gov/pubmed/35715444 http://dx.doi.org/10.1038/s41598-022-13412-w |
_version_ | 1784729233641177088 |
---|---|
author | Flück, Benjamin Mathon, Laëtitia Manel, Stéphanie Valentini, Alice Dejean, Tony Albouy, Camille Mouillot, David Thuiller, Wilfried Murienne, Jérôme Brosse, Sébastien Pellissier, Loïc |
author_facet | Flück, Benjamin Mathon, Laëtitia Manel, Stéphanie Valentini, Alice Dejean, Tony Albouy, Camille Mouillot, David Thuiller, Wilfried Murienne, Jérôme Brosse, Sébastien Pellissier, Loïc |
author_sort | Flück, Benjamin |
collection | PubMed |
description | High-throughput DNA sequencing is becoming an increasingly important tool to monitor and better understand biodiversity responses to environmental changes in a standardized and reproducible way. Environmental DNA (eDNA) from organisms can be captured in ecosystem samples and sequenced using metabarcoding, but processing large volumes of eDNA data and annotating sequences to recognized taxa remains computationally expensive. Speed and accuracy are two major bottlenecks in this critical step. Here, we evaluated the ability of convolutional neural networks (CNNs) to process short eDNA sequences and associate them with taxonomic labels. Using a unique eDNA data set collected in highly diverse Tropical South America, we compared the speed and accuracy of CNNs with that of a well-known bioinformatic pipeline (OBITools) in processing a small region (60 bp) of the 12S ribosomal DNA targeting freshwater fishes. We found that the taxonomic labels from the CNNs were comparable to those from OBITools, with high correlation levels for the composition of the regional fish fauna. The CNNs enabled the processing of raw fastq files at a rate of approximately 1 million sequences per minute, which was about 150 times faster than with OBITools. Given the good performance of CNNs in the highly diverse ecosystem considered here, the development of more elaborate CNNs promises fast deployment for future biodiversity inventories using eDNA. |
format | Online Article Text |
id | pubmed-9205931 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-92059312022-06-19 Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem Flück, Benjamin Mathon, Laëtitia Manel, Stéphanie Valentini, Alice Dejean, Tony Albouy, Camille Mouillot, David Thuiller, Wilfried Murienne, Jérôme Brosse, Sébastien Pellissier, Loïc Sci Rep Article High-throughput DNA sequencing is becoming an increasingly important tool to monitor and better understand biodiversity responses to environmental changes in a standardized and reproducible way. Environmental DNA (eDNA) from organisms can be captured in ecosystem samples and sequenced using metabarcoding, but processing large volumes of eDNA data and annotating sequences to recognized taxa remains computationally expensive. Speed and accuracy are two major bottlenecks in this critical step. Here, we evaluated the ability of convolutional neural networks (CNNs) to process short eDNA sequences and associate them with taxonomic labels. Using a unique eDNA data set collected in highly diverse Tropical South America, we compared the speed and accuracy of CNNs with that of a well-known bioinformatic pipeline (OBITools) in processing a small region (60 bp) of the 12S ribosomal DNA targeting freshwater fishes. We found that the taxonomic labels from the CNNs were comparable to those from OBITools, with high correlation levels for the composition of the regional fish fauna. The CNNs enabled the processing of raw fastq files at a rate of approximately 1 million sequences per minute, which was about 150 times faster than with OBITools. Given the good performance of CNNs in the highly diverse ecosystem considered here, the development of more elaborate CNNs promises fast deployment for future biodiversity inventories using eDNA. 
Nature Publishing Group UK 2022-06-17 /pmc/articles/PMC9205931/ /pubmed/35715444 http://dx.doi.org/10.1038/s41598-022-13412-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Flück, Benjamin Mathon, Laëtitia Manel, Stéphanie Valentini, Alice Dejean, Tony Albouy, Camille Mouillot, David Thuiller, Wilfried Murienne, Jérôme Brosse, Sébastien Pellissier, Loïc Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem |
title | Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem |
title_full | Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem |
title_fullStr | Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem |
title_full_unstemmed | Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem |
title_short | Applying convolutional neural networks to speed up environmental DNA annotation in a highly diverse ecosystem |
title_sort | applying convolutional neural networks to speed up environmental dna annotation in a highly diverse ecosystem |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9205931/ https://www.ncbi.nlm.nih.gov/pubmed/35715444 http://dx.doi.org/10.1038/s41598-022-13412-w |
work_keys_str_mv | AT fluckbenjamin applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT mathonlaetitia applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT manelstephanie applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT valentinialice applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT dejeantony applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT albouycamille applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT mouillotdavid applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT thuillerwilfried applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT muriennejerome applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT brossesebastien applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem AT pellissierloic applyingconvolutionalneuralnetworkstospeedupenvironmentaldnaannotationinahighlydiverseecosystem |