
DUGMO: tool for the detection of unknown genetically modified organisms with high-throughput sequencing data for pure bacterial samples

BACKGROUND: The European Community has adopted very restrictive policies regarding the dissemination and use of genetically modified organisms (GMOs). In fact, a maximum threshold of 0.9% of contaminating GMOs is tolerated for a “GMO-free” label. In recent years, imports of undescribed GMOs have bee...


Bibliographic Details
Main authors: Hurel, Julie, Schbath, Sophie, Bougeard, Stéphanie, Rolland, Mathieu, Petrillo, Mauro, Touzain, Fabrice
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336441/
https://www.ncbi.nlm.nih.gov/pubmed/32631215
http://dx.doi.org/10.1186/s12859-020-03611-5
_version_ 1783554320228155392
author Hurel, Julie
Schbath, Sophie
Bougeard, Stéphanie
Rolland, Mathieu
Petrillo, Mauro
Touzain, Fabrice
author_facet Hurel, Julie
Schbath, Sophie
Bougeard, Stéphanie
Rolland, Mathieu
Petrillo, Mauro
Touzain, Fabrice
author_sort Hurel, Julie
collection PubMed
description BACKGROUND: The European Community has adopted very restrictive policies regarding the dissemination and use of genetically modified organisms (GMOs). In fact, a maximum threshold of 0.9% of contaminating GMOs is tolerated for a “GMO-free” label. In recent years, imports of undescribed GMOs have been detected. Their sequences are not described and therefore not detectable by conventional approaches, such as PCR. RESULTS: We developed DUGMO, a bioinformatics pipeline for the detection of genetically modified (GM) bacteria, including unknown GM bacteria, based on Illumina paired-end sequencing data. The method is currently focused on the detection of GM bacteria with – possibly partial – transgenes in pure bacterial samples. In the preliminary steps, coding sequences (CDSs) are aligned through two successive BLASTN against the host pangenome with relevant tuned parameters to discriminate CDSs belonging to the wild type genome (wgCDS) from potential GM coding sequences (pgmCDSs). Then, Bray-Curtis distances are calculated between the wgCDS and each pgmCDS, based on the difference of genomic vocabulary. Finally, two machine learning methods, namely the Random Forest and Generalized Linear Model, are carried out to target true GM CDS(s), based on six variables including Bray-Curtis distances and GC content. Tests carried out on a GM Bacillus subtilis showed 25 positive CDSs corresponding to the chloramphenicol resistance gene and CDSs of the inserted plasmids. On a wild type B. subtilis, no false positive sequences were detected. CONCLUSION: DUGMO detects exogenous CDS, truncated, fused or highly mutated wild CDSs in high-throughput sequencing data, and was shown to be efficient at detecting GM sequences, but it might also be employed for the identification of recent horizontal gene transfers.
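The abstract names two of the six classifier variables: a Bray-Curtis distance based on "genomic vocabulary" and GC content. The sketch below illustrates how such features could be computed for one candidate CDS against a wild-type reference. It is not DUGMO's code: it assumes that "genomic vocabulary" means k-mer usage, that profiles are normalised to relative frequencies so sequences of different lengths are comparable, and that k = 4. The function names and the toy sequences are hypothetical.

```python
from collections import Counter


def kmer_frequencies(seq, k=4):
    """Relative frequency of each overlapping k-mer in seq (k=4 is an assumption)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity: sum|p_i - q_i| / sum(p_i + q_i), in [0, 1]."""
    keys = set(p) | set(q)
    num = sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)
    den = sum(p.values()) + sum(q.values())
    return num / den if den else 0.0


def gc_content(seq):
    """Fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Toy sequences, invented for illustration only.
wild_type = "ATGAAAGCGCTGATTAAAGCGCTGATTAAAGCGTAA"
candidate = "ATGCCGGGCCCGGGCCCGGGCCCGGGCTAA"

distance = bray_curtis(kmer_frequencies(wild_type), kmer_frequencies(candidate))
print(f"Bray-Curtis distance: {distance:.3f}")
print(f"GC content of candidate: {gc_content(candidate):.3f}")
```

A distance near 0 means the candidate uses k-mers much like the reference, and a distance near 1 means the two share almost none. In a pipeline like the one described, such values would be passed to the classifiers together with the remaining variables, which the abstract does not list.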
format Online
Article
Text
id pubmed-7336441
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-73364412020-07-08 DUGMO: tool for the detection of unknown genetically modified organisms with high-throughput sequencing data for pure bacterial samples Hurel, Julie Schbath, Sophie Bougeard, Stéphanie Rolland, Mathieu Petrillo, Mauro Touzain, Fabrice BMC Bioinformatics Methodology Article BACKGROUND: The European Community has adopted very restrictive policies regarding the dissemination and use of genetically modified organisms (GMOs). In fact, a maximum threshold of 0.9% of contaminating GMOs is tolerated for a “GMO-free” label. In recent years, imports of undescribed GMOs have been detected. Their sequences are not described and therefore not detectable by conventional approaches, such as PCR. RESULTS: We developed DUGMO, a bioinformatics pipeline for the detection of genetically modified (GM) bacteria, including unknown GM bacteria, based on Illumina paired-end sequencing data. The method is currently focused on the detection of GM bacteria with – possibly partial – transgenes in pure bacterial samples. In the preliminary steps, coding sequences (CDSs) are aligned through two successive BLASTN against the host pangenome with relevant tuned parameters to discriminate CDSs belonging to the wild type genome (wgCDS) from potential GM coding sequences (pgmCDSs). Then, Bray-Curtis distances are calculated between the wgCDS and each pgmCDS, based on the difference of genomic vocabulary. Finally, two machine learning methods, namely the Random Forest and Generalized Linear Model, are carried out to target true GM CDS(s), based on six variables including Bray-Curtis distances and GC content. Tests carried out on a GM Bacillus subtilis showed 25 positive CDSs corresponding to the chloramphenicol resistance gene and CDSs of the inserted plasmids. On a wild type B. subtilis, no false positive sequences were detected. 
CONCLUSION: DUGMO detects exogenous CDS, truncated, fused or highly mutated wild CDSs in high-throughput sequencing data, and was shown to be efficient at detecting GM sequences, but it might also be employed for the identification of recent horizontal gene transfers. BioMed Central 2020-07-06 /pmc/articles/PMC7336441/ /pubmed/32631215 http://dx.doi.org/10.1186/s12859-020-03611-5 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology Article
Hurel, Julie
Schbath, Sophie
Bougeard, Stéphanie
Rolland, Mathieu
Petrillo, Mauro
Touzain, Fabrice
DUGMO: tool for the detection of unknown genetically modified organisms with high-throughput sequencing data for pure bacterial samples
title DUGMO: tool for the detection of unknown genetically modified organisms with high-throughput sequencing data for pure bacterial samples
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336441/
https://www.ncbi.nlm.nih.gov/pubmed/32631215
http://dx.doi.org/10.1186/s12859-020-03611-5