plASgraph2: using graph neural networks to detect plasmid contigs from an assembly graph

Identification of plasmids from sequencing data is an important and challenging problem related to antimicrobial resistance spread and other One-Health issues. We provide a new architecture for identifying plasmid contigs in fragmented genome assemblies built from short-read data. We employ graph ne...

Bibliographic Details
Main authors: Sielemann, Janik, Sielemann, Katharina, Brejová, Broňa, Vinař, Tomáš, Chauve, Cedric
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10587606/
https://www.ncbi.nlm.nih.gov/pubmed/37869681
http://dx.doi.org/10.3389/fmicb.2023.1267695
author Sielemann, Janik
Sielemann, Katharina
Brejová, Broňa
Vinař, Tomáš
Chauve, Cedric
collection PubMed
description Identification of plasmids from sequencing data is an important and challenging problem related to antimicrobial resistance spread and other One-Health issues. We provide a new architecture for identifying plasmid contigs in fragmented genome assemblies built from short-read data. We employ graph neural networks (GNNs) and the assembly graph to propagate the information from nearby nodes, which leads to more accurate classification, especially for short contigs that are difficult to classify based on sequence features or database searches alone. We trained plASgraph2 on a data set of samples from the ESKAPEE group of pathogens. plASgraph2 either outperforms or performs on par with a wide range of state-of-the-art methods on testing sets of independent ESKAPEE samples and samples from related pathogens. On one hand, our study provides a new accurate and easy to use tool for contig classification in bacterial isolates; on the other hand, it serves as a proof-of-concept for the use of GNNs in genomics. Our software is available at https://github.com/cchauve/plasgraph2 and the training and testing data sets are available at https://github.com/fmfi-compbio/plasgraph2-datasets.
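The idea of propagating information from nearby nodes of the assembly graph can be illustrated with a toy sketch. The code below is not the plASgraph2 model or its API: it uses fixed-weight neighbour averaging, which a trained GNN layer generalises with learned weights and richer node features. The function name `propagate`, the `self_weight` parameter, the contig names, and the scores are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy example: NOT the plASgraph2 model or its API.
# It shows plain neighbour averaging over an assembly graph, a simplified
# stand-in for the learned message passing that a GNN performs.
from collections import defaultdict


def propagate(scores, edges, self_weight=0.5, rounds=1):
    """Blend each contig's plasmid score with the mean score of its neighbours."""
    neighbours = defaultdict(set)
    for a, b in edges:  # assembly-graph links are treated as undirected here
        neighbours[a].add(b)
        neighbours[b].add(a)

    current = dict(scores)
    for _ in range(rounds):
        updated = {}
        for node, own in current.items():
            adjacent = neighbours.get(node)
            if not adjacent:
                # Isolated contigs keep their own score unchanged.
                updated[node] = own
                continue
            mean_adj = sum(current[n] for n in adjacent) / len(adjacent)
            updated[node] = self_weight * own + (1 - self_weight) * mean_adj
        current = updated
    return current


# A short contig with an uninformative score (0.5) that is linked to two
# plasmid-like contigs; "chrom" has no links in this toy graph.
initial = {"p1": 0.9, "p2": 0.8, "short": 0.5, "chrom": 0.1}
links = [("p1", "short"), ("p2", "short")]
smoothed = propagate(initial, links)
```

In this toy graph the short contig's score moves toward that of its plasmid-like neighbours, while the unlinked contig is unchanged. This mirrors, in a very reduced form, why graph context can help classify short contigs whose own sequence features are weak.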
format Online
Article
Text
id pubmed-10587606
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10587606 2023-10-21 Front Microbiol Microbiology Frontiers Media S.A. 2023-10-06 /pmc/articles/PMC10587606/ /pubmed/37869681 http://dx.doi.org/10.3389/fmicb.2023.1267695 Text en Copyright © 2023 Sielemann, Sielemann, Brejová, Vinař and Chauve. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title plASgraph2: using graph neural networks to detect plasmid contigs from an assembly graph
topic Microbiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10587606/
https://www.ncbi.nlm.nih.gov/pubmed/37869681
http://dx.doi.org/10.3389/fmicb.2023.1267695