PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning

MOTIVATION: The analysis of bacterial isolates to detect plasmids is important due to their role in the propagation of antimicrobial resistance. In short-read sequence assemblies, both plasmids and bacterial chromosomes are typically split into several contigs of various lengths, making identificati...

Bibliographic Details
Main authors: Mane, Aniket; Faizrahnemoon, Mahsa; Vinař, Tomáš; Brejová, Broňa; Chauve, Cedric
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Genome Sequence Analysis
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311310/
https://www.ncbi.nlm.nih.gov/pubmed/37387134
http://dx.doi.org/10.1093/bioinformatics/btad250
author Mane, Aniket
Faizrahnemoon, Mahsa
Vinař, Tomáš
Brejová, Broňa
Chauve, Cedric
collection PubMed
description MOTIVATION: The analysis of bacterial isolates to detect plasmids is important due to their role in the propagation of antimicrobial resistance. In short-read sequence assemblies, both plasmids and bacterial chromosomes are typically split into several contigs of various lengths, making identification of plasmids a challenging problem. In plasmid contig binning, the goal is to distinguish short-read assembly contigs based on their origin into plasmid and chromosomal contigs and subsequently sort plasmid contigs into bins, each bin corresponding to a single plasmid. Previous works on this problem consist of de novo approaches and reference-based approaches. De novo methods rely on contig features such as length, circularity, read coverage, or GC content. Reference-based approaches compare contigs to databases of known plasmids or plasmid markers from finished bacterial genomes. RESULTS: Recent developments suggest that leveraging information contained in the assembly graph improves the accuracy of plasmid binning. We present PlasBin-flow, a hybrid method that defines contig bins as subgraphs of the assembly graph. PlasBin-flow identifies such plasmid subgraphs through a mixed integer linear programming model that relies on the concept of network flow to account for sequencing coverage, while also accounting for the presence of plasmid genes and the GC content that often distinguishes plasmids from chromosomes. We demonstrate the performance of PlasBin-flow on a real dataset of bacterial samples. AVAILABILITY AND IMPLEMENTATION: https://github.com/cchauve/PlasBin-flow.
format Online
Article
Text
id pubmed-10311310
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10311310 2023-07-01 PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning. Mane, Aniket; Faizrahnemoon, Mahsa; Vinař, Tomáš; Brejová, Broňa; Chauve, Cedric. Bioinformatics. Genome Sequence Analysis. Oxford University Press 2023-06-30. /pmc/articles/PMC10311310/ /pubmed/37387134 http://dx.doi.org/10.1093/bioinformatics/btad250 Text en
© The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning
topic Genome Sequence Analysis