PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning
MOTIVATION: The analysis of bacterial isolates to detect plasmids is important due to their role in the propagation of antimicrobial resistance. In short-read sequence assemblies, both plasmids and bacterial chromosomes are typically split into several contigs of various lengths, making identificati...
Main Authors: | Mane, Aniket; Faizrahnemoon, Mahsa; Vinař, Tomáš; Brejová, Broňa; Chauve, Cedric |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Genome Sequence Analysis |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311310/ https://www.ncbi.nlm.nih.gov/pubmed/37387134 http://dx.doi.org/10.1093/bioinformatics/btad250 |
_version_ | 1785066716060975104 |
---|---|
author | Mane, Aniket Faizrahnemoon, Mahsa Vinař, Tomáš Brejová, Broňa Chauve, Cedric |
author_facet | Mane, Aniket Faizrahnemoon, Mahsa Vinař, Tomáš Brejová, Broňa Chauve, Cedric |
author_sort | Mane, Aniket |
collection | PubMed |
description | MOTIVATION: The analysis of bacterial isolates to detect plasmids is important due to their role in the propagation of antimicrobial resistance. In short-read sequence assemblies, both plasmids and bacterial chromosomes are typically split into several contigs of various lengths, making identification of plasmids a challenging problem. In plasmid contig binning, the goal is to distinguish short-read assembly contigs based on their origin into plasmid and chromosomal contigs and subsequently sort plasmid contigs into bins, each bin corresponding to a single plasmid. Previous works on this problem consist of de novo approaches and reference-based approaches. De novo methods rely on contig features such as length, circularity, read coverage, or GC content. Reference-based approaches compare contigs to databases of known plasmids or plasmid markers from finished bacterial genomes. RESULTS: Recent developments suggest that leveraging information contained in the assembly graph improves the accuracy of plasmid binning. We present PlasBin-flow, a hybrid method that defines contig bins as subgraphs of the assembly graph. PlasBin-flow identifies such plasmid subgraphs through a mixed integer linear programming model that relies on the concept of network flow to account for sequencing coverage, while also accounting for the presence of plasmid genes and the GC content that often distinguishes plasmids from chromosomes. We demonstrate the performance of PlasBin-flow on a real dataset of bacterial samples. AVAILABILITY AND IMPLEMENTATION: https://github.com/cchauve/PlasBin-flow. |
format | Online Article Text |
id | pubmed-10311310 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-103113102023-07-01 PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning Mane, Aniket Faizrahnemoon, Mahsa Vinař, Tomáš Brejová, Broňa Chauve, Cedric Bioinformatics Genome Sequence Analysis MOTIVATION: The analysis of bacterial isolates to detect plasmids is important due to their role in the propagation of antimicrobial resistance. In short-read sequence assemblies, both plasmids and bacterial chromosomes are typically split into several contigs of various lengths, making identification of plasmids a challenging problem. In plasmid contig binning, the goal is to distinguish short-read assembly contigs based on their origin into plasmid and chromosomal contigs and subsequently sort plasmid contigs into bins, each bin corresponding to a single plasmid. Previous works on this problem consist of de novo approaches and reference-based approaches. De novo methods rely on contig features such as length, circularity, read coverage, or GC content. Reference-based approaches compare contigs to databases of known plasmids or plasmid markers from finished bacterial genomes. RESULTS: Recent developments suggest that leveraging information contained in the assembly graph improves the accuracy of plasmid binning. We present PlasBin-flow, a hybrid method that defines contig bins as subgraphs of the assembly graph. PlasBin-flow identifies such plasmid subgraphs through a mixed integer linear programming model that relies on the concept of network flow to account for sequencing coverage, while also accounting for the presence of plasmid genes and the GC content that often distinguishes plasmids from chromosomes. We demonstrate the performance of PlasBin-flow on a real dataset of bacterial samples. AVAILABILITY AND IMPLEMENTATION: https://github.com/cchauve/PlasBin-flow. Oxford University Press 2023-06-30 /pmc/articles/PMC10311310/ /pubmed/37387134 http://dx.doi.org/10.1093/bioinformatics/btad250 Text en © The Author(s) 2023. 
Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Genome Sequence Analysis Mane, Aniket Faizrahnemoon, Mahsa Vinař, Tomáš Brejová, Broňa Chauve, Cedric PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning |
title | PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning |
title_full | PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning |
title_fullStr | PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning |
title_full_unstemmed | PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning |
title_short | PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning |
title_sort | plasbin-flow: a flow-based milp algorithm for plasmid contigs binning |
topic | Genome Sequence Analysis |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10311310/ https://www.ncbi.nlm.nih.gov/pubmed/37387134 http://dx.doi.org/10.1093/bioinformatics/btad250 |
work_keys_str_mv | AT maneaniket plasbinflowaflowbasedmilpalgorithmforplasmidcontigsbinning AT faizrahnemoonmahsa plasbinflowaflowbasedmilpalgorithmforplasmidcontigsbinning AT vinartomas plasbinflowaflowbasedmilpalgorithmforplasmidcontigsbinning AT brejovabrona plasbinflowaflowbasedmilpalgorithmforplasmidcontigsbinning AT chauvecedric plasbinflowaflowbasedmilpalgorithmforplasmidcontigsbinning |