StrainFLAIR: strain-level profiling of metagenomic samples using variation graphs

Current studies are shifting from the use of single linear references to representation of multiple genomes organised in pangenome graphs or variation graphs. Meanwhile, in metagenomic samples, resolving strain-level abundances is a major step in microbiome studies, as associations between strain va...

Bibliographic Details
Main authors: Da Silva, Kévin; Pons, Nicolas; Berland, Magali; Plaza Oñate, Florian; Almeida, Mathieu; Peterlongo, Pierre
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8388557/
https://www.ncbi.nlm.nih.gov/pubmed/34513324
http://dx.doi.org/10.7717/peerj.11884
_version_ 1783742670261190656
author Da Silva, Kévin
Pons, Nicolas
Berland, Magali
Plaza Oñate, Florian
Almeida, Mathieu
Peterlongo, Pierre
author_facet Da Silva, Kévin
Pons, Nicolas
Berland, Magali
Plaza Oñate, Florian
Almeida, Mathieu
Peterlongo, Pierre
author_sort Da Silva, Kévin
collection PubMed
description Current studies are shifting from the use of single linear references to representation of multiple genomes organised in pangenome graphs or variation graphs. Meanwhile, in metagenomic samples, resolving strain-level abundances is a major step in microbiome studies, as associations between strain variants and phenotype are of great interest for diagnostic and therapeutic purposes. We developed StrainFLAIR with the aim of showing the feasibility of using variation graphs for indexing highly similar genomic sequences up to the strain level, and for characterizing a set of unknown sequenced genomes by querying this graph. On simulated data composed of mixtures of strains from the same bacterial species Escherichia coli, results show that StrainFLAIR was able to distinguish and estimate the abundances of close strains, as well as to highlight the presence of a new strain close to a referenced one and to estimate its abundance. On a real dataset composed of a mix of several bacterial species and several strains for the same species, results show that in a more complex configuration StrainFLAIR correctly estimates the abundance of each strain. Hence, results demonstrated how graph representation of multiple close genomes can be used as a reference to characterize a sample at the strain level.
format Online
Article
Text
id pubmed-8388557
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-83885572021-09-09 StrainFLAIR: strain-level profiling of metagenomic samples using variation graphs Da Silva, Kévin Pons, Nicolas Berland, Magali Plaza Oñate, Florian Almeida, Mathieu Peterlongo, Pierre PeerJ Bioinformatics Current studies are shifting from the use of single linear references to representation of multiple genomes organised in pangenome graphs or variation graphs. Meanwhile, in metagenomic samples, resolving strain-level abundances is a major step in microbiome studies, as associations between strain variants and phenotype are of great interest for diagnostic and therapeutic purposes. We developed StrainFLAIR with the aim of showing the feasibility of using variation graphs for indexing highly similar genomic sequences up to the strain level, and for characterizing a set of unknown sequenced genomes by querying this graph. On simulated data composed of mixtures of strains from the same bacterial species Escherichia coli, results show that StrainFLAIR was able to distinguish and estimate the abundances of close strains, as well as to highlight the presence of a new strain close to a referenced one and to estimate its abundance. On a real dataset composed of a mix of several bacterial species and several strains for the same species, results show that in a more complex configuration StrainFLAIR correctly estimates the abundance of each strain. Hence, results demonstrated how graph representation of multiple close genomes can be used as a reference to characterize a sample at the strain level. PeerJ Inc. 2021-08-23 /pmc/articles/PMC8388557/ /pubmed/34513324 http://dx.doi.org/10.7717/peerj.11884 Text en © 2021 Da Silva et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Da Silva, Kévin
Pons, Nicolas
Berland, Magali
Plaza Oñate, Florian
Almeida, Mathieu
Peterlongo, Pierre
StrainFLAIR: strain-level profiling of metagenomic samples using variation graphs
title StrainFLAIR: strain-level profiling of metagenomic samples using variation graphs
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8388557/
https://www.ncbi.nlm.nih.gov/pubmed/34513324
http://dx.doi.org/10.7717/peerj.11884