StrainFLAIR: strain-level profiling of metagenomic samples using variation graphs
Main Authors: | Da Silva, Kévin; Pons, Nicolas; Berland, Magali; Plaza Oñate, Florian; Almeida, Mathieu; Peterlongo, Pierre |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2021 |
Subjects: | Bioinformatics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8388557/ https://www.ncbi.nlm.nih.gov/pubmed/34513324 http://dx.doi.org/10.7717/peerj.11884 |
_version_ | 1783742670261190656 |
---|---|
author | Da Silva, Kévin Pons, Nicolas Berland, Magali Plaza Oñate, Florian Almeida, Mathieu Peterlongo, Pierre |
author_sort | Da Silva, Kévin |
collection | PubMed |
description | Current studies are shifting from the use of single linear references to representation of multiple genomes organised in pangenome graphs or variation graphs. Meanwhile, in metagenomic samples, resolving strain-level abundances is a major step in microbiome studies, as associations between strain variants and phenotype are of great interest for diagnostic and therapeutic purposes. We developed StrainFLAIR with the aim of showing the feasibility of using variation graphs for indexing highly similar genomic sequences up to the strain level, and for characterizing a set of unknown sequenced genomes by querying this graph. On simulated data composed of mixtures of strains from the same bacterial species Escherichia coli, results show that StrainFLAIR was able to distinguish and estimate the abundances of close strains, as well as to highlight the presence of a new strain close to a referenced one and to estimate its abundance. On a real dataset composed of a mix of several bacterial species and several strains for the same species, results show that in a more complex configuration StrainFLAIR correctly estimates the abundance of each strain. Hence, results demonstrated how graph representation of multiple close genomes can be used as a reference to characterize a sample at the strain level. |
format | Online Article Text |
id | pubmed-8388557 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8388557 2021-09-09 StrainFLAIR: strain-level profiling of metagenomic samples using variation graphs Da Silva, Kévin Pons, Nicolas Berland, Magali Plaza Oñate, Florian Almeida, Mathieu Peterlongo, Pierre PeerJ Bioinformatics PeerJ Inc. 2021-08-23 /pmc/articles/PMC8388557/ /pubmed/34513324 http://dx.doi.org/10.7717/peerj.11884 Text en © 2021 Da Silva et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
title | StrainFLAIR: strain-level profiling of metagenomic samples using variation graphs |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8388557/ https://www.ncbi.nlm.nih.gov/pubmed/34513324 http://dx.doi.org/10.7717/peerj.11884 |