Critical assessment of pan-genomic analysis of metagenome-assembled genomes
Pan-genome analyses of metagenome-assembled genomes (MAGs) may suffer from the known issues with MAGs: fragmentation, incompleteness and contamination. Here, we conducted a critical assessment of pan-genomics of MAGs, by comparing pan-genome analysis results of complete bacterial genomes and simulat...
Main Authors: | Li, Tang; Yin, Yanbin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Problem Solving Protocol |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9677465/ https://www.ncbi.nlm.nih.gov/pubmed/36124775 http://dx.doi.org/10.1093/bib/bbac413 |
_version_ | 1784833816997658624 |
---|---|
author | Li, Tang Yin, Yanbin |
author_facet | Li, Tang Yin, Yanbin |
author_sort | Li, Tang |
collection | PubMed |
description | Pan-genome analyses of metagenome-assembled genomes (MAGs) may suffer from the known issues with MAGs: fragmentation, incompleteness and contamination. Here, we conducted a critical assessment of pan-genomics of MAGs, by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs. We found that incompleteness led to significant core gene (CG) loss. The CG loss remained when using different pan-genome analysis tools (Roary, BPGA, Anvi’o) and when using a mixture of MAGs and complete genomes. Contamination had little effect on core genome size (except for Roary, due to its gene clustering issue) but had a major influence on accessory genomes. Importantly, the CG loss was partially alleviated by lowering the CG threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The CG loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees. Our main findings were supported by a study of real MAG-isolate genome data. We conclude that lowering the CG threshold and predicting genes in metagenome mode (as Anvi’o does with Prodigal) are necessary in pan-genome analysis of MAGs. Development of new pan-genome analysis tools specifically for MAGs is needed in future studies. |
format | Online Article Text |
id | pubmed-9677465 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-96774652022-11-21 Critical assessment of pan-genomic analysis of metagenome-assembled genomes Li, Tang Yin, Yanbin Brief Bioinform Problem Solving Protocol Pan-genome analyses of metagenome-assembled genomes (MAGs) may suffer from the known issues with MAGs: fragmentation, incompleteness and contamination. Here, we conducted a critical assessment of pan-genomics of MAGs, by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs. We found that incompleteness led to significant core gene (CG) loss. The CG loss remained when using different pan-genome analysis tools (Roary, BPGA, Anvi’o) and when using a mixture of MAGs and complete genomes. Contamination had little effect on core genome size (except for Roary, due to its gene clustering issue) but had a major influence on accessory genomes. Importantly, the CG loss was partially alleviated by lowering the CG threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The CG loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees. Our main findings were supported by a study of real MAG-isolate genome data. We conclude that lowering the CG threshold and predicting genes in metagenome mode (as Anvi’o does with Prodigal) are necessary in pan-genome analysis of MAGs. Development of new pan-genome analysis tools specifically for MAGs is needed in future studies. Oxford University Press 2022-09-17 /pmc/articles/PMC9677465/ /pubmed/36124775 http://dx.doi.org/10.1093/bib/bbac413 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Problem Solving Protocol Li, Tang Yin, Yanbin Critical assessment of pan-genomic analysis of metagenome-assembled genomes |
title | Critical assessment of pan-genomic analysis of metagenome-assembled genomes |
title_full | Critical assessment of pan-genomic analysis of metagenome-assembled genomes |
title_fullStr | Critical assessment of pan-genomic analysis of metagenome-assembled genomes |
title_full_unstemmed | Critical assessment of pan-genomic analysis of metagenome-assembled genomes |
title_short | Critical assessment of pan-genomic analysis of metagenome-assembled genomes |
title_sort | critical assessment of pan-genomic analysis of metagenome-assembled genomes |
topic | Problem Solving Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9677465/ https://www.ncbi.nlm.nih.gov/pubmed/36124775 http://dx.doi.org/10.1093/bib/bbac413 |
work_keys_str_mv | AT litang criticalassessmentofpangenomicanalysisofmetagenomeassembledgenomes AT yinyanbin criticalassessmentofpangenomicanalysisofmetagenomeassembledgenomes |