
METAnnotatorX2: a Comprehensive Tool for Deep and Shallow Metagenomic Data Set Analyses

The use of bioinformatic tools for read-based taxonomic and functional analyses of metagenomic data sets, including their assembly and management, is rather fragmentary due to the absence of an accepted gold standard. Moreover, most currently available software tools need input of millions of reads...


Bibliographic Details
Main authors: Milani, Christian, Lugli, Gabriele Andrea, Fontana, Federico, Mancabelli, Leonardo, Alessandri, Giulia, Longhi, Giulia, Anzalone, Rosaria, Viappiani, Alice, Turroni, Francesca, van Sinderen, Douwe, Ventura, Marco
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2021
Subjects: Methods and Protocols
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8269244/
https://www.ncbi.nlm.nih.gov/pubmed/34184911
http://dx.doi.org/10.1128/mSystems.00583-21
author Milani, Christian
Lugli, Gabriele Andrea
Fontana, Federico
Mancabelli, Leonardo
Alessandri, Giulia
Longhi, Giulia
Anzalone, Rosaria
Viappiani, Alice
Turroni, Francesca
van Sinderen, Douwe
Ventura, Marco
collection PubMed
description The use of bioinformatic tools for read-based taxonomic and functional analyses of metagenomic data sets, including their assembly and management, is rather fragmentary due to the absence of an accepted gold standard. Moreover, most currently available software tools need input of millions of reads and rely on approximations in data analysis in order to reduce computing times. These issues result in suboptimal results in terms of accuracy, sensitivity, and specificity when used either for the reconstruction of taxonomic or functional profiles through read analysis or analysis of genomes reconstructed by metagenomic assembly. Moreover, the recent introduction of novel DNA sequencing technologies that generate long reads, such as Nanopore and PacBio, represent a valuable data resource that still suffers from a lack of dedicated tools to perform integrated hybrid analysis alongside short read data. In order to overcome these limitations, here we describe a comprehensive bioinformatic platform, METAnnotatorX2, aimed at providing an optimized user-friendly resource which maximizes output quality, while also allowing user-specific adaptation of the pipeline and straightforward integrated analysis of both short and long read data. To further improve performance quality and accuracy of taxonomic assignment of reads and contigs, custom preprocessed and taxonomically revised genomic databases for viruses, prokaryotes, and various eukaryotes were developed. The performance of METAnnotatorX2 was tested by analysis of artificial data sets encompassing viral, archaeal, bacterial, and eukaryotic (fungal) sequence reads that simulate different biological matrices. Moreover, real biological samples were employed to validate in silico results. 
IMPORTANCE We developed a novel tool, i.e., METAnnotatorX2, that includes a number of new advanced features for analysis of deep and shallow metagenomic data sets and is accompanied by (regularly updated) customized databases for archaea, bacteria, fungi, protists, and viruses. Both software and databases were developed so as to maximize sensitivity and specificity while including support for shallow metagenomic data sets. Through extensive tests performed on Illumina and Nanopore artificial data sets, we demonstrated the high performance of the software to not only extract taxonomic and functional information from sequence reads but also to assemble and process genomes from metagenomic data. The robustness of these functionalities was validated using “real-life” data sets obtained from Illumina and Nanopore sequencing of biological samples. Furthermore, the performance of METAnnotatorX2 was compared to other available software tools for analysis of shotgun metagenomics data.
format Online
Article
Text
id pubmed-8269244
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-8269244 2021-08-02 mSystems Methods and Protocols American Society for Microbiology 2021-06-29 /pmc/articles/PMC8269244/ /pubmed/34184911 http://dx.doi.org/10.1128/mSystems.00583-21 Text en Copyright © 2021 Milani et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title METAnnotatorX2: a Comprehensive Tool for Deep and Shallow Metagenomic Data Set Analyses
topic Methods and Protocols
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8269244/
https://www.ncbi.nlm.nih.gov/pubmed/34184911
http://dx.doi.org/10.1128/mSystems.00583-21