Parallel-META: efficient metagenomic data analysis based on high-performance computation

Bibliographic Details
Main Authors:	Su, Xiaoquan; Xu, Jian; Ning, Kang
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2012
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3403166/ https://www.ncbi.nlm.nih.gov/pubmed/23046922 http://dx.doi.org/10.1186/1752-0509-6-S1-S16

_version_	1782238847566872576
author	Su, Xiaoquan; Xu, Jian; Ning, Kang
author_facet	Su, Xiaoquan; Xu, Jian; Ning, Kang
author_sort	Su, Xiaoquan
collection	PubMed
description	BACKGROUND: Metagenomics directly sequences and analyses genome information from microbial communities. A single community usually contains hundreds or more genomes from different microbial species, and the main computational tasks in metagenomic data analysis are the taxonomical and functional component examination of all genomes in the microbial community. Metagenomic data analysis is both data- and computation-intensive and requires extensive computational power. Most current metagenomic data analysis software was designed to be used on a single computer or a single computer cluster, which cannot keep up with the fast-growing computational requirements of large metagenomic projects. Therefore, advanced computational methods and pipelines have to be developed to meet this need for efficient analyses. RESULT: In this paper, we proposed Parallel-META, a GPU- and multi-core-CPU-based open-source pipeline for metagenomic data analysis, which enables the efficient and parallel analysis of multiple metagenomic datasets and the visualization of the results for multiple samples. In Parallel-META, the similarity-based database search was parallelized using GPU computing and multi-core CPU computing optimization. Experiments have shown that Parallel-META achieves a speed-up of at least 15 times compared to the traditional metagenomic data analysis method, with the same accuracy of results (http://www.computationalbioenergy.org/parallel-meta.html). CONCLUSION: Parallel processing of current metagenomic data is very promising: with the current speed-up of 15 times and above, binning would no longer be a very time-consuming process. Therefore, deeper analyses of the metagenomic data, such as the comparison of different samples, would be feasible in the pipeline, and some of these functionalities have been included in the Parallel-META pipeline.
format	Online Article Text
id	pubmed-3403166
institution	National Center for Biotechnology Information
language	English
publishDate	2012
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3403166 2012-07-25 Parallel-META: efficient metagenomic data analysis based on high-performance computation. Su, Xiaoquan; Xu, Jian; Ning, Kang. BMC Syst Biol. Research. BioMed Central 2012-07-16 /pmc/articles/PMC3403166/ /pubmed/23046922 http://dx.doi.org/10.1186/1752-0509-6-S1-S16 Text en Copyright ©2012 Su et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Parallel-META: efficient metagenomic data analysis based on high-performance computation
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3403166/ https://www.ncbi.nlm.nih.gov/pubmed/23046922 http://dx.doi.org/10.1186/1752-0509-6-S1-S16
