Efficient Implementation of MrBayes on Multi-GPU

MrBayes, using Metropolis-coupled Markov chain Monte Carlo (MCMCMC or (MC)³), is a popular program for Bayesian inference. As a leading method of using DNA data to infer phylogeny, the (MC)³ Bayesian algorithm and its improved and parallel versions are now not fast enough for biologists to analy...

Bibliographic Details
Main Authors: Bao, Jie; Xia, Hongju; Zhou, Jianfu; Liu, Xiaoguang; Wang, Gang
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects: Resources
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3649675/
https://www.ncbi.nlm.nih.gov/pubmed/23493260
http://dx.doi.org/10.1093/molbev/mst043
author Bao, Jie
Xia, Hongju
Zhou, Jianfu
Liu, Xiaoguang
Wang, Gang
collection PubMed
description MrBayes, using Metropolis-coupled Markov chain Monte Carlo (MCMCMC or (MC)³), is a popular program for Bayesian inference. Although (MC)³ is a leading method of using DNA data to infer phylogeny, the (MC)³ Bayesian algorithm and its improved and parallel versions are no longer fast enough for biologists to analyze massive real-world DNA data. Recently, the graphics processing unit (GPU) has shown its power as a coprocessor (or rather, an accelerator) in many fields. This article describes a(MC)³ (aMCMCMC), an efficient implementation of MrBayes (MC)³ on the Compute Unified Device Architecture (CUDA). By dynamically adjusting the task granularity to adapt to input data size and hardware configuration, it makes full use of GPU cores with different data sets. An adaptive method is also developed to split and combine DNA sequences to make full use of a large number of GPU cards. Furthermore, a new “node-by-node” task scheduling strategy is developed to improve concurrency, and several optimization methods are used to reduce extra overhead. Experimental results show that a(MC)³ achieves up to 63× speedup over serial MrBayes on a single machine with one GPU card, up to 170× speedup with four GPU cards, and up to 478× speedup with a 32-node GPU cluster. a(MC)³ is dramatically faster than all previous (MC)³ algorithms and scales well to large GPU clusters.
format Online
Article
Text
id pubmed-3649675
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3649675 2013-05-13 Efficient Implementation of MrBayes on Multi-GPU Bao, Jie Xia, Hongju Zhou, Jianfu Liu, Xiaoguang Wang, Gang Mol Biol Evol Resources Oxford University Press 2013-06 2013-03-14 /pmc/articles/PMC3649675/ /pubmed/23493260 http://dx.doi.org/10.1093/molbev/mst043 Text en © The Author 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Efficient Implementation of MrBayes on Multi-GPU
topic Resources
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3649675/
https://www.ncbi.nlm.nih.gov/pubmed/23493260
http://dx.doi.org/10.1093/molbev/mst043
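
Note on the method named in the description: Metropolis-coupled MCMC runs several chains at different "heats" and periodically proposes exchanging the states of two chains. The following is a minimal CPU-side sketch of that swap step only, not the authors' GPU implementation. The function names, the incremental heating formula, and the value delta_t=0.1 are illustrative assumptions and are not taken from this record.

```python
import math
import random


def chain_heats(n_chains, delta_t=0.1):
    """Inverse temperatures for each chain (assumed incremental heating).

    Chain 0 is the cold chain with heat 1.0; higher indices are hotter,
    meaning they sample a flattened version of the posterior.
    """
    return [1.0 / (1.0 + delta_t * i) for i in range(n_chains)]


def log_swap_ratio(log_post_i, log_post_j, heat_i, heat_j):
    """Log Metropolis ratio for exchanging the states of chains i and j.

    Chain k targets posterior**heat_k, so the ratio of the joint density
    after and before the exchange reduces to
    (heat_i - heat_j) * (log_post_j - log_post_i).
    """
    return (heat_i - heat_j) * (log_post_j - log_post_i)


def try_swap(states, log_posts, heats, i, j, rng=random):
    """Exchange the states of chains i and j with the Metropolis probability.

    Returns True if the swap was accepted. The lists are updated in place.
    """
    log_r = log_swap_ratio(log_posts[i], log_posts[j], heats[i], heats[j])
    accept = log_r >= 0.0 or rng.random() < math.exp(log_r)
    if accept:
        states[i], states[j] = states[j], states[i]
        log_posts[i], log_posts[j] = log_posts[j], log_posts[i]
    return accept
```

In this sketch a swap that moves a higher-posterior state into the colder chain is always accepted, while the reverse move is accepted only with probability exp(log ratio). The per-chain likelihood evaluations between swaps are the expensive part that the article reports offloading to GPUs.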