
Hierarchical Parallelization of Gene Differential Association Analysis

BACKGROUND: Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take adva...


Bibliographic Details
Main Authors: Needham, Mark, Hu, Rui, Dwarkadas, Sandhya, Qiu, Xing
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3248234/
https://www.ncbi.nlm.nih.gov/pubmed/21936916
http://dx.doi.org/10.1186/1471-2105-12-374
collection PubMed
description BACKGROUND: Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain (with granularity defined by the frequency of communication) parallelism in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. RESULTS: Our results show that this hierarchical strategy matches data sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. CONCLUSIONS: The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels.
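The selection principle stated in the conclusions can be illustrated with a small sketch. It is not the authors' code: the function `choose_layout`, the `Layout` type, and all parameter names are hypothetical, and it rests on two simplifying assumptions that the abstract does not state: (1) each MPI process has a fixed working-set size that its threads share, and (2) among layouts that fit in cache, more processes per node are preferred. Under those assumptions, the sketch picks the largest number of processes per node whose combined working sets fit in the shared cache and gives the remaining cores to threads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical result type: how one multicore node is divided."""
    processes_per_node: int
    threads_per_process: int


def choose_layout(cores_per_node: int, cache_bytes: int,
                  working_set_bytes: int) -> Layout:
    """Pick processes and threads per node so the working sets fit in cache.

    Assumption (not from the source): every process has the same fixed
    working-set size, shared by all of its threads.
    """
    if min(cores_per_node, cache_bytes, working_set_bytes) < 1:
        raise ValueError("all arguments must be positive")

    # Fallback when even one working set exceeds the cache: a single
    # process that uses every core as a thread.
    best = 1
    for processes in range(1, cores_per_node + 1):
        # Only consider process counts that divide the cores evenly,
        # so every process gets the same number of threads.
        if cores_per_node % processes != 0:
            continue
        # Keep the largest process count whose combined working sets fit.
        if processes * working_set_bytes <= cache_bytes:
            best = processes
    return Layout(best, cores_per_node // best)


MIB = 1024 * 1024
# 8 cores, 8 MiB shared cache, 3 MiB working set per process:
# two processes fit (6 MiB) but four do not (12 MiB).
print(choose_layout(8, 8 * MIB, 3 * MIB))
```

In practice the working-set size would have to be measured for the data set at hand, and the cache size depends on which cache level the cores of a node actually share; both values here are illustrative only.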
format Online
Article
Text
id pubmed-3248234
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3248234 2011-12-30 Hierarchical Parallelization of Gene Differential Association Analysis Needham, Mark Hu, Rui Dwarkadas, Sandhya Qiu, Xing BMC Bioinformatics Software BioMed Central 2011-09-21 /pmc/articles/PMC3248234/ /pubmed/21936916 http://dx.doi.org/10.1186/1471-2105-12-374 Text en Copyright © 2011 Needham et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Software