Hierarchical Parallelization of Gene Differential Association Analysis
BACKGROUND: Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take adva...
Main authors: | Needham, Mark; Hu, Rui; Dwarkadas, Sandhya; Qiu, Xing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3248234/ https://www.ncbi.nlm.nih.gov/pubmed/21936916 http://dx.doi.org/10.1186/1471-2105-12-374 |
_version_ | 1782220220941729792 |
---|---|
author | Needham, Mark Hu, Rui Dwarkadas, Sandhya Qiu, Xing |
author_sort | Needham, Mark |
collection | PubMed |
description | BACKGROUND: Microarray gene differential expression analysis is a widely used technique that deals with high dimensional data and is computationally intensive for permutation-based procedures. Microarray gene differential association analysis is even more computationally demanding and must take advantage of multicore computing technology, which is the driving force behind increasing compute power in recent years. In this paper, we present a two-layer hierarchical parallel implementation of gene differential association analysis. It takes advantage of both fine- and coarse-grain (with granularity defined by the frequency of communication) parallelism in order to effectively leverage the non-uniform nature of parallel processing available in the cutting-edge systems of today. RESULTS: Our results show that this hierarchical strategy matches data sharing behavior to the properties of the underlying hardware, thereby reducing the memory and bandwidth needs of the application. The resulting improved efficiency reduces computation time and allows the gene differential association analysis code to scale its execution with the number of processors. The code and biological data used in this study are downloadable from http://www.urmc.rochester.edu/biostat/people/faculty/hu.cfm. CONCLUSIONS: The performance sweet spot occurs when using a number of threads per MPI process that allows the working sets of the corresponding MPI processes running on the multicore to fit within the machine cache. Hence, we suggest that practitioners follow this principle in selecting the appropriate number of MPI processes and threads within each MPI process for their cluster configurations. We believe that the principles of this hierarchical approach to parallelization can be utilized in the parallelization of other computationally demanding kernels. |
format | Online Article Text |
id | pubmed-3248234 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3248234 2011-12-30 Hierarchical Parallelization of Gene Differential Association Analysis Needham, Mark Hu, Rui Dwarkadas, Sandhya Qiu, Xing BMC Bioinformatics Software BioMed Central 2011-09-21 /pmc/articles/PMC3248234/ /pubmed/21936916 http://dx.doi.org/10.1186/1471-2105-12-374 Text en Copyright © 2011 Needham et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Hierarchical Parallelization of Gene Differential Association Analysis |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3248234/ https://www.ncbi.nlm.nih.gov/pubmed/21936916 http://dx.doi.org/10.1186/1471-2105-12-374 |
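
The description's conclusion (choose a number of threads per MPI process such that the working sets of the MPI processes sharing a multicore node fit within the machine cache) can be illustrated with a small sketch. This is not the authors' code. The function names, the assumption that each process's working set is a fixed size shared by all of its threads, and the tie-breaking rule (prefer the layout with the most processes among those that fit) are hypothetical choices made only for illustration.

```python
def candidate_layouts(cores_per_node):
    """Return every (mpi_processes, threads_per_process) pair that uses all cores."""
    return [
        (procs, cores_per_node // procs)
        for procs in range(1, cores_per_node + 1)
        if cores_per_node % procs == 0
    ]


def layouts_fitting_cache(cores_per_node, cache_bytes, working_set_bytes):
    """Keep the layouts whose combined per-process working sets fit in the cache.

    Assumption (not from the source): threads inside one process share a single
    working set, so the node-wide footprint is procs * working_set_bytes.
    """
    return [
        (procs, threads)
        for procs, threads in candidate_layouts(cores_per_node)
        if procs * working_set_bytes <= cache_bytes
    ]


def pick_layout(cores_per_node, cache_bytes, working_set_bytes):
    """Pick one fitting layout; fall back to a single process if none fits.

    Hypothetical tie-break: among fitting layouts, take the one with the most
    MPI processes (that is, the fewest threads per process).
    """
    fitting = layouts_fitting_cache(cores_per_node, cache_bytes, working_set_bytes)
    if not fitting:
        return (1, cores_per_node)
    return max(fitting)


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical node: 8 cores, 8 MiB shared cache, 3 MiB working set per process.
    print(pick_layout(cores_per_node=8, cache_bytes=8 * MIB, working_set_bytes=3 * MIB))
```

In practice the working-set size would have to be measured for the data set at hand, and the best layout confirmed by timing runs on the target cluster.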