An Interface for Biomedical Big Data Processing on the Tianhe-2 Supercomputer
Big data, cloud computing, and high-performance computing (HPC) are on the verge of convergence. Cloud computing already plays an active part in big data processing with the help of big data frameworks like Hadoop and Spark. The recent upsurge of high-performance computing in China provides ext...
Main authors: | Yang, Xi; Wu, Chengkun; Lu, Kai; Fang, Lin; Zhang, Yong; Li, Shengkang; Guo, Guixin; Du, YunFei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6149962/ https://www.ncbi.nlm.nih.gov/pubmed/29194413 http://dx.doi.org/10.3390/molecules22122116 |
_version_ | 1783356906004283392 |
---|---|
author | Yang, Xi Wu, Chengkun Lu, Kai Fang, Lin Zhang, Yong Li, Shengkang Guo, Guixin Du, YunFei |
author_sort | Yang, Xi |
collection | PubMed |
description | Big data, cloud computing, and high-performance computing (HPC) are on the verge of convergence. Cloud computing already plays an active part in big data processing with the help of big data frameworks like Hadoop and Spark. The recent upsurge of high-performance computing in China provides extra possibilities and capacity to address the challenges associated with big data. In this paper, we propose Orion, a big data interface on the Tianhe-2 supercomputer, to enable big data applications to run on Tianhe-2 via a single command or a shell script. Orion supports multiple users, and each user can launch multiple tasks. It minimizes the effort needed to initiate big data applications on the Tianhe-2 supercomputer via automated configuration. Orion follows the “allocate-when-needed” paradigm, which avoids the idle occupation of computational resources. We tested the utility and performance of Orion using a big genomic dataset and achieved satisfactory performance on Tianhe-2 with very few modifications to existing applications that were implemented in Hadoop/Spark. In summary, Orion provides a practical and economical interface for big data processing on Tianhe-2. |
format | Online Article Text |
id | pubmed-6149962 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6149962 2018-11-13 An Interface for Biomedical Big Data Processing on the Tianhe-2 Supercomputer Yang, Xi Wu, Chengkun Lu, Kai Fang, Lin Zhang, Yong Li, Shengkang Guo, Guixin Du, YunFei Molecules Article Big data, cloud computing, and high-performance computing (HPC) are on the verge of convergence. Cloud computing already plays an active part in big data processing with the help of big data frameworks like Hadoop and Spark. The recent upsurge of high-performance computing in China provides extra possibilities and capacity to address the challenges associated with big data. In this paper, we propose Orion, a big data interface on the Tianhe-2 supercomputer, to enable big data applications to run on Tianhe-2 via a single command or a shell script. Orion supports multiple users, and each user can launch multiple tasks. It minimizes the effort needed to initiate big data applications on the Tianhe-2 supercomputer via automated configuration. Orion follows the “allocate-when-needed” paradigm, which avoids the idle occupation of computational resources. We tested the utility and performance of Orion using a big genomic dataset and achieved satisfactory performance on Tianhe-2 with very few modifications to existing applications that were implemented in Hadoop/Spark. In summary, Orion provides a practical and economical interface for big data processing on Tianhe-2. MDPI 2017-12-01 /pmc/articles/PMC6149962/ /pubmed/29194413 http://dx.doi.org/10.3390/molecules22122116 Text en © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | An Interface for Biomedical Big Data Processing on the Tianhe-2 Supercomputer |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6149962/ https://www.ncbi.nlm.nih.gov/pubmed/29194413 http://dx.doi.org/10.3390/molecules22122116 |