Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges

The last decade has witnessed an explosion in the amount of available biological sequence data, due to the rapid progress of high-throughput sequencing projects. However, the biological data amount is becoming so great that traditional data analysis platforms and methods can no longer meet the need...

Bibliographic Details
Main Authors: Yin, Zekun; Lan, Haidong; Tan, Guangming; Lu, Mian; Vasilakos, Athanasios V.; Liu, Weiguo
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5581845/
https://www.ncbi.nlm.nih.gov/pubmed/28883909
http://dx.doi.org/10.1016/j.csbj.2017.07.004
_version_ 1783261105311711232
author Yin, Zekun
Lan, Haidong
Tan, Guangming
Lu, Mian
Vasilakos, Athanasios V.
Liu, Weiguo
author_sort Yin, Zekun
collection PubMed
description The last decade has witnessed an explosion in the amount of available biological sequence data, due to the rapid progress of high-throughput sequencing projects. However, the biological data amount is becoming so great that traditional data analysis platforms and methods can no longer meet the need to rapidly perform data analysis tasks in life sciences. As a result, both biologists and computer scientists are facing the challenge of gaining a profound insight into the deepest biological functions from big biological data. This in turn requires massive computational resources. Therefore, high performance computing (HPC) platforms are highly needed as well as efficient and scalable algorithms that can take advantage of these platforms. In this paper, we survey the state-of-the-art HPC platforms for big biological data analytics. We first list the characteristics of big biological data and popular computing platforms. Then we provide a taxonomy of different biological data analysis applications and a survey of the way they have been mapped onto various computing platforms. After that, we present a case study to compare the efficiency of different computing platforms for handling the classical biological sequence alignment problem. At last we discuss the open issues in big biological data analytics.
format Online
Article
Text
id pubmed-5581845
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-55818452017-09-07 Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges Yin, Zekun Lan, Haidong Tan, Guangming Lu, Mian Vasilakos, Athanasios V. Liu, Weiguo Comput Struct Biotechnol J Short Survey The last decade has witnessed an explosion in the amount of available biological sequence data, due to the rapid progress of high-throughput sequencing projects. However, the biological data amount is becoming so great that traditional data analysis platforms and methods can no longer meet the need to rapidly perform data analysis tasks in life sciences. As a result, both biologists and computer scientists are facing the challenge of gaining a profound insight into the deepest biological functions from big biological data. This in turn requires massive computational resources. Therefore, high performance computing (HPC) platforms are highly needed as well as efficient and scalable algorithms that can take advantage of these platforms. In this paper, we survey the state-of-the-art HPC platforms for big biological data analytics. We first list the characteristics of big biological data and popular computing platforms. Then we provide a taxonomy of different biological data analysis applications and a survey of the way they have been mapped onto various computing platforms. After that, we present a case study to compare the efficiency of different computing platforms for handling the classical biological sequence alignment problem. At last we discuss the open issues in big biological data analytics. Research Network of Computational and Structural Biotechnology 2017-08-14 /pmc/articles/PMC5581845/ /pubmed/28883909 http://dx.doi.org/10.1016/j.csbj.2017.07.004 Text en © 2017 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Short Survey
Yin, Zekun
Lan, Haidong
Tan, Guangming
Lu, Mian
Vasilakos, Athanasios V.
Liu, Weiguo
Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges
title Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges
topic Short Survey
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5581845/
https://www.ncbi.nlm.nih.gov/pubmed/28883909
http://dx.doi.org/10.1016/j.csbj.2017.07.004