Scalable in-memory processing of omics workflows
We present a proof-of-concept implementation of the in-memory computing paradigm that we use to facilitate the analysis of metagenomic sequencing reads. In doing so, we compare the performance of POSIX™ file systems and key-value storage for omics data, and we show the potential for integrating high-p...
Main authors: | Elisseev, Vadim; Gardiner, Laura-Jayne; Krishna, Ritesh |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9052061/ https://www.ncbi.nlm.nih.gov/pubmed/35521547 http://dx.doi.org/10.1016/j.csbj.2022.04.014 |
_version_ | 1784696704326434816 |
---|---|
author | Elisseev, Vadim Gardiner, Laura-Jayne Krishna, Ritesh |
author_facet | Elisseev, Vadim Gardiner, Laura-Jayne Krishna, Ritesh |
author_sort | Elisseev, Vadim |
collection | PubMed |
description | We present a proof-of-concept implementation of the in-memory computing paradigm that we use to facilitate the analysis of metagenomic sequencing reads. In doing so, we compare the performance of POSIX™ file systems and key-value storage for omics data, and we show the potential for integrating high-performance computing (HPC) and cloud-native technologies. We show that in-memory key-value storage offers possibilities for improved handling of omics data through more flexible and faster data processing. We envision fully containerized workflows and their deployment in portable micro-pipelines with multiple instances working concurrently with the same distributed in-memory storage. To highlight the potential usage of this technology for event-driven and real-time data processing, we use a biological case study focused on the growing threat of antimicrobial resistance (AMR). We develop a workflow encompassing bioinformatics and explainable machine learning (ML) to predict the life expectancy of a population based on the microbiome of its sewage while providing a description of the AMR contribution to the prediction. We propose that in future, performing such analyses in 'real-time' would allow us to assess the potential risk to the population based on changes in the AMR profile of the community. |
format | Online Article Text |
id | pubmed-9052061 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-9052061 2022-05-04 Scalable in-memory processing of omics workflows Elisseev, Vadim Gardiner, Laura-Jayne Krishna, Ritesh Comput Struct Biotechnol J Research Article We present a proof-of-concept implementation of the in-memory computing paradigm that we use to facilitate the analysis of metagenomic sequencing reads. In doing so, we compare the performance of POSIX™ file systems and key-value storage for omics data, and we show the potential for integrating high-performance computing (HPC) and cloud-native technologies. We show that in-memory key-value storage offers possibilities for improved handling of omics data through more flexible and faster data processing. We envision fully containerized workflows and their deployment in portable micro-pipelines with multiple instances working concurrently with the same distributed in-memory storage. To highlight the potential usage of this technology for event-driven and real-time data processing, we use a biological case study focused on the growing threat of antimicrobial resistance (AMR). We develop a workflow encompassing bioinformatics and explainable machine learning (ML) to predict the life expectancy of a population based on the microbiome of its sewage while providing a description of the AMR contribution to the prediction. We propose that in future, performing such analyses in 'real-time' would allow us to assess the potential risk to the population based on changes in the AMR profile of the community. Research Network of Computational and Structural Biotechnology 2022-04-20 /pmc/articles/PMC9052061/ /pubmed/35521547 http://dx.doi.org/10.1016/j.csbj.2022.04.014 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Research Article Elisseev, Vadim Gardiner, Laura-Jayne Krishna, Ritesh Scalable in-memory processing of omics workflows |
title | Scalable in-memory processing of omics workflows |
title_full | Scalable in-memory processing of omics workflows |
title_fullStr | Scalable in-memory processing of omics workflows |
title_full_unstemmed | Scalable in-memory processing of omics workflows |
title_short | Scalable in-memory processing of omics workflows |
title_sort | scalable in-memory processing of omics workflows |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9052061/ https://www.ncbi.nlm.nih.gov/pubmed/35521547 http://dx.doi.org/10.1016/j.csbj.2022.04.014 |
work_keys_str_mv | AT elisseevvadim scalableinmemoryprocessingofomicsworkflows AT gardinerlaurajayne scalableinmemoryprocessingofomicsworkflows AT krishnaritesh scalableinmemoryprocessingofomicsworkflows |