GenoVault: a cloud based genomics repository

GenoVault is a cloud-based repository for handling Next Generation Sequencing (NGS) data. It is built on an OpenStack-based private cloud and uses several OpenStack services: Keystone for authentication, Cinder for block storage, Neutron for networking and Nova for managing compute instances for the cloud...

Bibliographic Details
Main authors: Jain, Sankalp; Saxena, Amit; Hesarur, Suprit; Bhadhadhara, Kirti; Bharti, Neeraj; Kasibhatla, Sunitha Manjari; Sonavane, Uddhavesh; Joshi, Rajendra
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8319889/
https://www.ncbi.nlm.nih.gov/pubmed/34325724
http://dx.doi.org/10.1186/s13040-021-00268-5
author Jain, Sankalp
Saxena, Amit
Hesarur, Suprit
Bhadhadhara, Kirti
Bharti, Neeraj
Kasibhatla, Sunitha Manjari
Sonavane, Uddhavesh
Joshi, Rajendra
collection PubMed
description GenoVault is a cloud-based repository for handling Next Generation Sequencing (NGS) data. It is built on an OpenStack-based private cloud and uses several OpenStack services: Keystone for authentication, Cinder for block storage, Neutron for networking and Nova for managing compute instances. GenoVault uses object-based storage, which stores data as objects rather than files or blocks, for faster retrieval from distributed object nodes. Along with a web-based interface, a JavaFX-based desktop client has been developed to handle the large file uploads typical of NGS datasets. Users store files in their own object-based storage areas, and the metadata they provide during upload is used for querying the database. The GenoVault repository is designed with future needs in mind and can scale both vertically and horizontally using OpenStack-based cloud features. Users can make their data publicly shareable or keep it private. Data security is ensured because every container is a separate entity in the object-based storage architecture, and Secure File Transfer Protocol (SFTP) is supported for data upload and download. Users upload data into individual containers that hold raw read files (fastq), processed alignment files (bam, sam, bed) and the output of variation detection (vcf). The GenoVault architecture allows the data to be verified for integrity and authentication before it is made available to collaborators according to the user's permissions. GenoVault is useful for maintaining organization-wide NGS data generated in various labs that has not yet been published or submitted to public repositories such as NCBI. GenoVault also supports sharing NGS data among collaborating institutions. GenoVault can thus manage vast volumes of NGS data on any OpenStack-based private cloud.
format Online
Article
Text
id pubmed-8319889
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8319889 2021-07-29 GenoVault: a cloud based genomics repository Jain, Sankalp Saxena, Amit Hesarur, Suprit Bhadhadhara, Kirti Bharti, Neeraj Kasibhatla, Sunitha Manjari Sonavane, Uddhavesh Joshi, Rajendra BioData Min Software Article BioMed Central 2021-07-29 /pmc/articles/PMC8319889/ /pubmed/34325724 http://dx.doi.org/10.1186/s13040-021-00268-5 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title GenoVault: a cloud based genomics repository
topic Software Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8319889/
https://www.ncbi.nlm.nih.gov/pubmed/34325724
http://dx.doi.org/10.1186/s13040-021-00268-5
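
The description above mentions per-user containers, user-supplied metadata used for querying, public or private sharing, and integrity verification before data is shared. The following is a minimal in-memory sketch of those ideas, not GenoVault's actual implementation: the class names `Container` and `StoredObject`, the methods `upload`, `verify` and `query`, and the choice of SHA-256 as the checksum are all assumptions made for illustration. It does not use OpenStack or SFTP.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One uploaded file, held as an object with metadata (hypothetical model)."""
    name: str
    data: bytes
    metadata: dict
    sha256: str  # checksum recorded at upload time (SHA-256 is an assumption)


@dataclass
class Container:
    """A per-user container; private unless the owner marks it public."""
    owner: str
    public: bool = False
    objects: dict = field(default_factory=dict)

    def upload(self, name, data, metadata):
        # Record a checksum alongside the object so it can be verified later.
        digest = hashlib.sha256(data).hexdigest()
        self.objects[name] = StoredObject(name, data, dict(metadata), digest)
        return digest

    def verify(self, name):
        # Integrity check: recompute the checksum and compare with the stored one.
        obj = self.objects[name]
        return hashlib.sha256(obj.data).hexdigest() == obj.sha256

    def query(self, requester, **criteria):
        # Only the owner may query a private container.
        if not self.public and requester != self.owner:
            return []
        # Match objects whose metadata equals every requested key/value pair.
        return sorted(
            name
            for name, obj in self.objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )
```

For example, an owner could call `upload("sample1.fastq", data, {"format": "fastq"})`, then `query(owner, format="fastq")` to find the file by its metadata, and `verify("sample1.fastq")` before sharing it with collaborators.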