GenoVault: a cloud based genomics repository
GenoVault is a cloud-based repository for handling Next Generation Sequencing (NGS) data. It is developed using OpenStack-based private cloud with various services like keystone for authentication, cinder for block storage, neutron for networking and nova for managing compute instances for the Cloud...
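Main authors: | Jain, Sankalp; Saxena, Amit; Hesarur, Suprit; Bhadhadhara, Kirti; Bharti, Neeraj; Kasibhatla, Sunitha Manjari; Sonavane, Uddhavesh; Joshi, Rajendra |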
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Software Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8319889/ https://www.ncbi.nlm.nih.gov/pubmed/34325724 http://dx.doi.org/10.1186/s13040-021-00268-5 |
_version_ | 1783730542291714048 |
---|---|
author | Jain, Sankalp; Saxena, Amit; Hesarur, Suprit; Bhadhadhara, Kirti; Bharti, Neeraj; Kasibhatla, Sunitha Manjari; Sonavane, Uddhavesh; Joshi, Rajendra |
author_sort | Jain, Sankalp |
collection | PubMed |
description | GenoVault is a cloud-based repository for handling Next Generation Sequencing (NGS) data. It is built on an OpenStack-based private cloud and uses OpenStack services such as keystone for authentication, cinder for block storage, neutron for networking and nova for managing compute instances. GenoVault uses object-based storage, in which data are stored as objects rather than as files or blocks, for faster retrieval from distributed object nodes. Along with a web-based interface, a JavaFX-based desktop client has been developed to handle the large file uploads typical of NGS datasets. Users store files in their own object-based storage areas, and the metadata they provide during upload is used for querying the database. The GenoVault repository is designed with future needs in mind and can scale both vertically and horizontally using OpenStack-based cloud features. Users can make their data public or keep it private. Data security is ensured because every container is a separate entity in the object-based storage architecture, and Secure File Transfer Protocol (SFTP) is supported for data upload and download. Users upload data into individual containers that hold raw read files (fastq), processed alignment files (bam, sam, bed) and the output of variation detection (vcf). The GenoVault architecture allows the data to be verified for integrity and authentication before it is made available to collaborators according to the user’s permissions. GenoVault is useful for maintaining organization-wide NGS data generated in various labs that has not yet been published or submitted to public repositories like NCBI. GenoVault also supports sharing NGS data among collaborating institutions. GenoVault can thus manage vast volumes of NGS data on any OpenStack-based private cloud. |
format | Online Article Text |
id | pubmed-8319889 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8319889 2021-07-29 GenoVault: a cloud based genomics repository Jain, Sankalp; Saxena, Amit; Hesarur, Suprit; Bhadhadhara, Kirti; Bharti, Neeraj; Kasibhatla, Sunitha Manjari; Sonavane, Uddhavesh; Joshi, Rajendra BioData Min Software Article BioMed Central 2021-07-29 /pmc/articles/PMC8319889/ /pubmed/34325724 http://dx.doi.org/10.1186/s13040-021-00268-5 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | GenoVault: a cloud based genomics repository |
topic | Software Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8319889/ https://www.ncbi.nlm.nih.gov/pubmed/34325724 http://dx.doi.org/10.1186/s13040-021-00268-5 |