A Multi-Science Data Analysis Platform and the GeneROOT Use Case
This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area covers the development and prototyping of a multi-science data analysis platform built around CERN-developed technologies l...
Main authors: | Aliyev, Taghi; Rademakers, Fons |
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | IT Technical Forum (ITTF) |
Online access: | http://cds.cern.ch/record/2297022 |
_version_ | 1780956836091920384 |
---|---|
author | Aliyev, Taghi Rademakers, Fons |
author_facet | Aliyev, Taghi Rademakers, Fons |
author_sort | Aliyev, Taghi |
collection | CERN |
description | This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area covers the development and prototyping of a multi-science data analysis platform built around CERN-developed technologies such as Zenodo, REANA and CVMFS. When finished, this platform will support a complete data analysis life cycle, from data discovery to data access, data processing and end-user data analysis. The second area covers a specific use case in which HEP-specific software such as ROOT is used to store and process genomics sequence data. A number of handcrafted genomics data formats are in use, such as FASTQ, SAM, BAM and CRAM. They range from pure ASCII to compressed binary formats. We will compare the features of these formats with the generic capabilities of ROOT’s TTree containers. We will also show performance numbers for typical analysis scenarios. |
id | cern-2297022 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2017 |
record_format | invenio |
spelling | cern-2297022 2022-11-02T22:13:39Z http://cds.cern.ch/record/2297022 eng Aliyev, Taghi Rademakers, Fons A Multi-Science Data Analysis Platform and the GeneROOT Use Case IT Technical Forum (ITTF) This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area covers the development and prototyping of a multi-science data analysis platform built around CERN-developed technologies such as Zenodo, REANA and CVMFS. When finished, this platform will support a complete data analysis life cycle, from data discovery to data access, data processing and end-user data analysis. The second area covers a specific use case in which HEP-specific software such as ROOT is used to store and process genomics sequence data. A number of handcrafted genomics data formats are in use, such as FASTQ, SAM, BAM and CRAM. They range from pure ASCII to compressed binary formats. We will compare the features of these formats with the generic capabilities of ROOT’s TTree containers. We will also show performance numbers for typical analysis scenarios. oai:cds.cern.ch:2297022 2017 |
spellingShingle | IT Technical Forum (ITTF) Aliyev, Taghi Rademakers, Fons A Multi-Science Data Analysis Platform and the GeneROOT Use Case |
title | A Multi-Science Data Analysis Platform and the GeneROOT Use Case |
title_full | A Multi-Science Data Analysis Platform and the GeneROOT Use Case |
title_fullStr | A Multi-Science Data Analysis Platform and the GeneROOT Use Case |
title_full_unstemmed | A Multi-Science Data Analysis Platform and the GeneROOT Use Case |
title_short | A Multi-Science Data Analysis Platform and the GeneROOT Use Case |
title_sort | multi-science data analysis platform and the generoot use case |
topic | IT Technical Forum (ITTF) |
url | http://cds.cern.ch/record/2297022 |
work_keys_str_mv | AT aliyevtaghi amultisciencedataanalysisplatformandthegenerootusecase AT rademakersfons amultisciencedataanalysisplatformandthegenerootusecase AT aliyevtaghi multisciencedataanalysisplatformandthegenerootusecase AT rademakersfons multisciencedataanalysisplatformandthegenerootusecase |