
A Multi-Science Data Analysis Platform and the GeneROOT Use Case

This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area covers the development and prototyping of a multi-science data analysis platform built around CERN-developed technologies l...


Bibliographic Details
Main authors: Aliyev, Taghi; Rademakers, Fons
Language: eng
Published: 2017
Subjects:
Online access: http://cds.cern.ch/record/2297022
_version_ 1780956836091920384
author Aliyev, Taghi
Rademakers, Fons
author_facet Aliyev, Taghi
Rademakers, Fons
author_sort Aliyev, Taghi
collection CERN
description This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area covers the development and prototyping of a multi-science data analysis platform built around CERN-developed technologies such as Zenodo, REANA and CVMFS. When finished, this platform will support a complete data analysis life cycle, from data discovery to data access, data processing and end-user data analysis. The second area covers a specific use case in which HEP-specific software such as ROOT is used to store and process genomics sequence data. A number of handcrafted genomics data formats are in use, such as FASTQ, SAM, BAM and CRAM. They range from pure ASCII to compressed binary formats. We will compare the features of these formats with the generic capabilities of ROOT’s TTree containers. We will also show performance numbers for typical analysis scenarios.
id cern-2297022
institution European Organization for Nuclear Research
language eng
publishDate 2017
record_format invenio
spelling cern-2297022 | 2022-11-02T22:13:39Z | http://cds.cern.ch/record/2297022 | eng | Aliyev, Taghi | Rademakers, Fons | A Multi-Science Data Analysis Platform and the GeneROOT Use Case | IT Technical Forum (ITTF) | This talk will cover two areas of current research in the context of knowledge sharing between CERN openlab and the life science communities. The first area covers the development and prototyping of a multi-science data analysis platform built around CERN-developed technologies such as Zenodo, REANA and CVMFS. When finished, this platform will support a complete data analysis life cycle, from data discovery to data access, data processing and end-user data analysis. The second area covers a specific use case in which HEP-specific software such as ROOT is used to store and process genomics sequence data. A number of handcrafted genomics data formats are in use, such as FASTQ, SAM, BAM and CRAM. They range from pure ASCII to compressed binary formats. We will compare the features of these formats with the generic capabilities of ROOT’s TTree containers. We will also show performance numbers for typical analysis scenarios. | oai:cds.cern.ch:2297022 | 2017
spellingShingle IT Technical Forum (ITTF)
Aliyev, Taghi
Rademakers, Fons
A Multi-Science Data Analysis Platform and the GeneROOT Use Case
title A Multi-Science Data Analysis Platform and the GeneROOT Use Case
title_full A Multi-Science Data Analysis Platform and the GeneROOT Use Case
title_fullStr A Multi-Science Data Analysis Platform and the GeneROOT Use Case
title_full_unstemmed A Multi-Science Data Analysis Platform and the GeneROOT Use Case
title_short A Multi-Science Data Analysis Platform and the GeneROOT Use Case
title_sort multi-science data analysis platform and the generoot use case
topic IT Technical Forum (ITTF)
url http://cds.cern.ch/record/2297022
work_keys_str_mv AT aliyevtaghi amultisciencedataanalysisplatformandthegenerootusecase
AT rademakersfons amultisciencedataanalysisplatformandthegenerootusecase
AT aliyevtaghi multisciencedataanalysisplatformandthegenerootusecase
AT rademakersfons multisciencedataanalysisplatformandthegenerootusecase