
Structural Biology in the context of EGEE

Electron microscopy (EM) is a crucial technique that allows Structural Biology researchers to characterize macromolecular assemblies in distinct functional states. Image processing in three-dimensional EM (3D-EM) is used by a flourishing community (exemplified by the EU-funded 3D-EM NoE) and is c...

Bibliographic Details
Main authors: García, D, Carrera, G, Carazo, J M, Valverde, J R, Moscicki, J, Muraru, A
Language: eng
Published: 2007
Subjects:
Online access: http://cds.cern.ch/record/1120792
_version_ 1780914566295715840
author García, D
Carrera, G
Carazo, J M
Valverde, J R
Moscicki, J
Muraru, A
author_facet García, D
Carrera, G
Carazo, J M
Valverde, J R
Moscicki, J
Muraru, A
author_sort García, D
collection CERN
description Electron microscopy (EM) is a crucial technique that allows Structural Biology researchers to characterize macromolecular assemblies in distinct functional states. Image processing in three-dimensional EM (3D-EM) is used by a flourishing community (exemplified by the EU-funded 3D-EM NoE) and is characterized by voluminous data and large computing requirements, making it a problem well suited to Grid computing and the EGEE infrastructure. Various steps in the 3D-EM refinement process may benefit from Grid computing. To start with, large numbers of experimental images need to be averaged. Nowadays, tens of thousands of images are typically used, while future studies may routinely employ millions. Our group has been developing Xmipp, a package for single-particle 3D-EM image processing. Using Xmipp, the classification of 91,000 ribosome projections into 4 classes took more than 2500 CPU hours on the MareNostrum supercomputer at the Barcelona Supercomputing Centre. As few groups will have access to such resources, we propose to use the EGEE infrastructure for Xmipp (ML2D/ML3D), in collaboration with the Network of Excellence in 3D-EM. Enabling widespread adoption of 3D-EM will have a profound long-term impact on our understanding of complex biological structures (such as viruses, organelles and macromolecular assemblies) and on the exploitation of their biomedical applications. We have adapted our Structural Biology applications for production use over EGEE with the help of the DIANE framework for resource and job management. To spread knowledge of our solution, CNB is organizing a seminar with wet-lab users (Structural Biology researchers) and developers, where we will introduce it and collect their feedback on its implementation.
We think that the success of our activity within VO Biomed and NA4 depends on several factors. The services running on the Grid must be of production-level quality: for this reason we need an efficient porting of our applications to EGEE and a correct adaptation of the software to this environment. Interaction with the EGEE infrastructure must not place any added burden on users, as this would dissuade potential users from adopting EGEE. We must be able to rely on the availability of EGEE resources. Dealing with infrastructure problems external to the code being ported places a heavy burden on application developers. Better and more widespread support for parallel processing (MPI) is needed to improve response time in our applications. Data management needs improvements in usability and transparency (e.g. a Grid file system like ELFI). EGEE needs to complete its current transition to gLite: we still need to use LCG commands to interact with the Information system in order to work around gLite's shortcomings in detecting resource availability.
id cern-1120792
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2007
record_format invenio
spellingShingle Science in General
Computing and Computers
García, D
Carrera, G
Carazo, J M
Valverde, J R
Moscicki, J
Muraru, A
Structural Biology in the context of EGEE
title Structural Biology in the context of EGEE
title_full Structural Biology in the context of EGEE
title_fullStr Structural Biology in the context of EGEE
title_full_unstemmed Structural Biology in the context of EGEE
title_short Structural Biology in the context of EGEE
title_sort structural biology in the context of egee
topic Science in General
Computing and Computers
url http://cds.cern.ch/record/1120792
work_keys_str_mv AT garciad structuralbiologyinthecontextofegee
AT carrerag structuralbiologyinthecontextofegee
AT carazojm structuralbiologyinthecontextofegee
AT valverdejr structuralbiologyinthecontextofegee
AT moscickij structuralbiologyinthecontextofegee
AT murarua structuralbiologyinthecontextofegee