
FCC – An automated rule-based processing tool for life science data

BACKGROUND: Data processing in the bioinformatics field often involves the handling of diverse software programs in one workflow. The field lacks a set of standards for file formats, so files have to be processed in different ways to make them compatible with different analysis progr...


Bibliographic Details
Main Authors: Barkow-Oesterreicher, Simon; Türker, Can; Panse, Christian
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3614436/
https://www.ncbi.nlm.nih.gov/pubmed/23311610
http://dx.doi.org/10.1186/1751-0473-8-3
_version_ 1782264839232552960
author Barkow-Oesterreicher, Simon
Türker, Can
Panse, Christian
author_sort Barkow-Oesterreicher, Simon
collection PubMed
description BACKGROUND: Data processing in the bioinformatics field often involves the handling of diverse software programs in one workflow. The field lacks a set of standards for file formats, so files have to be processed in different ways to make them compatible with different analysis programs. The problem is that mass spectrometry vendors at best provide only closed-source Windows libraries to programmatically access their proprietary binary formats. This prohibits the creation of an efficient and unified tool that fits all processing needs of the users. Therefore, researchers spend a significant amount of time using GUI-based conversion and processing programs. Besides the time needed for manual usage, such programs can also show long running times for processing, because most of them make use of only a single CPU. In particular, algorithms to enhance data quality, e.g. peak picking or deconvolution of spectra, add waiting time for the users. RESULTS: To automate these processing tasks and let them run continuously without user interaction, we developed the FGCZ Converter Control (FCC) at the Functional Genomics Center Zurich (FGCZ) core facility. The FCC is a rule-based system for automated file processing that reduces the operation of diverse programs to a single configuration task. Using filtering rules for raw data files, the parameters for all tasks can be custom-tailored to the needs of every single researcher, and processing can run automatically and efficiently on any number of servers in parallel using all available CPU resources. CONCLUSIONS: FCC has been used intensively at FGCZ for processing more than one hundred thousand mass spectrometry raw files so far. Since we know that many other research facilities have similar problems, we would like to report on our tool and the accompanying ideas for an efficient set-up for potential reuse.
format Online
Article
Text
id pubmed-3614436
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3614436 2013-04-03 FCC – An automated rule-based processing tool for life science data Barkow-Oesterreicher, Simon Türker, Can Panse, Christian Source Code Biol Med Software Review BACKGROUND: Data processing in the bioinformatics field often involves the handling of diverse software programs in one workflow. The field lacks a set of standards for file formats, so files have to be processed in different ways to make them compatible with different analysis programs. The problem is that mass spectrometry vendors at best provide only closed-source Windows libraries to programmatically access their proprietary binary formats. This prohibits the creation of an efficient and unified tool that fits all processing needs of the users. Therefore, researchers spend a significant amount of time using GUI-based conversion and processing programs. Besides the time needed for manual usage, such programs can also show long running times for processing, because most of them make use of only a single CPU. In particular, algorithms to enhance data quality, e.g. peak picking or deconvolution of spectra, add waiting time for the users. RESULTS: To automate these processing tasks and let them run continuously without user interaction, we developed the FGCZ Converter Control (FCC) at the Functional Genomics Center Zurich (FGCZ) core facility. The FCC is a rule-based system for automated file processing that reduces the operation of diverse programs to a single configuration task. Using filtering rules for raw data files, the parameters for all tasks can be custom-tailored to the needs of every single researcher, and processing can run automatically and efficiently on any number of servers in parallel using all available CPU resources. CONCLUSIONS: FCC has been used intensively at FGCZ for processing more than one hundred thousand mass spectrometry raw files so far. Since we know that many other research facilities have similar problems, we would like to report on our tool and the accompanying ideas for an efficient set-up for potential reuse. BioMed Central 2013-01-11 /pmc/articles/PMC3614436/ /pubmed/23311610 http://dx.doi.org/10.1186/1751-0473-8-3 Text en Copyright © 2013 Barkow-Oesterreicher et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software Review
Barkow-Oesterreicher, Simon
Türker, Can
Panse, Christian
FCC – An automated rule-based processing tool for life science data
title FCC – An automated rule-based processing tool for life science data
topic Software Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3614436/
https://www.ncbi.nlm.nih.gov/pubmed/23311610
http://dx.doi.org/10.1186/1751-0473-8-3