
DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks

BACKGROUND: Huge amounts of molecular interaction data are continuously produced and stored in public databases. Although many bioinformatics tools have been proposed in the literature for their analysis, based on their modeling through different types of biological networks, several problems still...


Bibliographic Details
Main Authors: Di Rocco, Lorenzo, Ferraro Petrillo, Umberto, Rombo, Simona E.
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9652854/
https://www.ncbi.nlm.nih.gov/pubmed/36368948
http://dx.doi.org/10.1186/s12859-022-05026-w
_version_ 1784828565657747456
author Di Rocco, Lorenzo
Ferraro Petrillo, Umberto
Rombo, Simona E.
author_facet Di Rocco, Lorenzo
Ferraro Petrillo, Umberto
Rombo, Simona E.
author_sort Di Rocco, Lorenzo
collection PubMed
description BACKGROUND: Huge amounts of molecular interaction data are continuously produced and stored in public databases. Although many bioinformatics tools have been proposed in the literature for analyzing these data, based on modeling them as different types of biological networks, several problems remain unsolved at large scale. RESULTS: We propose DIAMIN, a high-level software library that facilitates the development of applications for the efficient analysis of large-scale molecular interaction networks. DIAMIN relies on distributed computing and is implemented in Java on top of the Apache Spark framework. It provides a set of functionalities implementing different tasks on an abstract representation of very large graphs, with built-in support for methods and algorithms commonly used to analyze these networks. DIAMIN has been tested on data retrieved from two of the most widely used molecular interaction databases and proved to be highly efficient and scalable. As shown by the provided examples, DIAMIN can be used without any distributed programming experience to perform various types of data analysis and to implement new algorithms based on its primitives. CONCLUSIONS: DIAMIN has proved successful in allowing users to solve specific biological problems that can be modeled as biological networks by using its functionalities. The software is freely available, which will hopefully allow its rapid diffusion through the scientific community, to solve both specific data analysis and more complex tasks.
format Online
Article
Text
id pubmed-9652854
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9652854 2022-11-15 DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks Di Rocco, Lorenzo Ferraro Petrillo, Umberto Rombo, Simona E. BMC Bioinformatics Software
BioMed Central 2022-11-11 /pmc/articles/PMC9652854/ /pubmed/36368948 http://dx.doi.org/10.1186/s12859-022-05026-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Di Rocco, Lorenzo
Ferraro Petrillo, Umberto
Rombo, Simona E.
DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks
title DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks
title_full DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks
title_fullStr DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks
title_full_unstemmed DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks
title_short DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks
title_sort diamin: a software library for the distributed analysis of large-scale molecular interaction networks
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9652854/
https://www.ncbi.nlm.nih.gov/pubmed/36368948
http://dx.doi.org/10.1186/s12859-022-05026-w
work_keys_str_mv AT diroccolorenzo diaminasoftwarelibraryforthedistributedanalysisoflargescalemolecularinteractionnetworks
AT ferraropetrilloumberto diaminasoftwarelibraryforthedistributedanalysisoflargescalemolecularinteractionnetworks
AT rombosimonae diaminasoftwarelibraryforthedistributedanalysisoflargescalemolecularinteractionnetworks