DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks
BACKGROUND: Huge amounts of molecular interaction data are continuously produced and stored in public databases. Although many bioinformatics tools have been proposed in the literature for their analysis, based on their modeling through different types of biological networks, several problems still...
Main authors: | Di Rocco, Lorenzo; Ferraro Petrillo, Umberto; Rombo, Simona E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9652854/ https://www.ncbi.nlm.nih.gov/pubmed/36368948 http://dx.doi.org/10.1186/s12859-022-05026-w |
_version_ | 1784828565657747456 |
---|---|
author | Di Rocco, Lorenzo; Ferraro Petrillo, Umberto; Rombo, Simona E. |
author_facet | Di Rocco, Lorenzo; Ferraro Petrillo, Umberto; Rombo, Simona E. |
author_sort | Di Rocco, Lorenzo |
collection | PubMed |
description | BACKGROUND: Huge amounts of molecular interaction data are continuously produced and stored in public databases. Although many bioinformatics tools have been proposed for analyzing these data by modeling them as different types of biological networks, several problems remain unsolved at large scale. RESULTS: We propose DIAMIN, a high-level software library that facilitates the development of applications for the efficient analysis of large-scale molecular interaction networks. DIAMIN relies on distributed computing and is implemented in Java on top of the Apache Spark framework. It provides a set of functionalities that implement different tasks on an abstract representation of very large graphs, with built-in support for methods and algorithms commonly used to analyze these networks. DIAMIN has been tested on data retrieved from two of the most widely used molecular interaction databases and proved to be highly efficient and scalable. As the provided examples show, users without any distributed programming experience can use DIAMIN to perform various types of data analysis and to implement new algorithms based on its primitives. CONCLUSIONS: DIAMIN has proved successful in allowing users to solve specific biological problems that can be modeled with biological networks. The software is freely available, which will hopefully allow its rapid diffusion through the scientific community, both for specific data analyses and for more complex tasks. |
format | Online Article Text |
id | pubmed-9652854 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9652854 2022-11-15 DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks Di Rocco, Lorenzo; Ferraro Petrillo, Umberto; Rombo, Simona E. BMC Bioinformatics Software
BioMed Central 2022-11-11 /pmc/articles/PMC9652854/ /pubmed/36368948 http://dx.doi.org/10.1186/s12859-022-05026-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | DIAMIN: a software library for the distributed analysis of large-scale molecular interaction networks |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9652854/ https://www.ncbi.nlm.nih.gov/pubmed/36368948 http://dx.doi.org/10.1186/s12859-022-05026-w |