Cargando…

AGRN: accurate gene regulatory network inference using ensemble machine learning methods

MOTIVATION: Biological processes are regulated by underlying genes and their interactions that form gene regulatory networks (GRNs). Dysregulation of these GRNs can cause complex diseases such as cancer, Alzheimer’s and diabetes. Hence, accurate GRN inference is critical for elucidating gene functio...

Descripción completa

Detalles Bibliográficos
Autores principales: Alawad, Duaa Mohammad, Katebi, Ataur, Kabir, Md Wasi Ul, Hoque, Md Tamjidul
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10082608/
https://www.ncbi.nlm.nih.gov/pubmed/37038446
http://dx.doi.org/10.1093/bioadv/vbad032
Descripción
Sumario:MOTIVATION: Biological processes are regulated by underlying genes and their interactions that form gene regulatory networks (GRNs). Dysregulation of these GRNs can cause complex diseases such as cancer, Alzheimer’s and diabetes. Hence, accurate GRN inference is critical for elucidating gene function, allowing for the faster identification and prioritization of candidate genes for functional investigation. Several statistical and machine learning-based methods have been developed to infer GRNs based on biological and synthetic datasets. Here, we developed a method named AGRN that infers GRNs by employing an ensemble of machine learning algorithms. RESULTS: From the idea that a single method may not perform well on all datasets, we calculate the gene importance scores using three machine learning methods—random forest, extra tree and support vector regressors. We calculate the importance scores from Shapley Additive Explanations, a recently published method to explain machine learning models. We have found that the importance scores from Shapley values perform better than the traditional importance scoring methods based on almost all the benchmark datasets. We have analyzed the performance of AGRN using the datasets from the DREAM4 and DREAM5 challenges for GRN inference. The proposed method, AGRN—an ensemble machine learning method with Shapley values, outperforms the existing methods both in the DREAM4 and DREAM5 datasets. With improved accuracy, we believe that AGRN inferred GRNs would enhance our mechanistic understanding of biological processes in health and disease. AVAILABILITYAND IMPLEMENTATION: https://github.com/DuaaAlawad/AGRN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.