FOntCell: Fusion of Ontologies of Cells

High-throughput cell-data technologies such as single-cell RNA-seq create a demand for algorithms for automatic cell classification and characterization. There exist several cell classification ontologies with complementary information. However, one needs to merge them to synergistically combine the...

Bibliographic Details
Main Authors: Cabau-Laporta, Javier; Ascensión, Alex M.; Arrospide-Elgarresta, Mikel; Gerovska, Daniela; Araúzo-Bravo, Marcos J.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Cell and Developmental Biology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7905052/
https://www.ncbi.nlm.nih.gov/pubmed/33644039
http://dx.doi.org/10.3389/fcell.2021.562908
author Cabau-Laporta, Javier
Ascensión, Alex M.
Arrospide-Elgarresta, Mikel
Gerovska, Daniela
Araúzo-Bravo, Marcos J.
collection PubMed
description High-throughput cell-data technologies such as single-cell RNA-seq create a demand for algorithms for automatic cell classification and characterization. There exist several cell classification ontologies with complementary information. However, one needs to merge them to synergistically combine their information. The main difficulty in merging is to match the ontologies since they use different naming conventions. Therefore, we developed an algorithm that merges ontologies by integrating the name matching between class label names with the structure mapping between the ontology elements based on graph convolution. Since the structure mapping is a time consuming process, we designed two methods to perform the graph convolution: vectorial structure matching and constraint-based structure matching. To perform the vectorial structure matching, we designed a general method to calculate the similarities between vectors of different lengths for different metrics. Additionally, we adapted the slower Blondel method to work for structure matching. We implemented our algorithms into FOntCell, a software module in Python for efficient automatic parallel-computed merging/fusion of ontologies in the same or similar knowledge domains. FOntCell can unify dispersed knowledge from one domain into a unique ontology in OWL format and iteratively reuse it to continuously adapt ontologies with new data endlessly produced by data-driven classification methods, such as of the Human Cell Atlas. To navigate easily across the merged ontologies, it generates HTML files with tabulated and graphic summaries, and interactive circular Directed Acyclic Graphs. We used FOntCell to merge the CELDA, LifeMap and LungMAP Human Anatomy cell ontologies into a comprehensive cell ontology. 
We compared FOntCell with tools used for the alignment of mouse and human anatomy ontologies task proposed by the Ontology Alignment Evaluation Initiative (OAEI) and found that the F(β) alignment accuracies of FOntCell are above the geometric mean of the other tools; more importantly, it outperforms significantly the best OAEI tools in cell ontology alignment in terms of F(β) alignment accuracies.
format Online
Article
Text
id pubmed-7905052
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-79050522021-02-26 FOntCell: Fusion of Ontologies of Cells Cabau-Laporta, Javier Ascensión, Alex M. Arrospide-Elgarresta, Mikel Gerovska, Daniela Araúzo-Bravo, Marcos J. Front Cell Dev Biol Cell and Developmental Biology High-throughput cell-data technologies such as single-cell RNA-seq create a demand for algorithms for automatic cell classification and characterization. There exist several cell classification ontologies with complementary information. However, one needs to merge them to synergistically combine their information. The main difficulty in merging is to match the ontologies since they use different naming conventions. Therefore, we developed an algorithm that merges ontologies by integrating the name matching between class label names with the structure mapping between the ontology elements based on graph convolution. Since the structure mapping is a time consuming process, we designed two methods to perform the graph convolution: vectorial structure matching and constraint-based structure matching. To perform the vectorial structure matching, we designed a general method to calculate the similarities between vectors of different lengths for different metrics. Additionally, we adapted the slower Blondel method to work for structure matching. We implemented our algorithms into FOntCell, a software module in Python for efficient automatic parallel-computed merging/fusion of ontologies in the same or similar knowledge domains. FOntCell can unify dispersed knowledge from one domain into a unique ontology in OWL format and iteratively reuse it to continuously adapt ontologies with new data endlessly produced by data-driven classification methods, such as of the Human Cell Atlas. To navigate easily across the merged ontologies, it generates HTML files with tabulated and graphic summaries, and interactive circular Directed Acyclic Graphs. 
We used FOntCell to merge the CELDA, LifeMap and LungMAP Human Anatomy cell ontologies into a comprehensive cell ontology. We compared FOntCell with tools used for the alignment of mouse and human anatomy ontologies task proposed by the Ontology Alignment Evaluation Initiative (OAEI) and found that the F(β) alignment accuracies of FOntCell are above the geometric mean of the other tools; more importantly, it outperforms significantly the best OAEI tools in cell ontology alignment in terms of F(β) alignment accuracies. Frontiers Media S.A. 2021-02-11 /pmc/articles/PMC7905052/ /pubmed/33644039 http://dx.doi.org/10.3389/fcell.2021.562908 Text en Copyright © 2021 Cabau-Laporta, Ascensión, Arrospide-Elgarresta, Gerovska and Araúzo-Bravo. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title FOntCell: Fusion of Ontologies of Cells
topic Cell and Developmental Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7905052/
https://www.ncbi.nlm.nih.gov/pubmed/33644039
http://dx.doi.org/10.3389/fcell.2021.562908