FOntCell: Fusion of Ontologies of Cells
High-throughput cell-data technologies such as single-cell RNA-seq create a demand for algorithms for automatic cell classification and characterization. There exist several cell classification ontologies with complementary information. However, one needs to merge them to synergistically combine the...
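Main Authors: | Cabau-Laporta, Javier; Ascensión, Alex M.; Arrospide-Elgarresta, Mikel; Gerovska, Daniela; Araúzo-Bravo, Marcos J. |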
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Cell and Developmental Biology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7905052/ https://www.ncbi.nlm.nih.gov/pubmed/33644039 http://dx.doi.org/10.3389/fcell.2021.562908 |
_version_ | 1783655042086076416 |
---|---|
author | Cabau-Laporta, Javier; Ascensión, Alex M.; Arrospide-Elgarresta, Mikel; Gerovska, Daniela; Araúzo-Bravo, Marcos J. |
author_facet | Cabau-Laporta, Javier; Ascensión, Alex M.; Arrospide-Elgarresta, Mikel; Gerovska, Daniela; Araúzo-Bravo, Marcos J. |
author_sort | Cabau-Laporta, Javier |
collection | PubMed |
description | High-throughput cell-data technologies such as single-cell RNA-seq create a demand for algorithms for automatic cell classification and characterization. There exist several cell classification ontologies with complementary information. However, one needs to merge them to synergistically combine their information. The main difficulty in merging is to match the ontologies since they use different naming conventions. Therefore, we developed an algorithm that merges ontologies by integrating the name matching between class label names with the structure mapping between the ontology elements based on graph convolution. Since the structure mapping is a time consuming process, we designed two methods to perform the graph convolution: vectorial structure matching and constraint-based structure matching. To perform the vectorial structure matching, we designed a general method to calculate the similarities between vectors of different lengths for different metrics. Additionally, we adapted the slower Blondel method to work for structure matching. We implemented our algorithms into FOntCell, a software module in Python for efficient automatic parallel-computed merging/fusion of ontologies in the same or similar knowledge domains. FOntCell can unify dispersed knowledge from one domain into a unique ontology in OWL format and iteratively reuse it to continuously adapt ontologies with new data endlessly produced by data-driven classification methods, such as of the Human Cell Atlas. To navigate easily across the merged ontologies, it generates HTML files with tabulated and graphic summaries, and interactive circular Directed Acyclic Graphs. We used FOntCell to merge the CELDA, LifeMap and LungMAP Human Anatomy cell ontologies into a comprehensive cell ontology. We compared FOntCell with tools used for the alignment of mouse and human anatomy ontologies task proposed by the Ontology Alignment Evaluation Initiative (OAEI) and found that the F(β) alignment accuracies of FOntCell are above the geometric mean of the other tools; more importantly, it outperforms significantly the best OAEI tools in cell ontology alignment in terms of F(β) alignment accuracies. |
format | Online Article Text |
id | pubmed-7905052 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-79050522021-02-26 FOntCell: Fusion of Ontologies of Cells Cabau-Laporta, Javier Ascensión, Alex M. Arrospide-Elgarresta, Mikel Gerovska, Daniela Araúzo-Bravo, Marcos J. Front Cell Dev Biol Cell and Developmental Biology High-throughput cell-data technologies such as single-cell RNA-seq create a demand for algorithms for automatic cell classification and characterization. There exist several cell classification ontologies with complementary information. However, one needs to merge them to synergistically combine their information. The main difficulty in merging is to match the ontologies since they use different naming conventions. Therefore, we developed an algorithm that merges ontologies by integrating the name matching between class label names with the structure mapping between the ontology elements based on graph convolution. Since the structure mapping is a time consuming process, we designed two methods to perform the graph convolution: vectorial structure matching and constraint-based structure matching. To perform the vectorial structure matching, we designed a general method to calculate the similarities between vectors of different lengths for different metrics. Additionally, we adapted the slower Blondel method to work for structure matching. We implemented our algorithms into FOntCell, a software module in Python for efficient automatic parallel-computed merging/fusion of ontologies in the same or similar knowledge domains. FOntCell can unify dispersed knowledge from one domain into a unique ontology in OWL format and iteratively reuse it to continuously adapt ontologies with new data endlessly produced by data-driven classification methods, such as of the Human Cell Atlas. To navigate easily across the merged ontologies, it generates HTML files with tabulated and graphic summaries, and interactive circular Directed Acyclic Graphs. We used FOntCell to merge the CELDA, LifeMap and LungMAP Human Anatomy cell ontologies into a comprehensive cell ontology. We compared FOntCell with tools used for the alignment of mouse and human anatomy ontologies task proposed by the Ontology Alignment Evaluation Initiative (OAEI) and found that the F(β) alignment accuracies of FOntCell are above the geometric mean of the other tools; more importantly, it outperforms significantly the best OAEI tools in cell ontology alignment in terms of F(β) alignment accuracies. Frontiers Media S.A. 2021-02-11 /pmc/articles/PMC7905052/ /pubmed/33644039 http://dx.doi.org/10.3389/fcell.2021.562908 Text en Copyright © 2021 Cabau-Laporta, Ascensión, Arrospide-Elgarresta, Gerovska and Araúzo-Bravo. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Cell and Developmental Biology Cabau-Laporta, Javier Ascensión, Alex M. Arrospide-Elgarresta, Mikel Gerovska, Daniela Araúzo-Bravo, Marcos J. FOntCell: Fusion of Ontologies of Cells |
title | FOntCell: Fusion of Ontologies of Cells |
title_full | FOntCell: Fusion of Ontologies of Cells |
title_fullStr | FOntCell: Fusion of Ontologies of Cells |
title_full_unstemmed | FOntCell: Fusion of Ontologies of Cells |
title_short | FOntCell: Fusion of Ontologies of Cells |
title_sort | fontcell: fusion of ontologies of cells |
topic | Cell and Developmental Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7905052/ https://www.ncbi.nlm.nih.gov/pubmed/33644039 http://dx.doi.org/10.3389/fcell.2021.562908 |
work_keys_str_mv | AT cabaulaportajavier fontcellfusionofontologiesofcells AT ascensionalexm fontcellfusionofontologiesofcells AT arrospideelgarrestamikel fontcellfusionofontologiesofcells AT gerovskadaniela fontcellfusionofontologiesofcells AT arauzobravomarcosj fontcellfusionofontologiesofcells |