
Classifying diseases by using biological features to identify potential nosological models

Established nosological models have so far provided physicians with an adequate classification of diseases. Such systems are important for identifying diseases correctly and treating them successfully. However, these taxonomies tend to be based on phenotypical observations, lacking a molecular or biologi...


Bibliographic Details
Main authors: Prieto Santamaría, Lucía, García del Valle, Eduardo P., Zanin, Massimiliano, Hernández Chan, Gandhi Samuel, Pérez Gallardo, Yuliana, Rodríguez-González, Alejandro
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8548311/
https://www.ncbi.nlm.nih.gov/pubmed/34702888
http://dx.doi.org/10.1038/s41598-021-00554-6
_version_ 1784590545708908544
author Prieto Santamaría, Lucía
García del Valle, Eduardo P.
Zanin, Massimiliano
Hernández Chan, Gandhi Samuel
Pérez Gallardo, Yuliana
Rodríguez-González, Alejandro
collection PubMed
description Established nosological models have so far provided physicians with an adequate classification of diseases. Such systems are important for identifying diseases correctly and treating them successfully. However, these taxonomies tend to be based on phenotypical observations and lack a molecular or biological foundation. There is therefore an urgent need to modernize them so that they include the heterogeneous information now being produced, such as genomic, proteomic, transcriptomic and metabolic data, leading to more comprehensive and robust structures. For that purpose, we have developed an extensive methodology to analyse the possibilities for generating new nosological models from biological features. Different datasets of diseases have been considered, and distinct features related to diseases, namely genes, proteins, metabolic pathways and genetic variants, have been represented as binary and numerical vectors. From those vectors, distances between diseases have been computed using several metrics. Clustering algorithms have been implemented to group diseases, generating different models, each corresponding to a distinct combination of the previous parameters. The models have been evaluated by means of intrinsic metrics, showing that some of them are highly suitable as a basis for new nosologies. One of the clustering configurations has been analysed in depth, demonstrating its quality and validity in the research context, and further biological interpretations have been made. This model was generated by the OPTICS clustering algorithm, using distances between diseases based on gene sharedness and computed with the cosine index. The model contains 729 clusters and obtained a Silhouette coefficient of 0.43.
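The distance and evaluation steps named in the description can be illustrated with a minimal sketch. It assumes binary gene vectors, a cosine distance defined as one minus cosine similarity, and the standard mean silhouette coefficient. The disease names, gene names and cluster labels are hypothetical, and the labels are assigned by hand: this sketch does not implement OPTICS and is not the authors' code.

```python
from math import sqrt

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: a disease with no genes is maximally distant
    return 1.0 - dot / (norm_a * norm_b)

def silhouette(dist, labels):
    """Mean silhouette coefficient from a full distance matrix.

    Requires at least two distinct labels; singleton clusters score 0.
    """
    n = len(labels)
    clusters = set(labels)
    scores = []
    for i in range(n):
        same = [dist[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        if not same:
            scores.append(0.0)
            continue
        a = sum(same) / len(same)  # mean distance within the own cluster
        b = min(                   # mean distance to the nearest other cluster
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / n

# Hypothetical toy data: four diseases described by the genes linked to them.
genes = ["G1", "G2", "G3", "G4"]
diseases = {
    "D1": {"G1", "G2"},
    "D2": {"G1", "G2", "G3"},
    "D3": {"G3", "G4"},
    "D4": {"G3", "G4"},
}

# Binary vectors: 1 if the gene is associated with the disease, else 0.
vectors = [[1 if g in linked else 0 for g in genes] for linked in diseases.values()]
dist = [[cosine_distance(u, v) for v in vectors] for u in vectors]

# Hand-assigned labels standing in for the output of a clustering algorithm.
labels = [0, 0, 1, 1]
score = silhouette(dist, labels)
print(f"silhouette = {score:.3f}")
```

In a real pipeline the hand-assigned labels would come from a clustering algorithm run on the precomputed distance matrix, and points that a density-based method marks as noise would need separate handling before scoring.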
format Online
Article
Text
id pubmed-8548311
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8548311 2021-10-27 Classifying diseases by using biological features to identify potential nosological models Prieto Santamaría, Lucía García del Valle, Eduardo P. Zanin, Massimiliano Hernández Chan, Gandhi Samuel Pérez Gallardo, Yuliana Rodríguez-González, Alejandro Sci Rep Article Nature Publishing Group UK 2021-10-26 /pmc/articles/PMC8548311/ /pubmed/34702888 http://dx.doi.org/10.1038/s41598-021-00554-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Classifying diseases by using biological features to identify potential nosological models
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8548311/
https://www.ncbi.nlm.nih.gov/pubmed/34702888
http://dx.doi.org/10.1038/s41598-021-00554-6