
Interspecific comparison of gene expression profiles using machine learning


Bibliographic Details
Main authors: Kasianov, Artem S., Klepikova, Anna V., Mayorov, Alexey V., Buzanov, Gleb S., Logacheva, Maria D., Penin, Aleksey A.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879537/
https://www.ncbi.nlm.nih.gov/pubmed/36626392
http://dx.doi.org/10.1371/journal.pcbi.1010743
author Kasianov, Artem S.
Klepikova, Anna V.
Mayorov, Alexey V.
Buzanov, Gleb S.
Logacheva, Maria D.
Penin, Aleksey A.
collection PubMed
description Interspecific gene comparisons are a keystone of many areas of biological research and are especially important for translating knowledge from model organisms to economically important species. Currently, they are hampered by the low resolution of methods based on sequence analysis and by the complex evolutionary history of eukaryotic genes. This is especially critical for plants, whose genomes are shaped by multiple whole genome duplications and subsequent gene loss. New methods are therefore needed for comparing the functions of genes in different species. Here, we report ISEEML (Interspecific Similarity of Expression Evaluated using Machine Learning), a novel machine learning-based algorithm for interspecific gene classification. In contrast to previous studies focused on sequence similarity, our algorithm focuses on functional similarity inferred from the comparison of gene expression profiles. We propose a novel metric for expression pattern similarity, the expression score (ES), that is suitable for species with differing morphologies. As a proof of concept, we compare detailed transcriptome maps of Arabidopsis thaliana (the model species), Zea mays (maize), and Fagopyrum esculentum (common buckwheat), species that represent distant clades within flowering plants. The classifier achieved an AUC of 0.91; at an ES threshold of 0.5, the specificity was 94% and the sensitivity was 72%.
format Online
Article
Text
id pubmed-9879537
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9879537 2023-01-27 PLoS Comput Biol Research Article
Public Library of Science 2023-01-10 /pmc/articles/PMC9879537/ /pubmed/36626392 http://dx.doi.org/10.1371/journal.pcbi.1010743 Text en © 2023 Kasianov et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Interspecific comparison of gene expression profiles using machine learning
topic Research Article
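
The description in this record reports an AUC together with specificity and sensitivity at an ES threshold of 0.5. As a reading aid, the following is a minimal sketch of how such figures are conventionally computed from per-pair scores and true labels. It is not the ISEEML implementation: the function names, the toy scores and labels, and the choice to call a pair positive when its score is greater than or equal to the threshold are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity), calling score >= threshold positive.

    The >= convention is an assumption; the record does not state it.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the paper.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.5)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
    print(f"AUC={auc(toy_scores, toy_labels):.4f}")
```

Raising the threshold generally trades sensitivity for specificity, whereas the AUC summarizes ranking quality across all thresholds, which is presumably why the description quotes both.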