Explainable Deep Learning for Augmentation of Small RNA Expression Profiles

Bibliographic Details
Main authors: Fiosina, Jelena, Fiosins, Maksims, Bonn, Stefan
Format: Online Article Text
Language: English
Published: Mary Ann Liebert, Inc., publishers 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7047095/
https://www.ncbi.nlm.nih.gov/pubmed/31855058
http://dx.doi.org/10.1089/cmb.2019.0320
_version_ 1783502075280228352
author Fiosina, Jelena
Fiosins, Maksims
Bonn, Stefan
author_facet Fiosina, Jelena
Fiosins, Maksims
Bonn, Stefan
author_sort Fiosina, Jelena
collection PubMed
description The lack of well-structured metadata annotations complicates the reusability and interpretation of the growing amount of publicly available RNA expression data. The machine learning-based prediction of metadata (data augmentation) can considerably improve the quality of expression data annotation. In this study, we systematically benchmark deep learning (DL) and random forest (RF)-based metadata augmentation of tissue, age, and sex using small RNA (sRNA) expression profiles. We use 4243 annotated sRNA-Seq samples from the sRNA expression atlas database to train and test the augmentation performance. In general, the DL machine learner outperforms the RF method in almost all tested cases. The average cross-validated prediction accuracy of the DL algorithm for tissues is 96.5%, for sex is 77%, and for age is 77.2%. The average tissue prediction accuracy for a completely new data set is 83.1% (DL) and 80.8% (RF). To understand which sRNAs influence DL predictions, we employ backpropagation-based feature importance scores using the DeepLIFT method, which enable us to obtain information on biological relevance of sRNAs.
format Online
Article
Text
id pubmed-7047095
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Mary Ann Liebert, Inc., publishers
record_format MEDLINE/PubMed
spelling pubmed-70470952020-02-28 Explainable Deep Learning for Augmentation of Small RNA Expression Profiles Fiosina, Jelena Fiosins, Maksims Bonn, Stefan J Comput Biol Conference Papers The lack of well-structured metadata annotations complicates the reusability and interpretation of the growing amount of publicly available RNA expression data. The machine learning-based prediction of metadata (data augmentation) can considerably improve the quality of expression data annotation. In this study, we systematically benchmark deep learning (DL) and random forest (RF)-based metadata augmentation of tissue, age, and sex using small RNA (sRNA) expression profiles. We use 4243 annotated sRNA-Seq samples from the sRNA expression atlas database to train and test the augmentation performance. In general, the DL machine learner outperforms the RF method in almost all tested cases. The average cross-validated prediction accuracy of the DL algorithm for tissues is 96.5%, for sex is 77%, and for age is 77.2%. The average tissue prediction accuracy for a completely new data set is 83.1% (DL) and 80.8% (RF). To understand which sRNAs influence DL predictions, we employ backpropagation-based feature importance scores using the DeepLIFT method, which enable us to obtain information on biological relevance of sRNAs. Mary Ann Liebert, Inc., publishers 2020-02-01 2020-02-06 /pmc/articles/PMC7047095/ /pubmed/31855058 http://dx.doi.org/10.1089/cmb.2019.0320 Text en © Jelena Fiosina, et al., 2020. Published by Mary Ann Liebert, Inc. This Open Access article is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
spellingShingle Conference Papers
Fiosina, Jelena
Fiosins, Maksims
Bonn, Stefan
Explainable Deep Learning for Augmentation of Small RNA Expression Profiles
title Explainable Deep Learning for Augmentation of Small RNA Expression Profiles
title_full Explainable Deep Learning for Augmentation of Small RNA Expression Profiles
title_fullStr Explainable Deep Learning for Augmentation of Small RNA Expression Profiles
title_full_unstemmed Explainable Deep Learning for Augmentation of Small RNA Expression Profiles
title_short Explainable Deep Learning for Augmentation of Small RNA Expression Profiles
title_sort explainable deep learning for augmentation of small rna expression profiles
topic Conference Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7047095/
https://www.ncbi.nlm.nih.gov/pubmed/31855058
http://dx.doi.org/10.1089/cmb.2019.0320
work_keys_str_mv AT fiosinajelena explainabledeeplearningforaugmentationofsmallrnaexpressionprofiles
AT fiosinsmaksims explainabledeeplearningforaugmentationofsmallrnaexpressionprofiles
AT bonnstefan explainabledeeplearningforaugmentationofsmallrnaexpressionprofiles