
ABEILLE: a novel method for ABerrant Expression Identification empLoying machine LEarning from RNA-sequencing data

Bibliographic Details
Main authors: Labory, Justine, Le Bideau, Gwendal, Pratella, David, Yao, Jean-Elisée, Ait-El-Mkadem Saadi, Samira, Bannwarth, Sylvie, El-Hami, Loubna, Paquis-Fluckinger, Véronique, Bottini, Silvia
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9563686/
https://www.ncbi.nlm.nih.gov/pubmed/36063052
http://dx.doi.org/10.1093/bioinformatics/btac603
_version_ 1784808463253110784
author Labory, Justine
Le Bideau, Gwendal
Pratella, David
Yao, Jean-Elisée
Ait-El-Mkadem Saadi, Samira
Bannwarth, Sylvie
El-Hami, Loubna
Paquis-Fluckinger, Véronique
Bottini, Silvia
author_sort Labory, Justine
collection PubMed
description MOTIVATION: Advances in omics technologies are paving the way for the diagnosis of rare diseases by offering a complementary assay to identify the responsible gene. Using transcriptomic data to identify aberrant gene expression (AGE) has been shown to reveal potentially pathogenic events. However, popular approaches to AGE identification are limited in two ways: they rely on statistical tests that require an arbitrary significance cut-off, and they need several replicates, which are not always available in clinical contexts. RESULTS: We therefore developed ABerrant Expression Identification empLoying machine LEarning from sequencing data (ABEILLE), a variational autoencoder (VAE)-based method that identifies AGEs from RNA-seq data without the need for replicates or a control group. ABEILLE combines a VAE, which can model data without specific assumptions about their distribution, with a decision tree that classifies genes as AGE or non-AGE. An anomaly score is assigned to each gene so that AGEs can be stratified by the severity of the aberration. We tested ABEILLE on a semi-synthetic dataset and an experimental dataset, demonstrating the importance of a flexible VAE configuration for identifying potentially pathogenic candidates. AVAILABILITY AND IMPLEMENTATION: ABEILLE source code is freely available at: https://github.com/UCA-MSI/ABEILLE. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
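The general idea in the description, scoring each gene by how far its observed expression departs from a model's reconstruction, can be illustrated with a minimal sketch. This is not the ABEILLE implementation: the VAE is replaced here by a much simpler robust additive baseline (gene median plus sample shift), the decision-tree classification step is not reproduced, and the function name `anomaly_scores`, the synthetic count matrix, and the injected outlier are all hypothetical.

```python
import numpy as np

def anomaly_scores(counts):
    """Score every gene-sample entry by its deviation from a simple expected value.

    counts: (genes x samples) array of non-negative expression counts.
    Stand-in model (an assumption, NOT a VAE): expected log-expression is
    the gene's median level plus a per-sample median shift.
    """
    x = np.log2(np.asarray(counts, dtype=float) + 1.0)        # log-transform counts
    gene_level = np.median(x, axis=1, keepdims=True)           # typical level of each gene
    sample_shift = np.median(x - gene_level, axis=0, keepdims=True)  # per-sample offset
    residual = x - (gene_level + sample_shift)                 # "reconstruction" error
    # Robust per-gene spread (median absolute deviation) so genes are comparable.
    centre = np.median(residual, axis=1, keepdims=True)
    mad = np.median(np.abs(residual - centre), axis=1, keepdims=True)
    return np.abs(residual) / (1.4826 * mad + 1e-8)

# Synthetic data: 200 genes x 30 samples, with one injected aberrant entry.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=100, size=(200, 30)).astype(float)
counts[17, 4] *= 50                                            # hypothetical outlier
scores = anomaly_scores(counts)
gene, sample = np.unravel_index(np.argmax(scores), scores.shape)
print(f"highest anomaly score: gene {gene}, sample {sample}")
```

Ranking entries by such a score gives a severity ordering without replicates or a control group. In the method described above, a learned nonlinear model and a decision tree take the place of the fixed baseline and of any manual cut-off.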
format Online
Article
Text
id pubmed-9563686
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9563686 2022-10-18 Bioinformatics Original Papers Oxford University Press 2022-09-05 /pmc/articles/PMC9563686/ /pubmed/36063052 http://dx.doi.org/10.1093/bioinformatics/btac603 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title ABEILLE: a novel method for ABerrant Expression Identification empLoying machine LEarning from RNA-sequencing data
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9563686/
https://www.ncbi.nlm.nih.gov/pubmed/36063052
http://dx.doi.org/10.1093/bioinformatics/btac603