Combining Multiple RNA-Seq Data Analysis Algorithms Using Machine Learning Improves Differential Isoform Expression Analysis

RNA sequencing has become the standard technique for high resolution genome-wide monitoring of gene expression. As such, it often comprises the first step towards understanding complex molecular mechanisms driving various phenotypes, spanning organ development to disease genesis, monitoring and prog...

Bibliographic Details
Main Authors: Dimopoulos, Alexandros C., Koukoutegos, Konstantinos, Psomopoulos, Fotis E., Moulos, Panagiotis
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8544431/
https://www.ncbi.nlm.nih.gov/pubmed/34698224
http://dx.doi.org/10.3390/mps4040068
author Dimopoulos, Alexandros C.
Koukoutegos, Konstantinos
Psomopoulos, Fotis E.
Moulos, Panagiotis
collection PubMed
description RNA sequencing has become the standard technique for high resolution genome-wide monitoring of gene expression. As such, it often comprises the first step towards understanding complex molecular mechanisms driving various phenotypes, spanning organ development to disease genesis, monitoring and progression. An advantage of RNA sequencing is its ability to capture complex transcriptomic events such as alternative splicing which results in alternate isoform abundance. At the same time, this advantage remains algorithmically and computationally challenging, especially with the emergence of even higher resolution technologies such as single-cell RNA sequencing. Although several algorithms have been proposed for the effective detection of differential isoform expression from RNA-Seq data, no widely accepted golden standards have been established. This fact is further compounded by the significant differences in the output of different algorithms when applied on the same data. In addition, many of the proposed algorithms remain scarce and poorly maintained. Driven by these challenges, we developed a novel integrative approach that effectively combines the most widely used algorithms for differential transcript and isoform analysis using state-of-the-art machine learning techniques. We demonstrate its usability by applying it on simulated data based on several organisms, and using several performance metrics; we conclude that our strategy outperforms the application of the individual algorithms. Finally, our approach is implemented as an R Shiny application, with the underlying data analysis pipelines also available as docker containers.
format Online
Article
Text
id pubmed-8544431
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8544431 2021-10-26 Combining Multiple RNA-Seq Data Analysis Algorithms Using Machine Learning Improves Differential Isoform Expression Analysis Dimopoulos, Alexandros C. Koukoutegos, Konstantinos Psomopoulos, Fotis E. Moulos, Panagiotis Methods Protoc Protocol
MDPI 2021-09-27 /pmc/articles/PMC8544431/ /pubmed/34698224 http://dx.doi.org/10.3390/mps4040068 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Combining Multiple RNA-Seq Data Analysis Algorithms Using Machine Learning Improves Differential Isoform Expression Analysis
topic Protocol