DIscBIO: A User-Friendly Pipeline for Biomarker Discovery in Single-Cell Transcriptomics

The growing attention toward the benefits of single-cell RNA sequencing (scRNA-seq) is leading to a myriad of computational packages for the analysis of different aspects of scRNA-seq data. For researchers without advanced programming skills, it is very challenging to combine several packages in orde...

Bibliographic Details
Main authors: Ghannoum, Salim; Leoncio Netto, Waldir; Fantini, Damiano; Ragan-Kelley, Benjamin; Parizadeh, Amirabbas; Jonasson, Emma; Ståhlberg, Anders; Farhan, Hesso; Köhn-Luque, Alvaro
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7866810/
https://www.ncbi.nlm.nih.gov/pubmed/33573289
http://dx.doi.org/10.3390/ijms22031399
author Ghannoum, Salim
Leoncio Netto, Waldir
Fantini, Damiano
Ragan-Kelley, Benjamin
Parizadeh, Amirabbas
Jonasson, Emma
Ståhlberg, Anders
Farhan, Hesso
Köhn-Luque, Alvaro
collection PubMed
description The growing attention toward the benefits of single-cell RNA sequencing (scRNA-seq) is leading to a myriad of computational packages for the analysis of different aspects of scRNA-seq data. For researchers without advanced programming skills, it is very challenging to combine several packages in order to perform the desired analysis in a simple and reproducible way. Here we present DIscBIO, an open-source, multi-algorithmic pipeline for easy, efficient and reproducible analysis of cellular sub-populations at the transcriptomic level. The pipeline integrates multiple scRNA-seq packages. Starting from single-cell sequencing read counts, it performs clustering and differential analysis and then allows biomarker discovery with decision trees and gene enrichment analysis in a network context. DIscBIO is freely available as an R package. It can be run either in command-line mode or through a user-friendly computational pipeline using Jupyter notebooks. We showcase all pipeline features using two scRNA-seq datasets. The first dataset consists of circulating tumor cells from patients with breast cancer. The second is a cell cycle regulation dataset in myxoid liposarcoma. All analyses are available as notebooks that integrate R code with explanatory text, output data and images in a sequential narrative. R users can use the notebooks to understand the different steps of the pipeline and as a guide for exploring their own scRNA-seq data. We also provide a cloud version using Binder that allows the pipeline to be executed without downloading R, Jupyter or any of the packages used by the pipeline. The cloud version can serve as a tutorial for training purposes, especially for those who are not R users or have limited programming skills. However, in order to do meaningful scRNA-seq analyses, all users will need to understand the implemented methods and their possible options and limitations.
format Online
Article
Text
id pubmed-7866810
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7866810 2021-02-07 Int J Mol Sci Article MDPI 2021-01-30 /pmc/articles/PMC7866810/ /pubmed/33573289 http://dx.doi.org/10.3390/ijms22031399 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title DIscBIO: A User-Friendly Pipeline for Biomarker Discovery in Single-Cell Transcriptomics
topic Article