
Sequoia: an interactive visual analytics platform for interpretation and feature extraction from nanopore sequencing datasets

BACKGROUND: Direct-sequencing technologies, such as Oxford Nanopore’s, are delivering long RNA reads with great efficacy and convenience. These technologies afford an ability to detect post-transcriptional modifications at a single-molecule resolution, promising new insights into the functional role...


Bibliographic Details
Main Authors: Koonchanok, Ratanond, Daulatabad, Swapna Vidhur, Mir, Quoseena, Reda, Khairi, Janga, Sarath Chandra
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8262049/
https://www.ncbi.nlm.nih.gov/pubmed/34233619
http://dx.doi.org/10.1186/s12864-021-07791-z
author Koonchanok, Ratanond
Daulatabad, Swapna Vidhur
Mir, Quoseena
Reda, Khairi
Janga, Sarath Chandra
author_sort Koonchanok, Ratanond
collection PubMed
description BACKGROUND: Direct-sequencing technologies, such as Oxford Nanopore’s, are delivering long RNA reads with great efficacy and convenience. These technologies afford an ability to detect post-transcriptional modifications at a single-molecule resolution, promising new insights into the functional roles of RNA. However, realizing this potential requires new tools to analyze and explore this type of data. RESULT: Here, we present Sequoia, a visual analytics tool that allows users to interactively explore nanopore sequences. Sequoia combines a Python-based backend with a multi-view visualization interface, enabling users to import raw nanopore sequencing data in Fast5 format, cluster sequences based on electric-current similarities, and drill down into signals to identify properties of interest. We demonstrate the application of Sequoia by generating and analyzing ~500k reads from direct RNA sequencing data of the human HeLa cell line. We focus on comparing signal features from m6A and m5C RNA modifications as the first step towards building automated classifiers. We show how, through iterative visual exploration and tuning of dimensionality reduction parameters, we can separate modified RNA sequences from their unmodified counterparts. We also document new, qualitative signal signatures that distinguish these modifications from otherwise normal RNA bases, which we were able to discover from the visualization. CONCLUSIONS: Sequoia’s interactive features complement existing computational approaches in nanopore-based RNA workflows. The insights gleaned through visual analysis should help users in developing rationales, hypotheses, and insights into the dynamic nature of RNA. Sequoia is available at https://github.com/dnonatar/Sequoia. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07791-z.
format Online
Article
Text
id pubmed-8262049
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8262049 2021-07-08 Sequoia: an interactive visual analytics platform for interpretation and feature extraction from nanopore sequencing datasets Koonchanok, Ratanond Daulatabad, Swapna Vidhur Mir, Quoseena Reda, Khairi Janga, Sarath Chandra BMC Genomics Software BioMed Central 2021-07-07 /pmc/articles/PMC8262049/ /pubmed/34233619 http://dx.doi.org/10.1186/s12864-021-07791-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Sequoia: an interactive visual analytics platform for interpretation and feature extraction from nanopore sequencing datasets
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8262049/
https://www.ncbi.nlm.nih.gov/pubmed/34233619
http://dx.doi.org/10.1186/s12864-021-07791-z