Identifying similar transcripts in a related organism from de Bruijn graphs of RNA-Seq data, with applications to the study of salt and waterlogging tolerance in Melilotus

BACKGROUND: A popular strategy to study alternative splicing in non-model organisms starts from sequencing the entire transcriptome, then assembling the reads by using de novo transcriptome assembly algorithms to obtain predicted transcripts. A similarity search algorithm is then applied to a relate...

Bibliographic Details
Main authors: Fu, Shuhua; Chang, Peter L.; Friesen, Maren L.; Teakle, Natasha L.; Tarone, Aaron M.; Sze, Sing-Hoi
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6551239/
https://www.ncbi.nlm.nih.gov/pubmed/31167652
http://dx.doi.org/10.1186/s12864-019-5702-5
_version_ 1783424361960570880
author Fu, Shuhua
Chang, Peter L.
Friesen, Maren L.
Teakle, Natasha L.
Tarone, Aaron M.
Sze, Sing-Hoi
collection PubMed
description BACKGROUND: A popular strategy to study alternative splicing in non-model organisms starts from sequencing the entire transcriptome, then assembling the reads by using de novo transcriptome assembly algorithms to obtain predicted transcripts. A similarity search algorithm is then applied to a related organism to infer possible function of these predicted transcripts. While some of these predictions may be inaccurate and transcripts with low coverage are often missed, we observe that it is possible to obtain a more complete set of transcripts to facilitate possible functional assignments by starting the search from the intermediate de Bruijn graph that contains all branching possibilities. RESULTS: We develop an algorithm to extract similar transcripts in a related organism by starting the search from the de Bruijn graph that represents the transcriptome instead of from predicted transcripts. We show that our algorithm is able to recover more similar transcripts than existing algorithms, with large improvements in obtaining longer transcripts and a finer resolution of isoforms. We apply our algorithm to study salt and waterlogging tolerance in two Melilotus species by constructing new RNA-Seq libraries. CONCLUSIONS: We have developed an algorithm to identify paths in the de Bruijn graph that correspond to similar transcripts in a related organism directly. Our strategy bypasses the transcript prediction step in RNA-Seq data and makes use of support from evolutionary information. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5702-5) contains supplementary material, which is available to authorized users.
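The abstract's central observation, that the intermediate de Bruijn graph still contains every branching possibility before any transcript is predicted, can be illustrated with a toy example. The sketch below is not the authors' algorithm: the function names, the sequences, and the choice of k are all hypothetical, and the similarity search against a related organism is left out entirely. It only shows that when reads from two isoforms (one containing a middle exon, one skipping it) are pooled, the graph keeps a branch node from which both isoforms can be spelled out.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_paths(graph, start, end, max_nodes=50):
    """Spell the sequence of every simple path from start to end (depth-first)."""
    results = []
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            # First node in full, then the last base of each later node.
            results.append(path[0] + "".join(n[-1] for n in path[1:]))
            continue
        if len(path) >= max_nodes:
            continue  # Guard against very long walks in larger graphs.
        for nxt in sorted(graph.get(node, ())):
            if nxt not in path:  # Simple paths only: never revisit a node.
                stack.append((nxt, path + [nxt]))
    return sorted(results)


# Hypothetical toy isoforms: iso_b skips the middle segment "GGATT".
iso_a = "ACGTC" + "GGATT" + "CAGAG"
iso_b = "ACGTC" + "CAGAG"

# Simulated error-free reads: every length-8 window of each isoform.
reads = [iso[i:i + 8] for iso in (iso_a, iso_b) for i in range(len(iso) - 7)]

graph = build_de_bruijn(reads, k=4)
paths = enumerate_paths(graph, start=iso_a[:3], end=iso_a[-3:])

print(sorted(graph["GTC"]))  # successors of the branch node
print(paths)                 # sequences spelled by the two paths
```

In this toy graph the node `GTC` has two successors, one per isoform, and the two branches rejoin at `CAG`. An approach that searches the graph directly, as the abstract describes, can weigh both branches against a related organism's transcripts, whereas an assembler that reports only one predicted transcript would lose the other.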
format Online
Article
Text
id pubmed-6551239
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6551239 2019-06-07 Identifying similar transcripts in a related organism from de Bruijn graphs of RNA-Seq data, with applications to the study of salt and waterlogging tolerance in Melilotus Fu, Shuhua Chang, Peter L. Friesen, Maren L. Teakle, Natasha L. Tarone, Aaron M. Sze, Sing-Hoi BMC Genomics Research
BioMed Central 2019-06-06 /pmc/articles/PMC6551239/ /pubmed/31167652 http://dx.doi.org/10.1186/s12864-019-5702-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Identifying similar transcripts in a related organism from de Bruijn graphs of RNA-Seq data, with applications to the study of salt and waterlogging tolerance in Melilotus
topic Research