
Hybrid sequencing and map finding (HySeMaFi): optional strategies for extensively deciphering gene splicing and expression in organisms without reference genome

Using second-generation sequencing (SGS) RNA-Seq strategies, extensive alternative splicing prediction is impractical and high variability in isoform expression quantification is inevitable in organisms without a true reference dataset. We report the development of a novel analysis method, termed hybr...


Bibliographic Details
Main authors: Ning, Guogui, Cheng, Xu, Luo, Ping, Liang, Fan, Wang, Zhen, Yu, Guoliang, Li, Xin, Wang, Depeng, Bao, Manzhu
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5341560/
https://www.ncbi.nlm.nih.gov/pubmed/28272530
http://dx.doi.org/10.1038/srep43793
_version_ 1782513004583059456
author Ning, Guogui
Cheng, Xu
Luo, Ping
Liang, Fan
Wang, Zhen
Yu, Guoliang
Li, Xin
Wang, Depeng
Bao, Manzhu
author_sort Ning, Guogui
collection PubMed
description Using second-generation sequencing (SGS) RNA-Seq strategies, extensive alternative splicing prediction is impractical and high variability in isoform expression quantification is inevitable in organisms without a true reference dataset. We report the development of a novel analysis method, termed hybrid sequencing and map finding (HySeMaFi), which combines the specific strengths of third-generation sequencing (TGS) (PacBio SMRT sequencing) and SGS (Illumina Hi-Seq/MiSeq sequencing) to effectively decipher gene splicing and to reliably estimate isoform abundance. Error-corrected long reads from TGS are capable of capturing full-length transcripts or large partial transcript fragments. Both true and false isoforms of a particular gene, as well as those containing all possible exons, can be generated by employing different assembly methods in SGS. We first developed an effective method that establishes the mapping relationship between the error-corrected long reads and the longest assembled contig for each corresponding gene. From the mapping data, the true splicing pattern of the genes was reliably detected, and the isoforms were effectively quantified. HySeMaFi is also the optimal strategy for deciphering the full exon expression of a specific gene when the longest mapped contigs are chosen as the reference set.
format Online
Article
Text
id pubmed-5341560
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5341560 2017-03-10 Sci Rep Article Nature Publishing Group 2017-03-08 /pmc/articles/PMC5341560/ /pubmed/28272530 http://dx.doi.org/10.1038/srep43793 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
title Hybrid sequencing and map finding (HySeMaFi): optional strategies for extensively deciphering gene splicing and expression in organisms without reference genome
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5341560/
https://www.ncbi.nlm.nih.gov/pubmed/28272530
http://dx.doi.org/10.1038/srep43793