
AIDE: annotation-assisted isoform discovery with high precision

Genome-wide accurate identification and quantification of full-length mRNA isoforms is crucial for investigating transcriptional and posttranscriptional regulatory mechanisms of biological phenomena. Despite continuing efforts in developing effective computational tools to identify or assemble full-...

Bibliographic Details
Main Authors: Li, Wei Vivian, Li, Shan, Tong, Xin, Deng, Ling, Shi, Hubing, Li, Jingyi Jessica
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6886511/
https://www.ncbi.nlm.nih.gov/pubmed/31694868
http://dx.doi.org/10.1101/gr.251108.119
author Li, Wei Vivian
Li, Shan
Tong, Xin
Deng, Ling
Shi, Hubing
Li, Jingyi Jessica
collection PubMed
description Genome-wide accurate identification and quantification of full-length mRNA isoforms is crucial for investigating transcriptional and posttranscriptional regulatory mechanisms of biological phenomena. Despite continuing efforts in developing effective computational tools to identify or assemble full-length mRNA isoforms from second-generation RNA-seq data, it remains a challenge to accurately identify mRNA isoforms from short sequence reads owing to the substantial information loss in RNA-seq experiments. Here, we introduce a novel statistical method, annotation-assisted isoform discovery (AIDE), the first approach that directly controls false isoform discoveries by implementing the testing-based model selection principle. Solving the isoform discovery problem in a stepwise and conservative manner, AIDE prioritizes the annotated isoforms and precisely identifies novel isoforms whose addition significantly improves the explanation of observed RNA-seq reads. We evaluate the performance of AIDE based on multiple simulated and real RNA-seq data sets followed by PCR-Sanger sequencing validation. Our results show that AIDE effectively leverages the annotation information to compensate for the information loss owing to short read lengths. AIDE achieves the highest precision in isoform discovery and the lowest error rates in isoform abundance estimation, compared with three state-of-the-art methods: Cufflinks, SLIDE, and StringTie. As a robust bioinformatics tool for transcriptome analysis, AIDE enables researchers to discover novel transcripts with high confidence.
format Online
Article
Text
id pubmed-6886511
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-6886511 2020-06-01 AIDE: annotation-assisted isoform discovery with high precision Li, Wei Vivian Li, Shan Tong, Xin Deng, Ling Shi, Hubing Li, Jingyi Jessica Genome Res Method
Cold Spring Harbor Laboratory Press 2019-12 /pmc/articles/PMC6886511/ /pubmed/31694868 http://dx.doi.org/10.1101/gr.251108.119 Text en © 2019 Li et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title AIDE: annotation-assisted isoform discovery with high precision
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6886511/
https://www.ncbi.nlm.nih.gov/pubmed/31694868
http://dx.doi.org/10.1101/gr.251108.119