
High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing

Accurate annotation of genes and their transcripts is a foundation of genomics, but no annotation technique presently combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, while thousands more remain uncatalogued, particularly...


Bibliographic Details
Main authors: Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Carbonell, Silvia; Pérez-Lluch, Sílvia; Abad, Amaya; Davis, Carrie; Gingeras, Thomas R.; Frankish, Adam; Harrow, Jennifer; Guigo, Roderic; Johnson, Rory
Format: Online Article Text
Language: English
Published: 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5709232/
https://www.ncbi.nlm.nih.gov/pubmed/29106417
http://dx.doi.org/10.1038/ng.3988
_version_ 1783282738213683200
author Lagarde, Julien
Uszczynska-Ratajczak, Barbara
Carbonell, Silvia
Pérez-Lluch, Sílvia
Abad, Amaya
Davis, Carrie
Gingeras, Thomas R.
Frankish, Adam
Harrow, Jennifer
Guigo, Roderic
Johnson, Rory
author_facet Lagarde, Julien
Uszczynska-Ratajczak, Barbara
Carbonell, Silvia
Pérez-Lluch, Sílvia
Abad, Amaya
Davis, Carrie
Gingeras, Thomas R.
Frankish, Adam
Harrow, Jennifer
Guigo, Roderic
Johnson, Rory
author_sort Lagarde, Julien
collection PubMed
description Accurate annotation of genes and their transcripts is a foundation of genomics, but no annotation technique presently combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, while thousands more remain uncatalogued, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), combining targeted RNA capture with third-generation long-read sequencing. We present an experimental re-annotation of the GENCODE intergenic lncRNA population in matched human and mouse tissues, resulting in novel transcript models for 3574 / 561 gene loci, respectively. CLS approximately doubles the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enable us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a longstanding bottleneck of transcriptome annotation, generating manual-quality full-length transcript models at high-throughput scales.
format Online
Article
Text
id pubmed-5709232
institution National Center for Biotechnology Information
language English
publishDate 2017
record_format MEDLINE/PubMed
spelling pubmed-5709232 2018-05-06 High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing Lagarde, Julien Uszczynska-Ratajczak, Barbara Carbonell, Silvia Pérez-Lluch, Sílvia Abad, Amaya Davis, Carrie Gingeras, Thomas R. Frankish, Adam Harrow, Jennifer Guigo, Roderic Johnson, Rory Nat Genet Article Accurate annotation of genes and their transcripts is a foundation of genomics, but no annotation technique presently combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, while thousands more remain uncatalogued, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), combining targeted RNA capture with third-generation long-read sequencing. We present an experimental re-annotation of the GENCODE intergenic lncRNA population in matched human and mouse tissues, resulting in novel transcript models for 3574 / 561 gene loci, respectively. CLS approximately doubles the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enable us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a longstanding bottleneck of transcriptome annotation, generating manual-quality full-length transcript models at high-throughput scales. 2017-11-06 2017-12 /pmc/articles/PMC5709232/ /pubmed/29106417 http://dx.doi.org/10.1038/ng.3988 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
spellingShingle Article
Lagarde, Julien
Uszczynska-Ratajczak, Barbara
Carbonell, Silvia
Pérez-Lluch, Sílvia
Abad, Amaya
Davis, Carrie
Gingeras, Thomas R.
Frankish, Adam
Harrow, Jennifer
Guigo, Roderic
Johnson, Rory
High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing
title High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing
title_full High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing
title_fullStr High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing
title_full_unstemmed High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing
title_short High-throughput annotation of full-length long noncoding RNAs with Capture Long-Read Sequencing
title_sort high-throughput annotation of full-length long noncoding rnas with capture long-read sequencing
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5709232/
https://www.ncbi.nlm.nih.gov/pubmed/29106417
http://dx.doi.org/10.1038/ng.3988
work_keys_str_mv AT lagardejulien highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT uszczynskaratajczakbarbara highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT carbonellsilvia highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT perezlluchsilvia highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT abadamaya highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT daviscarrie highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT gingerasthomasr highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT frankishadam highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT harrowjennifer highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT guigoroderic highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing
AT johnsonrory highthroughputannotationoffulllengthlongnoncodingrnaswithcapturelongreadsequencing