
Prediction of transcript isoforms in 19 chicken tissues by Oxford Nanopore long-read sequencing

To identify and annotate transcript isoforms in the chicken genome, we generated Nanopore long-read sequencing data from 68 samples that encompassed 19 diverse tissues collected from experimental adult male and female White Leghorn chickens. More than 23.8 million reads with mean read length of 790...


Bibliographic Details
Main authors: Guan, Dailu; Halstead, Michelle M.; Islas-Trejo, Alma D.; Goszczynski, Daniel E.; Cheng, Hans H.; Ross, Pablo J.; Zhou, Huaijun
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9561881/
https://www.ncbi.nlm.nih.gov/pubmed/36246588
http://dx.doi.org/10.3389/fgene.2022.997460
author Guan, Dailu
Halstead, Michelle M.
Islas-Trejo, Alma D.
Goszczynski, Daniel E.
Cheng, Hans H.
Ross, Pablo J.
Zhou, Huaijun
collection PubMed
description To identify and annotate transcript isoforms in the chicken genome, we generated Nanopore long-read sequencing data from 68 samples that encompassed 19 diverse tissues collected from experimental adult male and female White Leghorn chickens. More than 23.8 million reads with mean read length of 790 bases and average quality of 18.2 were generated. The annotation and subsequent filtering resulted in the identification of 55,382 transcripts at 40,547 loci with mean length of 1,700 bases. We predicted 30,967 coding transcripts at 19,461 loci, and 16,495 lncRNA transcripts at 15,512 loci. Compared to existing reference annotations, we found ∼52% of annotated transcripts could be partially or fully matched while ∼47% were novel. Seventy percent of novel transcripts were potentially transcribed from lncRNA loci. Based on our annotation, we quantified transcript expression across tissues and found two brain tissues (i.e., cerebellum and cortex) expressed the highest number of transcripts and loci. Furthermore, ∼22% of the transcripts displayed tissue specificity with the reproductive tissues (i.e., testis and ovary) exhibiting the most tissue-specific transcripts. Despite our wide sampling, ∼20% of Ensembl reference loci were not detected. This suggests that deeper sequencing and additional samples that include different breeds, cell types, developmental stages, and physiological conditions, are needed to fully annotate the chicken genome. The application of Nanopore sequencing in this study demonstrates the usefulness of long-read data in discovering additional novel loci (e.g., lncRNA loci) and resolving complex transcripts (e.g., the longest transcript for the TTN locus).
format Online
Article
Text
id pubmed-9561881
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9561881 2022-10-15 Prediction of transcript isoforms in 19 chicken tissues by Oxford Nanopore long-read sequencing Guan, Dailu Halstead, Michelle M. Islas-Trejo, Alma D. Goszczynski, Daniel E. Cheng, Hans H. Ross, Pablo J. Zhou, Huaijun Front Genet Genetics Frontiers Media S.A.
2022-10-03 /pmc/articles/PMC9561881/ /pubmed/36246588 http://dx.doi.org/10.3389/fgene.2022.997460 Text en Copyright © 2022 Guan, Halstead, Islas-Trejo, Goszczynski, Cheng, Ross and Zhou. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Prediction of transcript isoforms in 19 chicken tissues by Oxford Nanopore long-read sequencing
topic Genetics