Long-read RNA sequencing of human and animal filarial parasites improves gene models and discovers operons

Filarial parasitic nematodes (Filarioidea) cause substantial disease burden to humans and animals around the world. Recently there has been a coordinated global effort to generate, annotate, and curate genomic data from nematode species of medical and veterinary importance. This has resulted in two...

Bibliographic Details
Main Authors: Wheeler, Nicolas J, Airs, Paul M., Zamanian, Mostafa
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7704054/
https://www.ncbi.nlm.nih.gov/pubmed/33196647
http://dx.doi.org/10.1371/journal.pntd.0008869
_version_ 1783616746365648896
author Wheeler, Nicolas J
Airs, Paul M.
Zamanian, Mostafa
collection PubMed
description Filarial parasitic nematodes (Filarioidea) cause substantial disease burden to humans and animals around the world. Recently there has been a coordinated global effort to generate, annotate, and curate genomic data from nematode species of medical and veterinary importance. This has resulted in two chromosome-level assemblies (Brugia malayi and Onchocerca volvulus) and 11 additional draft genomes from Filarioidea. These reference assemblies facilitate comparative genomics to explore basic helminth biology and prioritize new drug and vaccine targets. While the continual improvement of genome contiguity and completeness advances these goals, experimental functional annotation of genes is often hindered by poor gene models. Short-read RNA sequencing data and expressed sequence tags, in cooperation with ab initio prediction algorithms, are employed for gene prediction, but these can result in missing clade-specific genes, fragmented models, imperfect mapping of gene ends, and lack of isoform resolution. Long-read RNA sequencing can overcome these drawbacks and greatly improve gene model quality. Here, we present Iso-Seq data for B. malayi and Dirofilaria immitis, etiological agents of lymphatic filariasis and canine heartworm disease, respectively. These data cover approximately half of the known coding genomes and substantially improve gene models by extending untranslated regions, cataloging novel splice junctions from novel isoforms, and correcting mispredicted junctions. Furthermore, we validated computationally predicted operons, manually curated new operons, and merged fragmented gene models. We carried out analyses of poly(A) tails in both species, leading to the identification of non-canonical poly(A) signals. Finally, we prioritized and assessed known and putative anthelmintic targets, correcting or validating gene models for molecular cloning and target-based anthelmintic screening efforts. 
Overall, these data significantly improve the catalog of gene models for two important parasites, and they demonstrate how long-read RNA sequencing should be prioritized for ongoing improvement of parasitic nematode genome assemblies.
format Online
Article
Text
id pubmed-7704054
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7704054 2020-12-08 PLoS Negl Trop Dis Research Article Public Library of Science 2020-11-16 /pmc/articles/PMC7704054/ /pubmed/33196647 http://dx.doi.org/10.1371/journal.pntd.0008869 Text en © 2020 Wheeler et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Long-read RNA sequencing of human and animal filarial parasites improves gene models and discovers operons
topic Research Article