
Best genome sequencing strategies for annotation of complex immune gene families in wildlife

BACKGROUND: The biodiversity crisis and increasing impact of wildlife disease on animal and human health provides impetus for studying immune genes in wildlife. Despite the recent boom in genomes for wildlife species, immune genes are poorly annotated in nonmodel species owing to their high level of...

Bibliographic Details
Main authors: Peel, Emma, Silver, Luke, Brandies, Parice, Zhu, Ying, Cheng, Yuanyuan, Hogg, Carolyn J, Belov, Katherine
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9618407/
https://www.ncbi.nlm.nih.gov/pubmed/36310247
http://dx.doi.org/10.1093/gigascience/giac100
_version_ 1784821041506287616
author Peel, Emma
Silver, Luke
Brandies, Parice
Zhu, Ying
Cheng, Yuanyuan
Hogg, Carolyn J
Belov, Katherine
author_facet Peel, Emma
Silver, Luke
Brandies, Parice
Zhu, Ying
Cheng, Yuanyuan
Hogg, Carolyn J
Belov, Katherine
author_sort Peel, Emma
collection PubMed
description BACKGROUND: The biodiversity crisis and increasing impact of wildlife disease on animal and human health provides impetus for studying immune genes in wildlife. Despite the recent boom in genomes for wildlife species, immune genes are poorly annotated in nonmodel species owing to their high level of polymorphism and complex genomic organisation. Our research over the past decade and a half on Tasmanian devils and koalas highlights the importance of genomics and accurate immune annotations to investigate disease in wildlife. Given this, we have increasingly been asked the minimum levels of genome quality required to effectively annotate immune genes in order to study immunogenetic diversity. Here we set out to answer this question by manually annotating immune genes in 5 marsupial genomes and 1 monotreme genome to determine the impact of sequencing data type, assembly quality, and automated annotation on accurate immune annotation. RESULTS: Genome quality is directly linked to our ability to annotate complex immune gene families, with long reads and scaffolding technologies required to reassemble immune gene clusters and elucidate evolution, organisation, and true gene content of the immune repertoire. Draft-quality genomes generated from short reads with HiC or 10× Chromium linked reads were unable to achieve this. Despite mammalian BUSCOv5 scores of up to 94.1% amongst the 6 genomes, automated annotation pipelines incorrectly annotated up to 59% of manually annotated immune genes regardless of assembly quality or method of automated annotation. CONCLUSIONS: Our results demonstrate that long reads and scaffolding technologies, alongside manual annotation, are required to accurately study the immune gene repertoire of wildlife species.
format Online
Article
Text
id pubmed-9618407
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-96184072022-11-01 Best genome sequencing strategies for annotation of complex immune gene families in wildlife Peel, Emma Silver, Luke Brandies, Parice Zhu, Ying Cheng, Yuanyuan Hogg, Carolyn J Belov, Katherine Gigascience Research BACKGROUND: The biodiversity crisis and increasing impact of wildlife disease on animal and human health provides impetus for studying immune genes in wildlife. Despite the recent boom in genomes for wildlife species, immune genes are poorly annotated in nonmodel species owing to their high level of polymorphism and complex genomic organisation. Our research over the past decade and a half on Tasmanian devils and koalas highlights the importance of genomics and accurate immune annotations to investigate disease in wildlife. Given this, we have increasingly been asked the minimum levels of genome quality required to effectively annotate immune genes in order to study immunogenetic diversity. Here we set out to answer this question by manually annotating immune genes in 5 marsupial genomes and 1 monotreme genome to determine the impact of sequencing data type, assembly quality, and automated annotation on accurate immune annotation. RESULTS: Genome quality is directly linked to our ability to annotate complex immune gene families, with long reads and scaffolding technologies required to reassemble immune gene clusters and elucidate evolution, organisation, and true gene content of the immune repertoire. Draft-quality genomes generated from short reads with HiC or 10× Chromium linked reads were unable to achieve this. Despite mammalian BUSCOv5 scores of up to 94.1% amongst the 6 genomes, automated annotation pipelines incorrectly annotated up to 59% of manually annotated immune genes regardless of assembly quality or method of automated annotation. 
CONCLUSIONS: Our results demonstrate that long reads and scaffolding technologies, alongside manual annotation, are required to accurately study the immune gene repertoire of wildlife species. Oxford University Press 2022-10-30 /pmc/articles/PMC9618407/ /pubmed/36310247 http://dx.doi.org/10.1093/gigascience/giac100 Text en © The Author(s) 2022. Published by Oxford University Press GigaScience. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Peel, Emma
Silver, Luke
Brandies, Parice
Zhu, Ying
Cheng, Yuanyuan
Hogg, Carolyn J
Belov, Katherine
Best genome sequencing strategies for annotation of complex immune gene families in wildlife
title Best genome sequencing strategies for annotation of complex immune gene families in wildlife
title_full Best genome sequencing strategies for annotation of complex immune gene families in wildlife
title_fullStr Best genome sequencing strategies for annotation of complex immune gene families in wildlife
title_full_unstemmed Best genome sequencing strategies for annotation of complex immune gene families in wildlife
title_short Best genome sequencing strategies for annotation of complex immune gene families in wildlife
title_sort best genome sequencing strategies for annotation of complex immune gene families in wildlife
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9618407/
https://www.ncbi.nlm.nih.gov/pubmed/36310247
http://dx.doi.org/10.1093/gigascience/giac100
work_keys_str_mv AT peelemma bestgenomesequencingstrategiesforannotationofcompleximmunegenefamiliesinwildlife
AT silverluke bestgenomesequencingstrategiesforannotationofcompleximmunegenefamiliesinwildlife
AT brandiesparice bestgenomesequencingstrategiesforannotationofcompleximmunegenefamiliesinwildlife
AT zhuying bestgenomesequencingstrategiesforannotationofcompleximmunegenefamiliesinwildlife
AT chengyuanyuan bestgenomesequencingstrategiesforannotationofcompleximmunegenefamiliesinwildlife
AT hoggcarolynj bestgenomesequencingstrategiesforannotationofcompleximmunegenefamiliesinwildlife
AT belovkatherine bestgenomesequencingstrategiesforannotationofcompleximmunegenefamiliesinwildlife