Contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species
BACKGROUND: Reference genomes provide a foundational framework for evolutionary investigations, ecological analysis, and conservation science, yet uncertainties in the assembly of reference genomes are difficult to assess, and by extension rarely quantified. Reference genomes for monodontid cetacean...
Main authors: | Bringloe, Trevor T., Parent, Geneviève J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10659057/ https://www.ncbi.nlm.nih.gov/pubmed/37985969 http://dx.doi.org/10.1186/s12864-023-09779-3 |
_version_ | 1785148276675182592 |
---|---|
author | Bringloe, Trevor T. Parent, Geneviève J. |
author_facet | Bringloe, Trevor T. Parent, Geneviève J. |
author_sort | Bringloe, Trevor T. |
collection | PubMed |
description | BACKGROUND: Reference genomes provide a foundational framework for evolutionary investigations, ecological analysis, and conservation science, yet uncertainties in the assembly of reference genomes are difficult to assess, and by extension rarely quantified. Reference genomes for monodontid cetaceans span a wide spectrum of data types and analytical approaches, providing the context to derive broader insights related to discrepancies and regions of uncertainty in reference genome assembly. We generated three beluga (Delphinapterus leucas) and one narwhal (Monodon monoceros) reference genomes and contrasted these with published chromosomal scale assemblies for each species to quantify discrepancies associated with genome assemblies. RESULTS: The new reference genomes achieved chromosomal scale assembly using a combination of PacBio long reads, Illumina short reads, and Hi-C scaffolding data. For beluga, we identified discrepancies in the order and orientation of contigs in 2.2–3.7% of the total genome depending on the pairwise comparison of references. In addition, unsupported higher order scaffolding was identified in published reference genomes. In contrast, we estimated 8.2% of the compared narwhal genomes featured discrepancies, with inversions being notably abundant (5.3%). Discrepancies were linked to repetitive elements in both species. CONCLUSIONS: We provide several new reference genomes for beluga (Delphinapterus leucas), while highlighting potential avenues for improvements. In particular, additional layers of data providing information on ultra-long genomic distances are needed to resolve persistent errors in reference genome construction. The comparative analyses of monodontid reference genomes suggested that the three new reference genomes for beluga are more accurate compared to the currently published reference genome, but that the new narwhal genome is less accurate than one published. We also present a conceptual summary for improving the accuracy of reference genomes with relevance to end-user needs and how they relate to levels of assembly quality and uncertainty. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-023-09779-3. |
format | Online Article Text |
id | pubmed-10659057 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10659057 2023-11-20 Contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species Bringloe, Trevor T. Parent, Geneviève J. BMC Genomics Research BACKGROUND: Reference genomes provide a foundational framework for evolutionary investigations, ecological analysis, and conservation science, yet uncertainties in the assembly of reference genomes are difficult to assess, and by extension rarely quantified. Reference genomes for monodontid cetaceans span a wide spectrum of data types and analytical approaches, providing the context to derive broader insights related to discrepancies and regions of uncertainty in reference genome assembly. We generated three beluga (Delphinapterus leucas) and one narwhal (Monodon monoceros) reference genomes and contrasted these with published chromosomal scale assemblies for each species to quantify discrepancies associated with genome assemblies. RESULTS: The new reference genomes achieved chromosomal scale assembly using a combination of PacBio long reads, Illumina short reads, and Hi-C scaffolding data. For beluga, we identified discrepancies in the order and orientation of contigs in 2.2–3.7% of the total genome depending on the pairwise comparison of references. In addition, unsupported higher order scaffolding was identified in published reference genomes. In contrast, we estimated 8.2% of the compared narwhal genomes featured discrepancies, with inversions being notably abundant (5.3%). Discrepancies were linked to repetitive elements in both species. CONCLUSIONS: We provide several new reference genomes for beluga (Delphinapterus leucas), while highlighting potential avenues for improvements. In particular, additional layers of data providing information on ultra-long genomic distances are needed to resolve persistent errors in reference genome construction. The comparative analyses of monodontid reference genomes suggested that the three new reference genomes for beluga are more accurate compared to the currently published reference genome, but that the new narwhal genome is less accurate than one published. We also present a conceptual summary for improving the accuracy of reference genomes with relevance to end-user needs and how they relate to levels of assembly quality and uncertainty. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-023-09779-3. BioMed Central 2023-11-20 /pmc/articles/PMC10659057/ /pubmed/37985969 http://dx.doi.org/10.1186/s12864-023-09779-3 Text en © Crown 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Bringloe, Trevor T. Parent, Geneviève J. Contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species |
title | Contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species |
title_full | Contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species |
title_fullStr | Contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species |
title_full_unstemmed | Contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species |
title_short | Contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species |
title_sort | contrasting new and available reference genomes to highlight uncertainties in assemblies and areas for future improvement: an example with monodontid species |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10659057/ https://www.ncbi.nlm.nih.gov/pubmed/37985969 http://dx.doi.org/10.1186/s12864-023-09779-3 |
work_keys_str_mv | AT bringloetrevort contrastingnewandavailablereferencegenomestohighlightuncertaintiesinassembliesandareasforfutureimprovementanexamplewithmonodontidspecies AT parentgenevievej contrastingnewandavailablereferencegenomestohighlightuncertaintiesinassembliesandareasforfutureimprovementanexamplewithmonodontidspecies |