De novo genome assemblies of butterflies

BACKGROUND: The availability of thousands of genomes has enabled new advancements in biology. However, many genomes have not been investigated for their quality. Here we examine quality trends in a taxonomically diverse and well-known group, butterflies (Papilionoidea), and provide draft, de novo as...

Bibliographic Details
Main authors: Ellis, Emily A; Storer, Caroline G; Kawahara, Akito Y
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8170690/
https://www.ncbi.nlm.nih.gov/pubmed/34076242
http://dx.doi.org/10.1093/gigascience/giab041
author Ellis, Emily A
Storer, Caroline G
Kawahara, Akito Y
collection PubMed
description BACKGROUND: The availability of thousands of genomes has enabled new advancements in biology. However, many genomes have not been investigated for their quality. Here we examine quality trends in a taxonomically diverse and well-known group, butterflies (Papilionoidea), and provide draft, de novo assemblies for all available butterfly genomes. Owing to massive genome sequencing investment and taxonomic curation, this is an excellent group to explore genome quality. FINDINGS: We provide de novo assemblies for all 822 available butterfly genomes and interpret their quality in terms of completeness and continuity. We identify the 50 highest quality genomes across butterflies and conclude that the ringlet, Aphantopus hyperantus, has the highest quality genome. Our post-processing of draft genome assemblies identified 118 butterfly genomes that should not be reused owing to contamination or extremely low quality. However, many draft genomes are of high utility, especially because permissibility of low-quality genomes is dependent on the objective of the study. Our assemblies will serve as a key resource for papilionid genomics, especially for researchers without computational resources. CONCLUSIONS: Quality metrics and assemblies are typically presented with annotated genome accessions but rarely with de novo genomes. We recommend that studies presenting genome sequences provide the assembly and some metrics of quality because quality will significantly affect downstream results. Transparency in quality metrics is needed to improve the field of genome science and encourage data reuse.
format Online
Article
Text
id pubmed-8170690
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8170690 2021-06-02 De novo genome assemblies of butterflies Ellis, Emily A; Storer, Caroline G; Kawahara, Akito Y Gigascience Data Note Oxford University Press 2021-06-02 /pmc/articles/PMC8170690/ /pubmed/34076242 http://dx.doi.org/10.1093/gigascience/giab041 Text en © The Author(s) 2021. Published by Oxford University Press GigaScience.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title De novo genome assemblies of butterflies
topic Data Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8170690/
https://www.ncbi.nlm.nih.gov/pubmed/34076242
http://dx.doi.org/10.1093/gigascience/giab041