Analyses of 600+ insect genomes reveal repetitive element dynamics and highlight biodiversity-scale repeat annotation challenges

Repetitive elements (REs) are integral to the composition, structure, and function of eukaryotic genomes, yet remain understudied in most taxonomic groups. We investigated REs across 601 insect species and report wide variation in RE dynamics across groups. Analysis of associations between REs and p...

Bibliographic Details
Main Authors: Sproul, John S., Hotaling, Scott, Heckenhauer, Jacqueline, Powell, Ashlyn, Marshall, Dez, Larracuente, Amanda M., Kelley, Joanna L., Pauls, Steffen U., Frandsen, Paul B.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10691545/
https://www.ncbi.nlm.nih.gov/pubmed/37739812
http://dx.doi.org/10.1101/gr.277387.122
_version_ 1785152756288323584
author Sproul, John S.
Hotaling, Scott
Heckenhauer, Jacqueline
Powell, Ashlyn
Marshall, Dez
Larracuente, Amanda M.
Kelley, Joanna L.
Pauls, Steffen U.
Frandsen, Paul B.
author_sort Sproul, John S.
collection PubMed
description Repetitive elements (REs) are integral to the composition, structure, and function of eukaryotic genomes, yet remain understudied in most taxonomic groups. We investigated REs across 601 insect species and report wide variation in RE dynamics across groups. Analysis of associations between REs and protein-coding genes revealed dynamic evolution at the interface between REs and coding regions across insects, including notably elevated RE–gene associations in lineages with abundant long interspersed nuclear elements (LINEs). We leveraged this large, empirical data set to quantify impacts of long-read technology on RE detection and investigate fundamental challenges to RE annotation in diverse groups. In long-read assemblies, we detected ∼36% more REs than short-read assemblies, with long terminal repeats (LTRs) showing 162% increased detection, whereas DNA transposons and LINEs showed less respective technology-related bias. In most insect lineages, 25%–85% of repetitive sequences were “unclassified” following automated annotation, compared with only ∼13% in Drosophila species. Although the diversity of available insect genomes has rapidly expanded, we show the rate of community contributions to RE databases has not kept pace, preventing efficient annotation and high-resolution study of REs in most groups. We highlight the tremendous opportunity and need for the biodiversity genomics field to embrace REs and suggest collective steps for making progress toward this goal.
format Online
Article
Text
id pubmed-10691545
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-10691545 2023-12-02 Analyses of 600+ insect genomes reveal repetitive element dynamics and highlight biodiversity-scale repeat annotation challenges Sproul, John S. Hotaling, Scott Heckenhauer, Jacqueline Powell, Ashlyn Marshall, Dez Larracuente, Amanda M. Kelley, Joanna L. Pauls, Steffen U. Frandsen, Paul B. Genome Res Research Repetitive elements (REs) are integral to the composition, structure, and function of eukaryotic genomes, yet remain understudied in most taxonomic groups. We investigated REs across 601 insect species and report wide variation in RE dynamics across groups. Analysis of associations between REs and protein-coding genes revealed dynamic evolution at the interface between REs and coding regions across insects, including notably elevated RE–gene associations in lineages with abundant long interspersed nuclear elements (LINEs). We leveraged this large, empirical data set to quantify impacts of long-read technology on RE detection and investigate fundamental challenges to RE annotation in diverse groups. In long-read assemblies, we detected ∼36% more REs than short-read assemblies, with long terminal repeats (LTRs) showing 162% increased detection, whereas DNA transposons and LINEs showed less respective technology-related bias. In most insect lineages, 25%–85% of repetitive sequences were “unclassified” following automated annotation, compared with only ∼13% in Drosophila species. Although the diversity of available insect genomes has rapidly expanded, we show the rate of community contributions to RE databases has not kept pace, preventing efficient annotation and high-resolution study of REs in most groups. We highlight the tremendous opportunity and need for the biodiversity genomics field to embrace REs and suggest collective steps for making progress toward this goal.
Cold Spring Harbor Laboratory Press 2023-10 /pmc/articles/PMC10691545/ /pubmed/37739812 http://dx.doi.org/10.1101/gr.277387.122 Text en © 2023 Sproul et al.; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/ (https://creativecommons.org/licenses/by-nc/4.0/).
title Analyses of 600+ insect genomes reveal repetitive element dynamics and highlight biodiversity-scale repeat annotation challenges
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10691545/
https://www.ncbi.nlm.nih.gov/pubmed/37739812
http://dx.doi.org/10.1101/gr.277387.122