
Fake IDs? Widespread misannotation of DNA transposons as a general transcription factor


Bibliographic Details
Main authors: Hassan, Nozhat T., Adelson, David L.
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10641963/
https://www.ncbi.nlm.nih.gov/pubmed/37957683
http://dx.doi.org/10.1186/s13059-023-03102-9
Description
Summary: Accurate annotation of genes and transposable elements (TEs) is vital for understanding genomes, but current annotation pipelines often misannotate TEs as genes. This study reveals how the general transcription factor II-I repeat domain-containing protein 2 (GTF2IRD2) erroneously annotated DNA transposons in non-mammalian species, as it contains a 3′ fused hAT transposase domain. We also demonstrate the generality of this problem by identifying misannotated TEs as genes in other vertebrate genomes. Such misannotations can lead to errors in phylogenetic analyses and wasted time for investigators. The study proposes adding a final TE-check to gene annotation pipelines to mitigate this problem. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03102-9.