Accurate Transposable Element Annotation Is Vital When Analyzing New Genome Assemblies

Bibliographic Details
Main authors: Platt, Roy N., Blanco-Berdugo, Laura, Ray, David A.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4779615/
https://www.ncbi.nlm.nih.gov/pubmed/26802115
http://dx.doi.org/10.1093/gbe/evw009
collection PubMed
description Transposable elements (TEs) are mobile genetic elements with the ability to replicate themselves throughout the host genome. In some taxa, TEs reach copy numbers in the hundreds of thousands and can occupy more than half of the genome. The increasing number of reference genomes from nonmodel species has begun to outpace efforts to identify and annotate TE content, and the methods used vary significantly between projects. Here, we demonstrate the variation that arises in TE annotations when less than optimal methods are used. We found that, across a variety of taxa, the ability to accurately identify TEs based solely on homology decreased as the phylogenetic distance between the queried genome and a reference increased. Next, we annotated repeats using homology alone, as is often the case in new genome analyses, and using a combination of homology and de novo methods with an additional manual curation step. Reannotation using these methods identified a substantial number of new TE subfamilies in previously characterized genomes, recognized a higher proportion of the genome as repetitive, and decreased the average genetic distance within TE families, implying recent TE accumulation. Finally, these findings (increased recognition of younger TEs) were confirmed via an analysis of the postman butterfly (Heliconius melpomene). These observations imply that complete TE annotation relies on a combination of homology and de novo–based repeat identification, manual curation, and classification, and that relying on simple, homology-based methods is insufficient to accurately describe the TE landscape of a newly sequenced genome.
id pubmed-4779615
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-4779615 2016-03-07 Genome Biol Evol Letter Oxford University Press 2016-01-21 Text en © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
topic Letter