Transposable element subfamily annotation has a reproducibility problem
BACKGROUND: Transposable element (TE) sequences are classified into families based on the reconstructed history of replication, and into subfamilies based on more fine-grained features that are often intended to capture family history. We evaluate the reliability of annotation with common subfamilie...
Main authors: | Carey, Kaitlin M.; Patterson, Gilia; Wheeler, Travis J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7827986/ https://www.ncbi.nlm.nih.gov/pubmed/33485368 http://dx.doi.org/10.1186/s13100-021-00232-4 |
_version_ | 1783640900518281216 |
---|---|
author | Carey, Kaitlin M. Patterson, Gilia Wheeler, Travis J. |
author_facet | Carey, Kaitlin M. Patterson, Gilia Wheeler, Travis J. |
author_sort | Carey, Kaitlin M. |
collection | PubMed |
description | BACKGROUND: Transposable element (TE) sequences are classified into families based on the reconstructed history of replication, and into subfamilies based on more fine-grained features that are often intended to capture family history. We evaluate the reliability of annotation with common subfamilies by assessing the extent to which subfamily annotation is reproducible in replicate copies created by segmental duplications in the human genome, and in homologous copies shared by human and chimpanzee. RESULTS: We find that standard methods annotate over 10% of replicates as belonging to different subfamilies, despite the fact that they are expected to be annotated as belonging to the same subfamily. Point mutations and homologous recombination appear to be responsible for some of this discordant annotation (particularly in the young Alu family), but are unlikely to fully explain the annotation unreliability. CONCLUSIONS: The surprisingly high level of disagreement in subfamily annotation of homologous sequences highlights a need for further research into definition of TE subfamilies, methods for representing subfamily annotation confidence of TE instances, and approaches to better utilizing such nuanced annotation data in downstream analysis. |
format | Online Article Text |
id | pubmed-7827986 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-78279862021-01-26 Transposable element subfamily annotation has a reproducibility problem Carey, Kaitlin M. Patterson, Gilia Wheeler, Travis J. Mob DNA Research BACKGROUND: Transposable element (TE) sequences are classified into families based on the reconstructed history of replication, and into subfamilies based on more fine-grained features that are often intended to capture family history. We evaluate the reliability of annotation with common subfamilies by assessing the extent to which subfamily annotation is reproducible in replicate copies created by segmental duplications in the human genome, and in homologous copies shared by human and chimpanzee. RESULTS: We find that standard methods annotate over 10% of replicates as belonging to different subfamilies, despite the fact that they are expected to be annotated as belonging to the same subfamily. Point mutations and homologous recombination appear to be responsible for some of this discordant annotation (particularly in the young Alu family), but are unlikely to fully explain the annotation unreliability. CONCLUSIONS: The surprisingly high level of disagreement in subfamily annotation of homologous sequences highlights a need for further research into definition of TE subfamilies, methods for representing subfamily annotation confidence of TE instances, and approaches to better utilizing such nuanced annotation data in downstream analysis. BioMed Central 2021-01-23 /pmc/articles/PMC7827986/ /pubmed/33485368 http://dx.doi.org/10.1186/s13100-021-00232-4 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Carey, Kaitlin M. Patterson, Gilia Wheeler, Travis J. Transposable element subfamily annotation has a reproducibility problem |
title | Transposable element subfamily annotation has a reproducibility problem |
title_full | Transposable element subfamily annotation has a reproducibility problem |
title_fullStr | Transposable element subfamily annotation has a reproducibility problem |
title_full_unstemmed | Transposable element subfamily annotation has a reproducibility problem |
title_short | Transposable element subfamily annotation has a reproducibility problem |
title_sort | transposable element subfamily annotation has a reproducibility problem |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7827986/ https://www.ncbi.nlm.nih.gov/pubmed/33485368 http://dx.doi.org/10.1186/s13100-021-00232-4 |
work_keys_str_mv | AT careykaitlinm transposableelementsubfamilyannotationhasareproducibilityproblem AT pattersongilia transposableelementsubfamilyannotationhasareproducibilityproblem AT wheelertravisj transposableelementsubfamilyannotationhasareproducibilityproblem |