Comprehensive identification of transposable element insertions using multiple sequencing technologies

Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data....

Bibliographic Details
Main Authors: Chu, Chong, Borges-Monroy, Rebeca, Viswanadham, Vinayak V., Lee, Soohyun, Li, Heng, Lee, Eunjung Alice, Park, Peter J.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8219666/
https://www.ncbi.nlm.nih.gov/pubmed/34158502
http://dx.doi.org/10.1038/s41467-021-24041-8
_version_ 1783710984084389888
author Chu, Chong
Borges-Monroy, Rebeca
Viswanadham, Vinayak V.
Lee, Soohyun
Li, Heng
Lee, Eunjung Alice
Park, Peter J.
author_sort Chu, Chong
collection PubMed
description Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. Whereas existing methods are mostly designed for short-read data, xTea can be applied to both short-read and long-read data. Our analysis shows that xTea outperforms other short read-based methods for both germline and somatic TE insertion discovery. With long-read data, we created a catalogue of polymorphic insertions with full assembly and annotation of insertional sequences for various types of retroelements, including pseudogenes and endogenous retroviruses. Notably, we find that individual genomes have an average of nine groups of full-length L1s in centromeres, suggesting that centromeres and other highly repetitive regions such as telomeres are a significant yet unexplored source of active L1s. xTea is available at https://github.com/parklab/xTea.
format Online
Article
Text
id pubmed-8219666
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8219666 2021-07-09 Nat Commun Article Nature Publishing Group UK 2021-06-22 /pmc/articles/PMC8219666/ /pubmed/34158502 http://dx.doi.org/10.1038/s41467-021-24041-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Comprehensive identification of transposable element insertions using multiple sequencing technologies
topic Article