Transposable element finder (TEF): finding active transposable elements from next generation sequencing data

Bibliographic Details
Main Authors: Miyao, Akio; Yamanouchi, Utako
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9682801/
https://www.ncbi.nlm.nih.gov/pubmed/36418944
http://dx.doi.org/10.1186/s12859-022-05011-3
author Miyao, Akio
Yamanouchi, Utako
collection PubMed
description BACKGROUND: Detecting new transposition events of transposable elements (TEs) from next generation sequencing (NGS) data is difficult because older copies of the same TEs are distributed at many sites across the genome. The previously reported Transposon Insertion Finder (TIF) detects TE transpositions on the reference genome from NGS short reads using the end sequences of the target TE. TIF requires the sequence of the target TE and cannot detect transpositions of TEs with an unknown sequence. RESULT: The new algorithm, Transposable Element Finder (TEF), detects TE transpositions even for TEs with an unknown sequence. TEF is a tool for finding transposed TEs, whereas TIF is a tool for detecting the transposed sites of TEs with a known sequence. A transposition event is often accompanied by a target site duplication (TSD). Focusing on TSDs, two algorithms that detect both ends of a TE, its TSDs, and its target sites are reported here. One is based on grouping by TSDs and direct comparison of k-mers from NGS data without a similarity search. The other is based on junction mapping of candidate TE end sequences. Both methods succeeded in detecting both ends and TSDs of known active TEs in several tests with rice, Arabidopsis and Drosophila data, and discovered several new TEs in new locations. PCR confirmed the detected TE transpositions in several test cases in rice. CONCLUSIONS: By directly comparing NGS data between two samples, TEF detects transposed TEs together with the TSDs that result from transposition, the sequences of both TE ends, and their insertion positions. Genotypes of transpositions are verified by counting head and tail junctions and non-insertion sequences in NGS reads. TEF is easy to run and independent of any TE library, which makes it useful for detecting insertions of unknown TEs that are missed by common TE annotation pipelines.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05011-3.
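
The abstract describes two ideas that are easy to picture in code: comparing k-mers directly between two samples, and genotyping an insertion by counting junction reads. The Python sketch below illustrates both under stated assumptions. It is not the TEF implementation: the function names, the `min_count` and `min_reads` thresholds, and the genotype labels are hypothetical and chosen only for illustration, and the sketch omits TSD grouping and junction mapping entirely.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (length-k substring) across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def sample_specific_kmers(reads_a, reads_b, k, min_count=2):
    """Return k-mers seen at least min_count times in sample A and never in sample B.

    min_count is an illustrative threshold (an assumption, not a TEF parameter)
    meant to discard k-mers produced by isolated sequencing errors.
    """
    counts_a = count_kmers(reads_a, k)
    counts_b = count_kmers(reads_b, k)
    return {
        kmer
        for kmer, n in counts_a.items()
        if n >= min_count and kmer not in counts_b
    }


def call_genotype(head, tail, empty, min_reads=2):
    """Classify a candidate insertion from three read counts.

    head, tail: reads spanning the junction at each end of the inserted TE.
    empty: reads spanning the target site without any insertion.
    The min_reads threshold and the returned labels are assumptions.
    """
    has_insertion = head >= min_reads and tail >= min_reads
    has_empty = empty >= min_reads
    if has_insertion and has_empty:
        return "heterozygous"
    if has_insertion:
        return "homozygous insertion"
    if has_empty:
        return "no insertion"
    return "undetermined"
```

For example, calling `sample_specific_kmers(["GATTACATTTT"] * 2, ["GATTACAGGCT"] * 2, k=8)` keeps only the 8-mers that cross the point where the two samples diverge, which is the kind of sample-specific junction signal the abstract refers to.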
format Online
Article
Text
id pubmed-9682801
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9682801 2022-11-24 Transposable element finder (TEF): finding active transposable elements from next generation sequencing data
Miyao, Akio
Yamanouchi, Utako
BMC Bioinformatics
Research
BioMed Central 2022-11-22
/pmc/articles/PMC9682801/ /pubmed/36418944 http://dx.doi.org/10.1186/s12859-022-05011-3
Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Transposable element finder (TEF): finding active transposable elements from next generation sequencing data
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9682801/
https://www.ncbi.nlm.nih.gov/pubmed/36418944
http://dx.doi.org/10.1186/s12859-022-05011-3