A benchmark and an algorithm for detecting germline transposon insertions and measuring de novo transposon insertion frequencies


Bibliographic Details
Main authors: Yu, Tianxiong; Huang, Xiao; Dou, Shengqian; Tang, Xiaolu; Luo, Shiqi; Theurkauf, William E; Lu, Jian; Weng, Zhiping
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8096211/
https://www.ncbi.nlm.nih.gov/pubmed/33511407
http://dx.doi.org/10.1093/nar/gkab010
author Yu, Tianxiong
Huang, Xiao
Dou, Shengqian
Tang, Xiaolu
Luo, Shiqi
Theurkauf, William E
Lu, Jian
Weng, Zhiping
collection PubMed
description Transposons are genomic parasites, and their new insertions can cause instability and spur the evolution of their host genomes. Rapid accumulation of short-read whole-genome sequencing data provides a great opportunity for studying new transposon insertions and their impacts on the host genome. Although many algorithms are available for detecting transposon insertions, the task remains challenging and existing tools are not designed for identifying de novo insertions. Here, we present a new benchmark fly dataset based on PacBio long-read sequencing and a new method TEMP2 for detecting germline insertions and measuring de novo ‘singleton’ insertion frequencies in eukaryotic genomes. TEMP2 achieves high sensitivity and precision for detecting germline insertions when compared with existing tools using both simulated data in fly and experimental data in fly and human. Furthermore, TEMP2 can accurately assess the frequencies of de novo transposon insertions even with high levels of chimeric reads in simulated datasets; such chimeric reads often occur during the construction of short-read sequencing libraries. By applying TEMP2 to published data on hybrid dysgenic flies inflicted by de-repressed P-elements, we confirmed the continuous new insertions of P-elements in dysgenic offspring before they regain piRNAs for P-element repression. TEMP2 is freely available at Github: https://github.com/weng-lab/TEMP2.
format Online
Article
Text
id pubmed-8096211
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8096211 2021-05-10 Nucleic Acids Res Methods Online Oxford University Press 2021-01-28 /pmc/articles/PMC8096211/ /pubmed/33511407 http://dx.doi.org/10.1093/nar/gkab010 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title A benchmark and an algorithm for detecting germline transposon insertions and measuring de novo transposon insertion frequencies
topic Methods Online