
T-lex3: an accurate tool to genotype and estimate population frequencies of transposable elements using the latest short-read whole genome sequencing data

MOTIVATION: Transposable elements (TEs) constitute a significant proportion of the majority of genomes sequenced to date. TEs are responsible for a considerable fraction of the genetic variation within and among species. Accurate genotyping of TEs in genomes is therefore crucial for a complete ident...

Bibliographic Details
Main authors: Bogaerts-Márquez, María; Barrón, Maite G; Fiston-Lavier, Anna-Sophie; Vendrell-Mir, Pol; Castanera, Raúl; Casacuberta, Josep M; González, Josefa
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703783/
https://www.ncbi.nlm.nih.gov/pubmed/31580402
http://dx.doi.org/10.1093/bioinformatics/btz727
author Bogaerts-Márquez, María
Barrón, Maite G
Fiston-Lavier, Anna-Sophie
Vendrell-Mir, Pol
Castanera, Raúl
Casacuberta, Josep M
González, Josefa
collection PubMed
description MOTIVATION: Transposable elements (TEs) constitute a significant proportion of the majority of genomes sequenced to date. TEs are responsible for a considerable fraction of the genetic variation within and among species. Accurate genotyping of TEs in genomes is therefore crucial for a complete identification of the genetic differences among individuals, populations and species. RESULTS: In this work, we present a new version of T-lex, a computational pipeline that accurately genotypes and estimates the population frequencies of reference TE insertions using short-read high-throughput sequencing data. In this new version, we have re-designed the T-lex algorithm to integrate the BWA-MEM short-read aligner, which is one of the most accurate short-read mappers and can be launched on longer short-reads (e.g. reads >150 bp). We have added new filtering steps to increase the accuracy of the genotyping, and new parameters that allow the user to control both the minimum and maximum number of reads, and the minimum number of strains to genotype a TE insertion. We also showed for the first time that T-lex3 provides accurate TE calls in a plant genome. AVAILABILITY AND IMPLEMENTATION: To test the accuracy of T-lex3, we called 1630 individual TE insertions in Drosophila melanogaster, 1600 individual TE insertions in humans, and 3067 individual TE insertions in the rice genome. We showed that this new version of T-lex is a broadly applicable and accurate tool for genotyping and estimating TE frequencies in organisms with different genome sizes and different TE contents. T-lex3 is available at Github: https://github.com/GonzalezLab/T-lex3. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7703783
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-77037832020-12-07 T-lex3: an accurate tool to genotype and estimate population frequencies of transposable elements using the latest short-read whole genome sequencing data Bogaerts-Márquez, María Barrón, Maite G Fiston-Lavier, Anna-Sophie Vendrell-Mir, Pol Castanera, Raúl Casacuberta, Josep M González, Josefa Bioinformatics Original Papers MOTIVATION: Transposable elements (TEs) constitute a significant proportion of the majority of genomes sequenced to date. TEs are responsible for a considerable fraction of the genetic variation within and among species. Accurate genotyping of TEs in genomes is therefore crucial for a complete identification of the genetic differences among individuals, populations and species. RESULTS: In this work, we present a new version of T-lex, a computational pipeline that accurately genotypes and estimates the population frequencies of reference TE insertions using short-read high-throughput sequencing data. In this new version, we have re-designed the T-lex algorithm to integrate the BWA-MEM short-read aligner, which is one of the most accurate short-read mappers and can be launched on longer short-reads (e.g. reads >150 bp). We have added new filtering steps to increase the accuracy of the genotyping, and new parameters that allow the user to control both the minimum and maximum number of reads, and the minimum number of strains to genotype a TE insertion. We also showed for the first time that T-lex3 provides accurate TE calls in a plant genome. AVAILABILITY AND IMPLEMENTATION: To test the accuracy of T-lex3, we called 1630 individual TE insertions in Drosophila melanogaster, 1600 individual TE insertions in humans, and 3067 individual TE insertions in the rice genome. We showed that this new version of T-lex is a broadly applicable and accurate tool for genotyping and estimating TE frequencies in organisms with different genome sizes and different TE contents. 
T-lex3 is available at Github: https://github.com/GonzalezLab/T-lex3. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2020-02-15 2019-10-03 /pmc/articles/PMC7703783/ /pubmed/31580402 http://dx.doi.org/10.1093/bioinformatics/btz727 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title T-lex3: an accurate tool to genotype and estimate population frequencies of transposable elements using the latest short-read whole genome sequencing data
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7703783/
https://www.ncbi.nlm.nih.gov/pubmed/31580402
http://dx.doi.org/10.1093/bioinformatics/btz727