RelocaTE2: a high resolution transposable element insertion site mapping tool for population resequencing
BACKGROUND: Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated...
Main authors: | Chen, Jinfeng; Wrightsman, Travis R.; Wessler, Susan R.; Stajich, Jason E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2017 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5274521/ https://www.ncbi.nlm.nih.gov/pubmed/28149701 http://dx.doi.org/10.7717/peerj.2942 |
_version_ | 1782501932391202816 |
---|---|
author | Chen, Jinfeng Wrightsman, Travis R. Wessler, Susan R. Stajich, Jason E. |
author_facet | Chen, Jinfeng Wrightsman, Travis R. Wessler, Susan R. Stajich, Jason E. |
author_sort | Chen, Jinfeng |
collection | PubMed |
description | BACKGROUND: Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive and deep sequencing technology has made it affordable to apply population genetic methods to whole genomes with methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE polymorphisms, particularly transposition events or non-reference insertion sites, can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools. METHODS: We have developed the tool RelocaTE2 for identification of TE insertion sites at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision. RESULTS AND DISCUSSION: The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a high level of sensitivity and specificity, particularly when the sequence coverage is not shallow. In comparison to other tools tested, RelocaTE2 achieves the best balance between sensitivity and specificity. In particular, RelocaTE2 performs best in prediction of TSDs for TE insertions. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with a false positive rate of less than 0.1% using 10-fold genome coverage resequencing data. RelocaTE2 provides a robust solution to identify TE insertion sites and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing. |
format | Online Article Text |
id | pubmed-5274521 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-5274521 2017-02-01 RelocaTE2: a high resolution transposable element insertion site mapping tool for population resequencing Chen, Jinfeng Wrightsman, Travis R. Wessler, Susan R. Stajich, Jason E. PeerJ Bioinformatics BACKGROUND: Transposable element (TE) polymorphisms are important components of population genetic variation. The functional impacts of TEs in gene regulation and generating genetic diversity have been observed in multiple species, but the frequency and magnitude of TE variation are underappreciated. Inexpensive and deep sequencing technology has made it affordable to apply population genetic methods to whole genomes with methods that identify single nucleotide and insertion/deletion polymorphisms. However, identifying TE polymorphisms, particularly transposition events or non-reference insertion sites, can be challenging due to the repetitive nature of these sequences, which hampers both the sensitivity and specificity of analysis tools. METHODS: We have developed the tool RelocaTE2 for identification of TE insertion sites at high sensitivity and specificity. RelocaTE2 searches for known TE sequences in whole genome sequencing reads from second generation sequencing platforms such as Illumina. These sequence reads are used as seeds to pinpoint chromosome locations where TEs have transposed. RelocaTE2 detects target site duplication (TSD) of TE insertions, allowing it to report TE polymorphism loci with single base pair precision. RESULTS AND DISCUSSION: The performance of RelocaTE2 is evaluated using both simulated and real sequence data. RelocaTE2 demonstrates a high level of sensitivity and specificity, particularly when the sequence coverage is not shallow. In comparison to other tools tested, RelocaTE2 achieves the best balance between sensitivity and specificity. In particular, RelocaTE2 performs best in prediction of TSDs for TE insertions. Even in highly repetitive regions, such as those tested on rice chromosome 4, RelocaTE2 is able to report up to 95% of simulated TE insertions with a false positive rate of less than 0.1% using 10-fold genome coverage resequencing data. RelocaTE2 provides a robust solution to identify TE insertion sites and can be incorporated into analysis workflows in support of describing the complete genotype from light coverage genome sequencing. PeerJ Inc. 2017-01-26 /pmc/articles/PMC5274521/ /pubmed/28149701 http://dx.doi.org/10.7717/peerj.2942 Text en ©2017 Chen et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Chen, Jinfeng Wrightsman, Travis R. Wessler, Susan R. Stajich, Jason E. RelocaTE2: a high resolution transposable element insertion site mapping tool for population resequencing |
title | RelocaTE2: a high resolution transposable element insertion site mapping tool for population resequencing |
title_sort | relocate2: a high resolution transposable element insertion site mapping tool for population resequencing |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5274521/ https://www.ncbi.nlm.nih.gov/pubmed/28149701 http://dx.doi.org/10.7717/peerj.2942 |
work_keys_str_mv | AT chenjinfeng relocate2ahighresolutiontransposableelementinsertionsitemappingtoolforpopulationresequencing AT wrightsmantravisr relocate2ahighresolutiontransposableelementinsertionsitemappingtoolforpopulationresequencing AT wesslersusanr relocate2ahighresolutiontransposableelementinsertionsitemappingtoolforpopulationresequencing AT stajichjasone relocate2ahighresolutiontransposableelementinsertionsitemappingtoolforpopulationresequencing |
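
The METHODS summary in the description states that detecting the target site duplication (TSD) is what allows insertion loci to be reported with single base pair precision. The sketch below illustrates that general idea only; it is not RelocaTE2's code. It assumes that junction reads have already been aligned to the reference, that reads covering the upstream junction end at the last base of the TSD, and that reads covering the downstream junction start at its first base, so the overlap of the two clusters is the TSD. The names `Insertion`, `call_tsd`, and `max_tsd` are hypothetical and introduced here for illustration.

```python
from collections import Counter
from typing import Dict, List, NamedTuple, Optional


class Insertion(NamedTuple):
    chrom: str
    tsd_start: int  # 1-based, inclusive reference coordinate
    tsd_end: int    # 1-based, inclusive reference coordinate
    tsd_seq: str


def call_tsd(
    reference: Dict[str, str],
    chrom: str,
    left_flank_ends: List[int],
    right_flank_starts: List[int],
    max_tsd: int = 20,  # hypothetical upper bound on TSD length
) -> Optional[Insertion]:
    """Infer a TSD from the overlap of two junction-read clusters.

    left_flank_ends: reference positions where upstream-junction
        alignments end (assumed to be the last base of the TSD).
    right_flank_starts: reference positions where downstream-junction
        alignments start (assumed to be the first base of the TSD).
    """
    if not left_flank_ends or not right_flank_starts:
        return None  # both junctions are needed to delimit a TSD

    # Take the most frequently supported breakpoint on each side.
    end = Counter(left_flank_ends).most_common(1)[0][0]
    start = Counter(right_flank_starts).most_common(1)[0][0]

    length = end - start + 1
    if length < 1 or length > max_tsd:
        return None  # clusters do not overlap plausibly

    tsd_seq = reference[chrom][start - 1:end]  # convert to 0-based slice
    return Insertion(chrom, start, end, tsd_seq)


# Toy example: a 15 bp reference with a 3 bp TSD at positions 6-8.
toy_reference = {"chr1": "ACGTTAGGCATTACG"}
call = call_tsd(toy_reference, "chr1", [8, 8, 8, 7], [6, 6, 5])
print(call)
# Insertion(chrom='chr1', tsd_start=6, tsd_end=8, tsd_seq='AGG')
```

Using the most common breakpoint on each side is a simple way to tolerate a few imprecisely clipped reads; a production tool would also weigh read counts, mapping quality, and the TSD length expected for each TE family.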