
TIP_finder: An HPC Software to Detect Transposable Element Insertion Polymorphisms in Large Genomic Datasets

Transposable elements (TEs) are genomic units capable of moving from one chromosomal location to another. Their insertion polymorphisms may cause beneficial mutations, such as the creation of new gene functions, or deleterious ones in eukaryotes, e.g., different types of cancer in...


Bibliographic Details
Main authors: Orozco-Arias, Simon, Tobon-Orozco, Nicolas, Piña, Johan S., Jiménez-Varón, Cristian Felipe, Tabares-Soto, Reinel, Guyot, Romain
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7563458/
https://www.ncbi.nlm.nih.gov/pubmed/32917036
http://dx.doi.org/10.3390/biology9090281
author Orozco-Arias, Simon
Tobon-Orozco, Nicolas
Piña, Johan S.
Jiménez-Varón, Cristian Felipe
Tabares-Soto, Reinel
Guyot, Romain
collection PubMed
description Transposable elements (TEs) are genomic units capable of moving from one chromosomal location to another. Their insertion polymorphisms may cause beneficial mutations, such as the creation of new gene functions, or deleterious ones in eukaryotes, e.g., different types of cancer in humans. A particular type of TE, the LTR retrotransposons, comprises almost 8% of the human genome. Among LTR retrotransposons, human endogenous retroviruses (HERVs) bear structural and functional similarities to retroviruses. Several tools allow the detection of transposon insertion polymorphisms (TIPs) but fail to efficiently analyze large genomes or large datasets. Here, we developed a computational tool, named TIP_finder, able to detect mobile element insertions in very large genomes through high-performance computing (HPC) and parallel programming, using discordant read pair analysis. TIP_finder inputs are (i) short paired-end reads such as those obtained by Illumina sequencing, (ii) a chromosome-level reference genome sequence, and (iii) a database of consensus TE sequences. The HPC strategy we propose adds scalability and provides a useful tool to analyze huge genomic datasets in a reasonable running time. TIP_finder accelerates the detection of TIPs by up to 55 times in breast cancer datasets and 46 times in cancer-free datasets compared to the fastest available algorithms. TIP_finder applies a validated strategy to find TIPs, accelerates the process through HPC, and addresses the issue of runtime for large-scale analyses in the post-genomic era.
format Online
Article
Text
id pubmed-7563458
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7563458 2020-10-27 TIP_finder: An HPC Software to Detect Transposable Element Insertion Polymorphisms in Large Genomic Datasets Orozco-Arias, Simon Tobon-Orozco, Nicolas Piña, Johan S. Jiménez-Varón, Cristian Felipe Tabares-Soto, Reinel Guyot, Romain Biology (Basel) Article
MDPI 2020-09-09 /pmc/articles/PMC7563458/ /pubmed/32917036 http://dx.doi.org/10.3390/biology9090281 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title TIP_finder: An HPC Software to Detect Transposable Element Insertion Polymorphisms in Large Genomic Datasets
topic Article