Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes
LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Their characterization is of great importance to understand their dynamics. However, the identification and classification of these elements remains a challenge toda...
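Main authors: | Orozco-Arias, Simon; Humberto Lopez-Murillo, Luis; Candamil-Cortés, Mariana S; Arias, Maradey; Jaimes, Paula A; Rossi Paschoal, Alexandre; Tabares-Soto, Reinel; Isaza, Gustavo; Guyot, Romain |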
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Problem Solving Protocol |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9851300/ https://www.ncbi.nlm.nih.gov/pubmed/36502372 http://dx.doi.org/10.1093/bib/bbac511 |
_version_ | 1784872367212724224 |
---|---|
author | Orozco-Arias, Simon; Humberto Lopez-Murillo, Luis; Candamil-Cortés, Mariana S; Arias, Maradey; Jaimes, Paula A; Rossi Paschoal, Alexandre; Tabares-Soto, Reinel; Isaza, Gustavo; Guyot, Romain |
author_facet | Orozco-Arias, Simon; Humberto Lopez-Murillo, Luis; Candamil-Cortés, Mariana S; Arias, Maradey; Jaimes, Paula A; Rossi Paschoal, Alexandre; Tabares-Soto, Reinel; Isaza, Gustavo; Guyot, Romain |
author_sort | Orozco-Arias, Simon |
collection | PubMed |
description | LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Their characterization is of great importance to understand their dynamics. However, the identification and classification of these elements remain a challenge today. Moreover, current software can be relatively slow (from hours to days), sometimes involves a lot of manual work and does not reach satisfactory levels in terms of precision and sensitivity. Here we present Inpactor2, an accurate and fast application that creates LTR-retrotransposon reference libraries in a very short time. Inpactor2 takes an assembled genome as input and follows a hybrid approach (deep learning and structure-based) to detect elements, filter partial sequences and finally classify intact sequences into superfamilies and, as very few tools do, into lineages. This tool takes advantage of multi-core and GPU architectures to decrease execution times. Using the rice genome, Inpactor2 showed a run time of 5 minutes (faster than other tools) and has the best accuracy and F1-Score of the tools tested here, also having the second best accuracy and specificity only surpassed by EDTA, but achieving 28% higher sensitivity. For large genomes, Inpactor2 is up to seven times faster than other available bioinformatics tools. |
format | Online Article Text |
id | pubmed-9851300 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-98513002023-01-20 Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes Orozco-Arias, Simon Humberto Lopez-Murillo, Luis Candamil-Cortés, Mariana S Arias, Maradey Jaimes, Paula A Rossi Paschoal, Alexandre Tabares-Soto, Reinel Isaza, Gustavo Guyot, Romain Brief Bioinform Problem Solving Protocol LTR-retrotransposons are the most abundant repeat sequences in plant genomes and play an important role in evolution and biodiversity. Their characterization is of great importance to understand their dynamics. However, the identification and classification of these elements remains a challenge today. Moreover, current software can be relatively slow (from hours to days), sometimes involve a lot of manual work and do not reach satisfactory levels in terms of precision and sensitivity. Here we present Inpactor2, an accurate and fast application that creates LTR-retrotransposon reference libraries in a very short time. Inpactor2 takes an assembled genome as input and follows a hybrid approach (deep learning and structure-based) to detect elements, filter partial sequences and finally classify intact sequences into superfamilies and, as very few tools do, into lineages. This tool takes advantage of multi-core and GPU architectures to decrease execution times. Using the rice genome, Inpactor2 showed a run time of 5 minutes (faster than other tools) and has the best accuracy and F1-Score of the tools tested here, also having the second best accuracy and specificity only surpassed by EDTA, but achieving 28% higher sensitivity. For large genomes, Inpactor2 is up to seven times faster than other available bioinformatics tools. Oxford University Press 2022-12-10 /pmc/articles/PMC9851300/ /pubmed/36502372 http://dx.doi.org/10.1093/bib/bbac511 Text en © The Author(s) 2022. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Problem Solving Protocol Orozco-Arias, Simon Humberto Lopez-Murillo, Luis Candamil-Cortés, Mariana S Arias, Maradey Jaimes, Paula A Rossi Paschoal, Alexandre Tabares-Soto, Reinel Isaza, Gustavo Guyot, Romain Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes |
title | Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes |
title_full | Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes |
title_fullStr | Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes |
title_full_unstemmed | Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes |
title_short | Inpactor2: a software based on deep learning to identify and classify LTR-retrotransposons in plant genomes |
title_sort | inpactor2: a software based on deep learning to identify and classify ltr-retrotransposons in plant genomes |
topic | Problem Solving Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9851300/ https://www.ncbi.nlm.nih.gov/pubmed/36502372 http://dx.doi.org/10.1093/bib/bbac511 |
work_keys_str_mv | AT orozcoariassimon inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes AT humbertolopezmurilloluis inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes AT candamilcortesmarianas inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes AT ariasmaradey inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes AT jaimespaulaa inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes AT rossipaschoalalexandre inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes AT tabaressotoreinel inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes AT isazagustavo inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes AT guyotromain inpactor2asoftwarebasedondeeplearningtoidentifyandclassifyltrretrotransposonsinplantgenomes |