A systematic review of the application of machine learning in the detection and classification of transposable elements
BACKGROUND: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting an...
Main authors: | Orozco-Arias, Simon; Isaza, Gustavo; Guyot, Romain; Tabares-Soto, Reinel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2019 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6967008/ https://www.ncbi.nlm.nih.gov/pubmed/31976169 http://dx.doi.org/10.7717/peerj.8311 |
_version_ | 1783488863025496064 |
---|---|
author | Orozco-Arias, Simon; Isaza, Gustavo; Guyot, Romain; Tabares-Soto, Reinel |
author_sort | Orozco-Arias, Simon |
collection | PubMed |
description | BACKGROUND: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to solving several scientific problems. METHODOLOGY: We followed the Systematic Literature Review (SLR) process, applying the six stages of the review protocol from it, but added a previous stage, which aims to detect the need for a review. Then search equations were formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer research questions. RESULTS: Several ML approaches have already been tested on other bioinformatics problems with promising results, yet there are few algorithms and architectures available in literature focused specifically on TEs, despite representing the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. CONCLUSIONS: ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their utilization in TE analyses is still limited. Following the SLR, it was possible to notice that the use of ML for TE analyses (detection and classification) is an open problem, and this new field of research is growing in interest. |
format | Online Article Text |
id | pubmed-6967008 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-69670082020-01-23 A systematic review of the application of machine learning in the detection and classification of transposable elements Orozco-Arias, Simon Isaza, Gustavo Guyot, Romain Tabares-Soto, Reinel PeerJ Bioinformatics BACKGROUND: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to solving several scientific problems. METHODOLOGY: We followed the Systematic Literature Review (SLR) process, applying the six stages of the review protocol from it, but added a previous stage, which aims to detect the need for a review. Then search equations were formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer research questions. RESULTS: Several ML approaches have already been tested on other bioinformatics problems with promising results, yet there are few algorithms and architectures available in literature focused specifically on TEs, despite representing the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. CONCLUSIONS: ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their utilization in TE analyses is still limited. Following the SLR, it was possible to notice that the use of ML for TE analyses (detection and classification) is an open problem, and this new field of research is growing in interest. PeerJ Inc. 
2019-12-18 /pmc/articles/PMC6967008/ /pubmed/31976169 http://dx.doi.org/10.7717/peerj.8311 Text en © 2019 Orozco-Arias et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
title | A systematic review of the application of machine learning in the detection and classification of transposable elements |
topic | Bioinformatics |