DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and regions

MOTIVATION: Recognition of different genomic signals and regions (GSRs) in DNA is crucial for understanding genome organization, gene regulation, and gene function, which in turn generate better genome and gene annotations. Although many methods have been developed to recognize GSRs, their pure comp...

Bibliographic Details
Main authors: Kalkatawi, Manal; Magana-Mora, Arturo; Jankovic, Boris; Bajic, Vladimir B
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6449759/
https://www.ncbi.nlm.nih.gov/pubmed/30184052
http://dx.doi.org/10.1093/bioinformatics/bty752
_version_ 1783408917408120832
author Kalkatawi, Manal
Magana-Mora, Arturo
Jankovic, Boris
Bajic, Vladimir B
collection PubMed
description MOTIVATION: Recognition of different genomic signals and regions (GSRs) in DNA is crucial for understanding genome organization, gene regulation, and gene function, which in turn generates better genome and gene annotations. Although many methods have been developed to recognize GSRs, their purely computational identification remains challenging. Moreover, various GSRs usually require a specialized set of features for developing robust recognition models. Recently, deep-learning (DL) methods have been shown to generate more accurate prediction models than ‘shallow’ methods without the need to develop specialized features for the problems in question. Here, we explore the potential use of DL for the recognition of GSRs. RESULTS: We developed DeepGSR, an optimized DL architecture for the prediction of different types of GSRs. The performance of the DeepGSR structure is evaluated on the recognition of polyadenylation signals (PAS) and translation initiation sites (TIS) of different organisms: human, mouse, bovine and fruit fly. The results show that DeepGSR outperformed the state-of-the-art methods, reducing the classification error rate of the PAS and TIS prediction in the human genome by up to 29% and 86%, respectively. Moreover, the cross-organism and genome-wide analyses we performed confirmed the robustness of DeepGSR and provided new insights into the conservation of examined GSRs across species. AVAILABILITY AND IMPLEMENTATION: DeepGSR is implemented in Python using the Keras API; it is available as open-source software and can be obtained at https://doi.org/10.5281/zenodo.1117159. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6449759
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6449759 2019-04-09 DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and regions Kalkatawi, Manal Magana-Mora, Arturo Jankovic, Boris Bajic, Vladimir B Bioinformatics Original Papers
Oxford University Press 2019-04-01 2018-09-01 /pmc/articles/PMC6449759/ /pubmed/30184052 http://dx.doi.org/10.1093/bioinformatics/bty752 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and regions
topic Original Papers