An optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm
BACKGROUND: Detection of important functional and/or structural elements and identification of their positions in a large eukaryotic genomic sequence are an active research area. Gene is an important functional and structural unit of DNA. The computation of gene prediction is, therefore, very essent...
Main authors: | Chowdhury, Biswanath; Garai, Arnav; Garai, Gautam |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5655831/ https://www.ncbi.nlm.nih.gov/pubmed/29065853 http://dx.doi.org/10.1186/s12859-017-1874-7 |
_version_ | 1783273611405033472 |
---|---|
author | Chowdhury, Biswanath Garai, Arnav Garai, Gautam |
author_facet | Chowdhury, Biswanath Garai, Arnav Garai, Gautam |
author_sort | Chowdhury, Biswanath |
collection | PubMed |
description | BACKGROUND: Detection of important functional and/or structural elements and identification of their positions in a large eukaryotic genomic sequence are an active research area. Gene is an important functional and structural unit of DNA. The computation of gene prediction is, therefore, very essential for detailed genome annotation. RESULTS: In this paper, we propose a new gene prediction technique based on Genetic Algorithm (GA) to determine the optimal positions of exons of a gene in a chromosome or genome. The correct identification of the coding and non-coding regions is difficult and computationally demanding. The proposed genetic-based method, named Gene Prediction with Genetic Algorithm (GPGA), reduces this problem by searching only one exon at a time instead of all exons along with its introns. This representation carries a significant advantage in that it breaks the entire gene-finding problem into a number of smaller sub-problems, thereby reducing the computational complexity. We tested the performance of the GPGA with existing benchmark datasets and compared the results with well-known and relevant techniques. The comparison shows the better or comparable performance of the proposed method. We also used GPGA for annotating the human chromosome 21 (HS21) using cross-species comparisons with the mouse orthologs. CONCLUSION: It was noted that the GPGA predicted true genes with better accuracy than other well-known approaches. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-1874-7) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5655831 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5655831 2017-10-31 An optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm Chowdhury, Biswanath Garai, Arnav Garai, Gautam BMC Bioinformatics Methodology Article BACKGROUND: Detection of important functional and/or structural elements and identification of their positions in a large eukaryotic genomic sequence are an active research area. Gene is an important functional and structural unit of DNA. The computation of gene prediction is, therefore, very essential for detailed genome annotation. RESULTS: In this paper, we propose a new gene prediction technique based on Genetic Algorithm (GA) to determine the optimal positions of exons of a gene in a chromosome or genome. The correct identification of the coding and non-coding regions is difficult and computationally demanding. The proposed genetic-based method, named Gene Prediction with Genetic Algorithm (GPGA), reduces this problem by searching only one exon at a time instead of all exons along with its introns. This representation carries a significant advantage in that it breaks the entire gene-finding problem into a number of smaller sub-problems, thereby reducing the computational complexity. We tested the performance of the GPGA with existing benchmark datasets and compared the results with well-known and relevant techniques. The comparison shows the better or comparable performance of the proposed method. We also used GPGA for annotating the human chromosome 21 (HS21) using cross-species comparisons with the mouse orthologs. CONCLUSION: It was noted that the GPGA predicted true genes with better accuracy than other well-known approaches. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-1874-7) contains supplementary material, which is available to authorized users. BioMed Central 2017-10-24 /pmc/articles/PMC5655831/ /pubmed/29065853 http://dx.doi.org/10.1186/s12859-017-1874-7 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Chowdhury, Biswanath Garai, Arnav Garai, Gautam An optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm |
title | An optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm |
title_sort | optimized approach for annotation of large eukaryotic genomic sequences using genetic algorithm |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5655831/ https://www.ncbi.nlm.nih.gov/pubmed/29065853 http://dx.doi.org/10.1186/s12859-017-1874-7 |