Integrated entropy-based approach for analyzing exons and introns in DNA sequences

BACKGROUND: Numerous essential algorithms and methods, including entropy-based quantitative methods, have been developed over the last decade to analyze complex DNA sequences. Exons and introns are the most notable components of DNA, and their identification and prediction remain a focus of st...

Bibliographic Details
Main Authors: Li, Junyi; Zhang, Li; Li, Huinian; Ping, Yuan; Xu, Qingzhe; Wang, Rongjie; Tan, Renjie; Wang, Zhen; Liu, Bo; Wang, Yadong
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6557737/
https://www.ncbi.nlm.nih.gov/pubmed/31182012
http://dx.doi.org/10.1186/s12859-019-2772-y
author Li, Junyi
Zhang, Li
Li, Huinian
Ping, Yuan
Xu, Qingzhe
Wang, Rongjie
Tan, Renjie
Wang, Zhen
Liu, Bo
Wang, Yadong
author_sort Li, Junyi
collection PubMed
description BACKGROUND: Numerous essential algorithms and methods, including entropy-based quantitative methods, have been developed over the last decade to analyze complex DNA sequences. Exons and introns are the most notable components of DNA, and their identification and prediction remain a focus of state-of-the-art research. RESULTS: In this study, we designed an integrated entropy-based analysis approach, which combines a modified topological entropy calculation, a genomic signal processing (GSP) method, and singular value decomposition (SVD), to investigate exons and introns in DNA sequences. We optimized and implemented the topological entropy and the generalized topological entropy to calculate the complexity of DNA sequences, highlighting the characteristics of repetitive sequences. By comparing the digitized entropy values of exons and introns, we observed that they differ significantly. After converting DNA data to numerical topological entropy values, we applied SVD to investigate exon and intron regions on a single gene sequence. Additionally, several genes across five species were used for exon prediction. CONCLUSIONS: Our approach not only helps to explore the complexity of DNA sequences and their functional elements, but also provides an entropy-based GSP method to analyze exon and intron regions. Our approach is applicable across different species and can be extended to analyze other components in both coding and noncoding regions of DNA sequences. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2772-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6557737
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6557737 2019-06-13 Integrated entropy-based approach for analyzing exons and introns in DNA sequences. Li, Junyi; Zhang, Li; Li, Huinian; Ping, Yuan; Xu, Qingzhe; Wang, Rongjie; Tan, Renjie; Wang, Zhen; Liu, Bo; Wang, Yadong. BMC Bioinformatics. Research.
BioMed Central 2019-06-10 /pmc/articles/PMC6557737/ /pubmed/31182012 http://dx.doi.org/10.1186/s12859-019-2772-y Text en © The Author(s) 2019. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Integrated entropy-based approach for analyzing exons and introns in DNA sequences
topic Research