Integrated entropy-based approach for analyzing exons and introns in DNA sequences
BACKGROUND: Numerous essential algorithms and methods, including entropy-based quantitative methods, have been developed over the last decade to analyze complex DNA sequences. Exons and introns are the most notable components of DNA, and their identification and prediction are always the focus of st...
Main authors: | Li, Junyi; Zhang, Li; Li, Huinian; Ping, Yuan; Xu, Qingzhe; Wang, Rongjie; Tan, Renjie; Wang, Zhen; Liu, Bo; Wang, Yadong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6557737/ https://www.ncbi.nlm.nih.gov/pubmed/31182012 http://dx.doi.org/10.1186/s12859-019-2772-y |
_version_ | 1783425481528311808 |
---|---|
author | Li, Junyi Zhang, Li Li, Huinian Ping, Yuan Xu, Qingzhe Wang, Rongjie Tan, Renjie Wang, Zhen Liu, Bo Wang, Yadong |
author_facet | Li, Junyi Zhang, Li Li, Huinian Ping, Yuan Xu, Qingzhe Wang, Rongjie Tan, Renjie Wang, Zhen Liu, Bo Wang, Yadong |
author_sort | Li, Junyi |
collection | PubMed |
description | BACKGROUND: Numerous essential algorithms and methods, including entropy-based quantitative methods, have been developed over the last decade to analyze complex DNA sequences. Exons and introns are the most notable components of DNA, and their identification and prediction are always the focus of state-of-the-art research. RESULTS: In this study, we designed an integrated entropy-based analysis approach, which combines a modified topological entropy calculation, a genomic signal processing (GSP) method, and singular value decomposition (SVD), to investigate exons and introns in DNA sequences. We optimized and implemented the topological entropy and the generalized topological entropy to calculate the complexity of DNA sequences, highlighting the characteristics of repetitive sequences. By comparing the digitized entropy values of exons and introns, we observed that they are significantly different. After converting DNA data to numerical topological entropy values, we applied SVD to investigate exon and intron regions on a single gene sequence. Additionally, several genes across five species were used for exon prediction. CONCLUSIONS: Our approach not only helps to explore the complexity of DNA sequences and their functional elements, but also provides an entropy-based GSP method to analyze exon and intron regions. Our work is feasible across different species and can be extended to analyze other components in both coding and noncoding regions of DNA sequences. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2772-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6557737 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6557737 2019-06-13 Integrated entropy-based approach for analyzing exons and introns in DNA sequences Li, Junyi Zhang, Li Li, Huinian Ping, Yuan Xu, Qingzhe Wang, Rongjie Tan, Renjie Wang, Zhen Liu, Bo Wang, Yadong BMC Bioinformatics Research 
BioMed Central 2019-06-10 /pmc/articles/PMC6557737/ /pubmed/31182012 http://dx.doi.org/10.1186/s12859-019-2772-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Li, Junyi Zhang, Li Li, Huinian Ping, Yuan Xu, Qingzhe Wang, Rongjie Tan, Renjie Wang, Zhen Liu, Bo Wang, Yadong Integrated entropy-based approach for analyzing exons and introns in DNA sequences |
title | Integrated entropy-based approach for analyzing exons and introns in DNA sequences |
title_full | Integrated entropy-based approach for analyzing exons and introns in DNA sequences |
title_fullStr | Integrated entropy-based approach for analyzing exons and introns in DNA sequences |
title_full_unstemmed | Integrated entropy-based approach for analyzing exons and introns in DNA sequences |
title_short | Integrated entropy-based approach for analyzing exons and introns in DNA sequences |
title_sort | integrated entropy-based approach for analyzing exons and introns in dna sequences |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6557737/ https://www.ncbi.nlm.nih.gov/pubmed/31182012 http://dx.doi.org/10.1186/s12859-019-2772-y |
work_keys_str_mv | AT lijunyi integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT zhangli integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT lihuinian integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT pingyuan integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT xuqingzhe integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT wangrongjie integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT tanrenjie integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT wangzhen integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT liubo integratedentropybasedapproachforanalyzingexonsandintronsindnasequences AT wangyadong integratedentropybasedapproachforanalyzingexonsandintronsindnasequences |
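
The description above mentions converting DNA data to numerical topological entropy values. As a rough illustration only, the following sketch computes the topological entropy of a finite sequence in the commonly cited Koslicki formulation: pick the largest word length n such that 4^n + n - 1 does not exceed the sequence length, keep the first 4^n + n - 1 letters, count the distinct words of length n, and take the base-4 logarithm divided by n. This is an assumption about the underlying measure, not the authors' modified or generalized version, and the function name `topological_entropy` is hypothetical.

```python
import math


def topological_entropy(seq: str, alphabet_size: int = 4) -> float:
    """Topological entropy of a finite sequence (Koslicki-style sketch).

    Assumption: this follows the basic definition, not the modified
    calculation described in the record above.
    """
    length = len(seq)

    # Find the largest n with alphabet_size**n + n - 1 <= len(seq).
    n = 0
    while alphabet_size ** (n + 1) + (n + 1) - 1 <= length:
        n += 1
    if n == 0:
        raise ValueError("sequence is too short for a word length of 1")

    # Truncate so every sequence with the same n is judged on equal footing.
    prefix = seq[: alphabet_size ** n + n - 1]

    # Count distinct words of length n in the truncated prefix.
    words = {prefix[i : i + n] for i in range(len(prefix) - n + 1)}

    # Normalize: 1.0 means every possible word of length n appears.
    return math.log(len(words), alphabet_size) / n


if __name__ == "__main__":
    for example in ("ACGT", "AAAA", "ACGT" * 5):
        print(example, topological_entropy(example))
```

Values near 0 indicate a highly repetitive sequence and values near 1 indicate that almost every possible word of length n occurs. Applying the function to sliding windows along a gene would yield a numerical profile of the kind the description refers to; the window size and the subsequent SVD step are left out here because the record does not specify them.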