Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability
The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cr...
Main authors: | Dehipawala, Sunil; Cheung, Eric; Tremberger, George; Cheung, Tak |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8393862/ https://www.ncbi.nlm.nih.gov/pubmed/34441178 http://dx.doi.org/10.3390/e23081038 |
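
The summary above defines a low complexity domain in terms of entropy computed over a 12 amino acid sliding window. A minimal sketch of that idea follows, assuming Shannon entropy in bits (log base 2) and a step of one residue; the record does not state the log base, the step, or any cutoff, so those choices, the function names, and the toy sequence are illustrative assumptions rather than the authors' implementation.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy, in bits per symbol, of the symbols in `seq`."""
    counts = Counter(seq)
    n = len(seq)
    # An empty sequence has no symbols, so the sum is empty and the result is 0.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def sliding_window_entropy(seq, window=12):
    """Entropy of every length-`window` slice, advancing one symbol at a time."""
    return [
        shannon_entropy(seq[i:i + window])
        for i in range(len(seq) - window + 1)
    ]


# Hypothetical toy sequence (not taken from TDP-43): a run of one residue
# followed by twelve distinct residues.
toy = "GGGGGGGGGGGG" + "ACDEFHIKLMNP"
profile = sliding_window_entropy(toy)

# Whole-sequence entropy, i.e. the "no sliding window" variant.
whole = shannon_entropy(toy)
```

In this toy profile the first window (a single repeated residue) has zero entropy and the last window (twelve distinct residues) has the maximum possible for a window of 12, which is the kind of contrast a low-complexity scan relies on.
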
_version_ | 1783743820622462976 |
---|---|
author | Dehipawala, Sunil Cheung, Eric Tremberger, George Cheung, Tak |
author_facet | Dehipawala, Sunil Cheung, Eric Tremberger, George Cheung, Tak |
author_sort | Dehipawala, Sunil |
collection | PubMed |
description | The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cryo-EM data was published recently. An application of entropy and Higuchi fractal dimension calculations was described using the Znf521 and HAR1 sequences. A computational analysis of the intra-LCD sequence entropy and Higuchi fractal dimension values at the amino acid level and at the ATCG nucleotide level was conducted without the sliding window requirement. The computational results were consistent in predicting the intermediate entropy/fractal dimension value produced when two subsequences at two different entropy/fractal dimension values were combined. The computational method without the application of a sliding window was extended to an analysis of the recently reported virulent genes (Orf6, Nsp6, and Orf7a) in SARS-CoV-2. The relationship between the virulence functionality and entropy values was found to have correlation coefficients between 0.84 and 0.99, using a 5% uncertainty on the cell viability data. The analysis found that the most virulent Orf6 gene sequence had the lowest nucleotide entropy and the highest protein fractal dimension, in line with extreme value theory. The Orf6 codon usage bias in relation to vaccine design was discussed. |
format | Online Article Text |
id | pubmed-8393862 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8393862 2021-08-28 Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability Dehipawala, Sunil Cheung, Eric Tremberger, George Cheung, Tak Entropy (Basel) Article The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cryo-EM data was published recently. An application of entropy and Higuchi fractal dimension calculations was described using the Znf521 and HAR1 sequences. A computational analysis of the intra-LCD sequence entropy and Higuchi fractal dimension values at the amino acid level and at the ATCG nucleotide level was conducted without the sliding window requirement. The computational results were consistent in predicting the intermediate entropy/fractal dimension value produced when two subsequences at two different entropy/fractal dimension values were combined. The computational method without the application of a sliding window was extended to an analysis of the recently reported virulent genes (Orf6, Nsp6, and Orf7a) in SARS-CoV-2. The relationship between the virulence functionality and entropy values was found to have correlation coefficients between 0.84 and 0.99, using a 5% uncertainty on the cell viability data. The analysis found that the most virulent Orf6 gene sequence had the lowest nucleotide entropy and the highest protein fractal dimension, in line with extreme value theory. The Orf6 codon usage bias in relation to vaccine design was discussed. MDPI 2021-08-12 /pmc/articles/PMC8393862/ /pubmed/34441178 http://dx.doi.org/10.3390/e23081038 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Dehipawala, Sunil Cheung, Eric Tremberger, George Cheung, Tak Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability |
title | Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability |
title_sort | entropy and fractal dimension study of the tdp-43 protein low complexity domain sequence in als disease severity and sars-cov-2 gene sequences in virulence variability |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8393862/ https://www.ncbi.nlm.nih.gov/pubmed/34441178 http://dx.doi.org/10.3390/e23081038 |
work_keys_str_mv | AT dehipawalasunil entropyandfractaldimensionstudyofthetdp43proteinlowcomplexitydomainsequenceinalsdiseaseseverityandsarscov2genesequencesinvirulencevariability AT cheungeric entropyandfractaldimensionstudyofthetdp43proteinlowcomplexitydomainsequenceinalsdiseaseseverityandsarscov2genesequencesinvirulencevariability AT trembergergeorge entropyandfractaldimensionstudyofthetdp43proteinlowcomplexitydomainsequenceinalsdiseaseseverityandsarscov2genesequencesinvirulencevariability AT cheungtak entropyandfractaldimensionstudyofthetdp43proteinlowcomplexitydomainsequenceinalsdiseaseseverityandsarscov2genesequencesinvirulencevariability |
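
The description field in this record refers to Higuchi fractal dimension values computed on amino acid and nucleotide sequences. As a rough illustration only, the sketch below implements the standard Higuchi curve-length estimate for a numeric series and takes the fractal dimension as the least-squares slope of log L(k) against log(1/k). The record does not say how residues are converted to numbers or which `kmax` was used, so the `CODE` mapping, the `kmax` values, and the toy sequence are hypothetical.

```python
import math


def higuchi_fd(series, kmax=8):
    """Higuchi fractal dimension of a numeric series (standard formulation)."""
    x = [float(v) for v in series]
    n = len(x)
    if kmax < 2 or n < 2 * kmax:
        raise ValueError("need kmax >= 2 and at least 2 * kmax samples")

    log_inv_k, log_len = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):  # starting offset of each subsampled curve
            steps = (n - 1 - m) // k
            dist = sum(
                abs(x[m + i * k] - x[m + (i - 1) * k])
                for i in range(1, steps + 1)
            )
            # Normalise for the number of steps, then divide by the lag k.
            lengths.append(dist * (n - 1) / (steps * k) / k)
        mean_len = sum(lengths) / k
        if mean_len <= 0:
            raise ValueError("constant series: curve length is zero")
        log_inv_k.append(math.log(1.0 / k))
        log_len.append(math.log(mean_len))

    # Ordinary least-squares slope of log L(k) versus log(1/k).
    mx = sum(log_inv_k) / kmax
    my = sum(log_len) / kmax
    num = sum((a - mx) * (b - my) for a, b in zip(log_inv_k, log_len))
    den = sum((a - mx) ** 2 for a in log_inv_k)
    return num / den


# Hypothetical numeric coding of nucleotides, for illustration only.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
toy = "ATGCGTACGTTAGCCGATAGCTAGGCTA"  # made-up sequence, not a viral gene
fd_toy = higuchi_fd([CODE[b] for b in toy], kmax=4)

# Sanity check: a straight line should come out with dimension close to 1.
fd_line = higuchi_fd(range(100), kmax=8)
```

The straight-line case is a convenient check because its curve length at lag k is exactly (N - 1) / k, which gives a slope of 1.
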