
Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability

The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cr...
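As an illustration of the sliding-window entropy idea mentioned in the abstract, here is a minimal sketch. It assumes plain Shannon entropy (in bits) over the residue frequencies inside each 12-residue window. The function names and the toy sequence are hypothetical and are not taken from the article, and no cutoff value for calling a window "low complexity" is implied.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy in bits per symbol of the symbols in seq."""
    n = len(seq)
    counts = Counter(seq)
    # p * log2(1/p) summed over the observed symbols
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def windowed_entropy(seq, window=12):
    """Entropy of each length-`window` slice, advancing one residue at a time."""
    return [
        shannon_entropy(seq[i:i + window])
        for i in range(len(seq) - window + 1)
    ]


# Hypothetical toy sequence (not TDP-43): a homopolymer run
# followed by twelve distinct residues.
toy = "GGGGGGGGGGGG" + "ACDEFGHIKLMN"
profile = windowed_entropy(toy)
```

The first window of the toy sequence contains a single residue type and so has zero entropy, while the last window contains twelve distinct residues and reaches the maximum for that window size, log2(12). Calling `shannon_entropy` on a whole sequence gives the corresponding value without a sliding window.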


Bibliographic Details
Main authors: Dehipawala, Sunil, Cheung, Eric, Tremberger, George, Cheung, Tak
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8393862/
https://www.ncbi.nlm.nih.gov/pubmed/34441178
http://dx.doi.org/10.3390/e23081038
author Dehipawala, Sunil
Cheung, Eric
Tremberger, George
Cheung, Tak
collection PubMed
description The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cryo-EM data was published recently. An application of entropy and Higuchi fractal dimension calculations was described using the Znf521 and HAR1 sequences. A computational analysis of the intra-LCD sequence entropy and Higuchi fractal dimension values at the amino acid level and at the ATCG nucleotide level was conducted without the sliding window requirement. The computational results were consistent in predicting the intermediate entropy/fractal dimension value produced when two subsequences at two different entropy/fractal dimension values were combined. The computational method without the application of a sliding window was extended to an analysis of the recently reported virulent genes (Orf6, Nsp6, and Orf7a) in SARS-CoV-2. The relationship between the virulence functionality and entropy values was found to have correlation coefficients between 0.84 and 0.99, using a 5% uncertainty on the cell viability data. The analysis found that the most virulent Orf6 gene sequence had the lowest nucleotide entropy and the highest protein fractal dimension, in line with extreme value theory. The Orf6 codon usage bias in relation to vaccine design was discussed.
format Online
Article
Text
id pubmed-8393862
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8393862 2021-08-28 Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability Dehipawala, Sunil Cheung, Eric Tremberger, George Cheung, Tak Entropy (Basel) Article The low complexity domain (LCD) sequence has been defined in terms of entropy using a 12 amino acid sliding window along a protein sequence in the study of disease-related genes. The amyotrophic lateral sclerosis (ALS)-related TDP-43 protein sequence with intra-LCD structural information based on cryo-EM data was published recently. An application of entropy and Higuchi fractal dimension calculations was described using the Znf521 and HAR1 sequences. A computational analysis of the intra-LCD sequence entropy and Higuchi fractal dimension values at the amino acid level and at the ATCG nucleotide level was conducted without the sliding window requirement. The computational results were consistent in predicting the intermediate entropy/fractal dimension value produced when two subsequences at two different entropy/fractal dimension values were combined. The computational method without the application of a sliding window was extended to an analysis of the recently reported virulent genes (Orf6, Nsp6, and Orf7a) in SARS-CoV-2. The relationship between the virulence functionality and entropy values was found to have correlation coefficients between 0.84 and 0.99, using a 5% uncertainty on the cell viability data. The analysis found that the most virulent Orf6 gene sequence had the lowest nucleotide entropy and the highest protein fractal dimension, in line with extreme value theory. The Orf6 codon usage bias in relation to vaccine design was discussed. MDPI 2021-08-12 /pmc/articles/PMC8393862/ /pubmed/34441178 http://dx.doi.org/10.3390/e23081038 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Entropy and Fractal Dimension Study of the TDP-43 Protein Low Complexity Domain Sequence in ALS Disease Severity and SARS-CoV-2 Gene Sequences in Virulence Variability
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8393862/
https://www.ncbi.nlm.nih.gov/pubmed/34441178
http://dx.doi.org/10.3390/e23081038