Feature-extraction and analysis based on spatial distribution of amino acids for SARS-CoV-2 Protein sequences
BACKGROUND AND OBJECTIVE: The world is currently facing a global emergency due to COVID-19, which requires immediate strategies to strengthen healthcare facilities and prevent further deaths. To achieve effective remedies and solutions, research on different aspects, including the genomic and proteo...
Main Authors: | Rout, Ranjeet Kumar; Hassan, Sk Sarif; Sheikh, Sabha; Umer, Saiyed; Sahoo, Kshira Sagar; Gandomi, Amir H. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Ltd. 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8577876/ https://www.ncbi.nlm.nih.gov/pubmed/34815067 http://dx.doi.org/10.1016/j.compbiomed.2021.105024 |
_version_ | 1784596152371380224 |
---|---|
author | Rout, Ranjeet Kumar; Hassan, Sk Sarif; Sheikh, Sabha; Umer, Saiyed; Sahoo, Kshira Sagar; Gandomi, Amir H. |
author_sort | Rout, Ranjeet Kumar |
collection | PubMed |
description | BACKGROUND AND OBJECTIVE: The world is currently facing a global emergency due to COVID-19, which requires immediate strategies to strengthen healthcare facilities and prevent further deaths. To achieve effective remedies and solutions, research on different aspects, including the genomic- and proteomic-level characterization of SARS-CoV-2, is critical. In this work, the spatial representation/composition and distribution frequency of the 20 amino acids across the primary protein sequences of SARS-CoV-2 were examined according to different parameters. METHOD: To identify the spatial distribution of amino acids over the primary protein sequences of SARS-CoV-2, the Hurst exponent and Shannon entropy were applied as parameters to capture the autocorrelation and the amount of information in the spatial representations. The frequency distribution of each amino acid over the protein sequences was also evaluated. For a one-dimensional sequence, the Hurst exponent (HE) was used to characterize fractality because of its linear relationship with the fractal dimension (D), i.e. $D = 2 - HE$. Moreover, binary Shannon entropy was used to measure the uncertainty in a binary sequence and was then applied to calculate amino acid conservation in the primary protein sequences. RESULTS AND CONCLUSION: Fourteen (14) SARS-CoV protein sequences were evaluated and compared with 105 SARS-CoV-2 proteins. The simulation results demonstrate differences in the amino acid spatial distribution between the SARS-CoV-2 and SARS-CoV proteins, enabling researchers to distinguish between the two types of CoV. The spatial arrangement of amino acids also reveals similarities and dissimilarities among the important structural proteins E, M, N and S, which is pivotal for establishing an evolutionary tree with other CoV strains. |
format | Online Article Text |
id | pubmed-8577876 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8577876 2021-11-10 Feature-extraction and analysis based on spatial distribution of amino acids for SARS-CoV-2 Protein sequences Rout, Ranjeet Kumar; Hassan, Sk Sarif; Sheikh, Sabha; Umer, Saiyed; Sahoo, Kshira Sagar; Gandomi, Amir H. Comput Biol Med Article Elsevier Ltd. 2022-02 2021-11-10 /pmc/articles/PMC8577876/ /pubmed/34815067 http://dx.doi.org/10.1016/j.compbiomed.2021.105024 Text en © 2021 Elsevier Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
title | Feature-extraction and analysis based on spatial distribution of amino acids for SARS-CoV-2 Protein sequences |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8577876/ https://www.ncbi.nlm.nih.gov/pubmed/34815067 http://dx.doi.org/10.1016/j.compbiomed.2021.105024 |
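
The METHOD portion of the description names three measurements on a primary protein sequence: the frequency of each amino acid, the binary Shannon entropy of a per-residue 0/1 sequence, and the Hurst exponent with its linear link to the fractal dimension. The record does not give the authors' implementation, so the following is only a minimal Python sketch under stated assumptions: the function names are hypothetical, the toy sequence is made up (it is not a SARS-CoV-2 protein), the Hurst exponent is estimated with a plain rescaled-range (R/S) fit, which may differ from the estimator used in the article, and the fractal dimension uses the standard relation D = 2 - HE for a one-dimensional series.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def frequency_distribution(seq):
    """Relative frequency of each standard amino acid in seq."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return {a: (counts[a] / total if total else 0.0) for a in AMINO_ACIDS}


def indicator(seq, residue):
    """Binary spatial representation: 1 where residue occurs, else 0."""
    return [1 if c == residue else 0 for c in seq]


def binary_shannon_entropy(bits):
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), with p the fraction of ones."""
    if not bits:
        return 0.0
    p = sum(bits) / len(bits)
    if p in (0.0, 1.0):
        return 0.0  # a constant sequence carries no uncertainty
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def hurst_rs(series, min_window=8):
    """Rescaled-range (R/S) estimate of the Hurst exponent (assumed estimator)."""
    n = len(series)
    log_sizes, log_rs = [], []
    size = min_window
    while size <= n // 2:
        rs_values = []
        for start in range(0, n - size + 1, size):
            chunk = series[start:start + size]
            mean = sum(chunk) / size
            deviations = [x - mean for x in chunk]
            running, cumulative = 0.0, []
            for d in deviations:
                running += d
                cumulative.append(running)
            value_range = max(cumulative) - min(cumulative)
            std = math.sqrt(sum(d * d for d in deviations) / size)
            if std > 0:  # skip constant windows
                rs_values.append(value_range / std)
        if rs_values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(rs_values) / len(rs_values)))
        size *= 2
    if len(log_sizes) < 2:
        raise ValueError("sequence too short for an R/S estimate")
    # Least-squares slope of log(R/S) against log(window size).
    mx = sum(log_sizes) / len(log_sizes)
    my = sum(log_rs) / len(log_rs)
    num = sum((x - mx) * (y - my) for x, y in zip(log_sizes, log_rs))
    den = sum((x - mx) ** 2 for x in log_sizes)
    return num / den


# Hypothetical toy sequence for illustration only.
toy_sequence = "MKTLLVAGLA" * 8
leucine_bits = indicator(toy_sequence, "L")
freqs = frequency_distribution(toy_sequence)
entropy_l = binary_shannon_entropy(leucine_bits)
hurst_l = hurst_rs(leucine_bits)
fractal_dim_l = 2 - hurst_l  # D = 2 - HE for a one-dimensional series
print(freqs["L"], entropy_l, hurst_l, fractal_dim_l)
```

Repeating the indicator, entropy and Hurst steps for each of the 20 residues would give a small feature vector per protein; how the article combines such values into its comparison of SARS-CoV and SARS-CoV-2 proteins is not stated in this record.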