
Frequent contiguous pattern mining over biological sequences of protein misfolded diseases

BACKGROUND: Proteins are an integral part of all living beings and are built from many amino acids. To be functionally active, the amino acid chain folds in a complex way to give each protein a unique 3D shape, where a minor error may cause a misfolded structure. Genetic disorder diseases i.e....

Bibliographic Details
Main Authors: Islam, Mohammad Shahedul; Mia, Md. Abul Kashem; Rahman, Mohammad Shamsur; Arefin, Mohammad Shamsul; Dhar, Pranab Kumar; Koshiba, Takeshi
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8436569/
https://www.ncbi.nlm.nih.gov/pubmed/34511072
http://dx.doi.org/10.1186/s12859-021-04341-y
author Islam, Mohammad Shahedul
Mia, Md. Abul Kashem
Rahman, Mohammad Shamsur
Arefin, Mohammad Shamsul
Dhar, Pranab Kumar
Koshiba, Takeshi
author_sort Islam, Mohammad Shahedul
collection PubMed
description BACKGROUND: Proteins are an integral part of all living beings and are built from many amino acids. To be functionally active, the amino acid chain folds in a complex way to give each protein a unique 3D shape, where a minor error may cause a misfolded structure. Genetic disorder diseases, e.g. Alzheimer's and Parkinson's, arise from misfolding in protein sequences. Thus, identifying patterns of amino acids is important for inferring protein-associated genetic diseases. Recent studies on predicting amino acid patterns focused only on a simple protein misfolded disease, namely Chromaffin Tumor, using association rule mining. However, more complex diseases have yet to be attempted. Moreover, the association rules obtained by these studies were not verified with usefulness measuring tools. RESULTS: In this work, we analyzed protein sequences associated with complex protein misfolded diseases (namely Sickle Cell Anemia, Breast Cancer, Cystic Fibrosis, Nephrogenic Diabetes Insipidus, and Retinitis Pigmentosa 4) using an association rule mining technique and objective interestingness measuring tools. Experimental results show the effectiveness of our method. CONCLUSION: By adopting quantitative experimental methods, this work can form more reliable, useful and strong association rules, i.e. dominating amino acid patterns of complex protein misfolded diseases. Thus, in addition to the usual applications, the identified patterns can be more useful in discovering medicines for protein misfolded diseases and may thereby open up new opportunities in medical science for handling genetic disorder diseases.
format Online
Article
Text
id pubmed-8436569
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8436569 2021-09-13 BMC Bioinformatics Research
BioMed Central 2021-09-11 /pmc/articles/PMC8436569/ /pubmed/34511072 http://dx.doi.org/10.1186/s12859-021-04341-y Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Frequent contiguous pattern mining over biological sequences of protein misfolded diseases
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8436569/
https://www.ncbi.nlm.nih.gov/pubmed/34511072
http://dx.doi.org/10.1186/s12859-021-04341-y