Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis
OBJECTIVE: With the rapid growth of biological sequence databases, handling this volume of data has become an increasing problem. Clustering has become one of the principal research objectives in structural and functional genomics. However, exact clustering algorithms, such as partit...
Main Authors: | K, Thenmozhi; N, Karthikeyani Visalakshi; S, Shanthi; M, Pyingkodi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | West Asia Organization for Cancer Prevention 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6318385/ https://www.ncbi.nlm.nih.gov/pubmed/30486549 http://dx.doi.org/10.31557/APJCP.2018.19.11.3105 |
_version_ | 1783384865368965120 |
---|---|
author | K, Thenmozhi N, Karthikeyani Visalakshi S, Shanthi M, Pyingkodi |
author_facet | K, Thenmozhi N, Karthikeyani Visalakshi S, Shanthi M, Pyingkodi |
author_sort | K, Thenmozhi |
collection | PubMed |
description | OBJECTIVE: With the rapid growth of biological sequence databases, handling this volume of data has become an increasing problem. Clustering has become one of the principal research objectives in structural and functional genomics. However, exact clustering algorithms, such as partitioned and hierarchical clustering, scale relatively poorly in run time and memory usage on large sets of sequences. METHODS: To address these performance limits, a heuristic optimization, the Cuckoo Search Algorithm with genetic operators (ICSA), was implemented in a distributed computing environment. The proposed ICSA is a global optimization algorithm that can cluster large numbers of protein sequences by running on distributed computing hardware. RESULTS: It allocates both memory and computing resources efficiently. Compared with the latest research results, our method requires only 15% of the execution time and obtains even higher-quality information about protein sequences. CONCLUSION: From the experimental analysis, we observed that clustering large protein sequence data sets with the ICSA technique, instead of alignment methods alone, greatly reduces execution time and improves the efficiency of this important task in molecular biology. Moreover, the new era of proteomics is providing us with extensive knowledge of mutations and other alterations in cancer studies. |
format | Online Article Text |
id | pubmed-6318385 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | West Asia Organization for Cancer Prevention |
record_format | MEDLINE/PubMed |
spelling | pubmed-63183852019-01-14 Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis K, Thenmozhi N, Karthikeyani Visalakshi S, Shanthi M, Pyingkodi Asian Pac J Cancer Prev Research Article OBJECTIVE: With the over saturating growth of biological sequence databases, handling of these amounts of data has increasingly become a problem. Clustering has become one of the principal research objectives in structural and functional genomics. However, exact clustering algorithms, such as partitioned and hierarchical clustering, scale relatively poorly in terms of run time and memory usage with large sets of sequences. METHODS: From these performance limits, heuristic optimizations such as Cuckoo Search Algorithm with genetic operators (ICSA) algorithm have been implemented in distributed computing environment. The proposed ICSA, a global optimized algorithm that can cluster large numbers of protein sequences by running on distributed computing hardware. RESULTS: It allocates both memory and computing resources efficiently. Compare with the latest research results, our method requires only 15% of the execution time and obtains even higher quality information of protein sequence. CONCLUSION: From the experimental analysis, We noticed that the cluster of large protein sequence data sets using ICSA technique instead of only alignment methods reduce extremely the execution time and improve the efficiency of this important task in molecular biology. Moreover, the new era of proteomics is providing us with extensive knowledge of mutations and other alterations in cancer study. West Asia Organization for Cancer Prevention 2018 /pmc/articles/PMC6318385/ /pubmed/30486549 http://dx.doi.org/10.31557/APJCP.2018.19.11.3105 Text en Copyright: © Asian Pacific Journal of Cancer Prevention http://creativecommons.org/licenses/BY-SA/4.0 This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License |
spellingShingle | Research Article K, Thenmozhi N, Karthikeyani Visalakshi S, Shanthi M, Pyingkodi Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis |
title | Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis |
title_full | Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis |
title_fullStr | Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis |
title_full_unstemmed | Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis |
title_short | Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis |
title_sort | distributed icsa clustering approach for large scale protein sequences and cancer diagnosis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6318385/ https://www.ncbi.nlm.nih.gov/pubmed/30486549 http://dx.doi.org/10.31557/APJCP.2018.19.11.3105 |
work_keys_str_mv | AT kthenmozhi distributedicsaclusteringapproachforlargescaleproteinsequencesandcancerdiagnosis AT nkarthikeyanivisalakshi distributedicsaclusteringapproachforlargescaleproteinsequencesandcancerdiagnosis AT sshanthi distributedicsaclusteringapproachforlargescaleproteinsequencesandcancerdiagnosis AT mpyingkodi distributedicsaclusteringapproachforlargescaleproteinsequencesandcancerdiagnosis |