
A multi-objective gene clustering algorithm guided by apriori biological knowledge with intensification and diversification strategies

BACKGROUND: Biologists aim to understand the genetic background of diseases, metabolic disorders or any other genetic condition. Microarrays are one of the main high-throughput technologies for collecting information about the behaviour of genetic information on different conditions. In order to ana...


Bibliographic Details
Main authors: Parraga-Alava, Jorge; Dorn, Marcio; Inostroza-Ponta, Mario
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6081857/
https://www.ncbi.nlm.nih.gov/pubmed/30100924
http://dx.doi.org/10.1186/s13040-018-0178-4
author Parraga-Alava, Jorge
Dorn, Marcio
Inostroza-Ponta, Mario
collection PubMed
description BACKGROUND: Biologists aim to understand the genetic background of diseases, metabolic disorders, and other genetic conditions. Microarrays are one of the main high-throughput technologies for collecting information about the behaviour of genetic information under different conditions. Clustering is one of the main techniques used to analyse these data; it aims to find groups of genes that share some criterion, such as a similar expression profile. However, the problem of finding groups is normally multidimensional, which makes it necessary to treat clustering as a multi-objective problem in which several cluster validity indexes are optimised simultaneously. These indexes are usually based on criteria such as compactness and separation, which may not be sufficient because they cannot guarantee clusters that have both similar expression patterns and biological coherence. METHOD: We propose a Multi-Objective Clustering algorithm Guided by a-Priori Biological Knowledge (MOC-GaPBK) to find clusters of genes with high levels of co-expression and biological coherence, as well as good compactness and separation. Cluster quality indexes are used to simultaneously optimise gene relationships at the level of expression and of biological functionality. Our proposal also includes intensification and diversification strategies to improve the search process. RESULTS: The effectiveness of the proposed algorithm is demonstrated on four publicly available datasets. Comparative studies of different objective functions and of other widely used microarray clustering techniques are reported. Statistical, visual and biological significance tests are carried out to show the superiority of the proposed algorithm.
CONCLUSIONS: Integrating a-priori biological knowledge into a multi-objective approach and using intensification and diversification strategies allow the proposed algorithm to find higher-quality solutions than other microarray clustering techniques available in the literature in terms of co-expression, biological coherence, compactness and separation.
format Online
Article
Text
id pubmed-6081857
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6081857 2018-08-10 A multi-objective gene clustering algorithm guided by apriori biological knowledge with intensification and diversification strategies. Parraga-Alava, Jorge; Dorn, Marcio; Inostroza-Ponta, Mario. BioData Min. Methodology. BioMed Central 2018-08-07 /pmc/articles/PMC6081857/ /pubmed/30100924 http://dx.doi.org/10.1186/s13040-018-0178-4 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A multi-objective gene clustering algorithm guided by apriori biological knowledge with intensification and diversification strategies
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6081857/
https://www.ncbi.nlm.nih.gov/pubmed/30100924
http://dx.doi.org/10.1186/s13040-018-0178-4