
A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related

BACKGROUND: The ageing of the worldwide population means there is a growing need for research on the biology of ageing. DNA damage is likely a key contributor to the ageing process and elucidating the role of different DNA repair systems in ageing is of great interest. In this paper we propose a dat...


Bibliographic Details
Main authors: Freitas, Alex A, Vasieva, Olga, de Magalhães, João Pedro
Format: Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3031233/
https://www.ncbi.nlm.nih.gov/pubmed/21226956
http://dx.doi.org/10.1186/1471-2164-12-27
author Freitas, Alex A
Vasieva, Olga
de Magalhães, João Pedro
collection PubMed
description BACKGROUND: The ageing of the worldwide population means there is a growing need for research on the biology of ageing. DNA damage is likely a key contributor to the ageing process and elucidating the role of different DNA repair systems in ageing is of great interest. In this paper we propose a data mining approach, based on classification methods (decision trees and Naive Bayes), for analysing data about human DNA repair genes. The goal is to build classification models that allow us to discriminate between ageing-related and non-ageing-related DNA repair genes, in order to better understand their different properties. RESULTS: The main patterns discovered by the classification methods are as follows: (a) the number of protein-protein interactions was a predictor of DNA repair proteins being ageing-related; (b) the use of predictor attributes based on protein-protein interactions considerably increased predictive accuracy of attributes based on Gene Ontology (GO) annotations; (c) GO terms related to "response to stimulus" seem reasonably good predictors of ageing-relatedness for DNA repair genes; (d) interaction with the XRCC5 (Ku80) protein is a strong predictor of ageing-relatedness for DNA repair genes; and (e) DNA repair genes with a high expression in T lymphocytes are more likely to be ageing-related. CONCLUSIONS: The above patterns are broadly integrated in an analysis discussing relations between Ku, the non-homologous end joining DNA repair pathway, ageing and lymphocyte development. These patterns and their analysis support non-homologous end joining double strand break repair as central to the ageing-relatedness of DNA repair genes. Our work also showcases the use of protein interaction partners to improve accuracy in data mining methods and our approach could be applied to other ageing-related pathways.
format Text
id pubmed-3031233
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3031233 2011-02-01 A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related Freitas, Alex A Vasieva, Olga de Magalhães, João Pedro BMC Genomics Research Article BioMed Central 2011-01-12 /pmc/articles/PMC3031233/ /pubmed/21226956 http://dx.doi.org/10.1186/1471-2164-12-27 Text en Copyright ©2011 Freitas et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3031233/
https://www.ncbi.nlm.nih.gov/pubmed/21226956
http://dx.doi.org/10.1186/1471-2164-12-27