A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related
BACKGROUND: The ageing of the worldwide population means there is a growing need for research on the biology of ageing. DNA damage is likely a key contributor to the ageing process and elucidating the role of different DNA repair systems in ageing is of great interest. In this paper we propose a dat...
Main Authors: | Freitas, Alex A; Vasieva, Olga; de Magalhães, João Pedro |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3031233/ https://www.ncbi.nlm.nih.gov/pubmed/21226956 http://dx.doi.org/10.1186/1471-2164-12-27 |
_version_ | 1782197332252557312 |
---|---|
author | Freitas, Alex A; Vasieva, Olga; de Magalhães, João Pedro |
author_facet | Freitas, Alex A; Vasieva, Olga; de Magalhães, João Pedro |
author_sort | Freitas, Alex A |
collection | PubMed |
description | BACKGROUND: The ageing of the worldwide population means there is a growing need for research on the biology of ageing. DNA damage is likely a key contributor to the ageing process and elucidating the role of different DNA repair systems in ageing is of great interest. In this paper we propose a data mining approach, based on classification methods (decision trees and Naive Bayes), for analysing data about human DNA repair genes. The goal is to build classification models that allow us to discriminate between ageing-related and non-ageing-related DNA repair genes, in order to better understand their different properties. RESULTS: The main patterns discovered by the classification methods are as follows: (a) the number of protein-protein interactions was a predictor of DNA repair proteins being ageing-related; (b) the use of predictor attributes based on protein-protein interactions considerably increased predictive accuracy of attributes based on Gene Ontology (GO) annotations; (c) GO terms related to "response to stimulus" seem reasonably good predictors of ageing-relatedness for DNA repair genes; (d) interaction with the XRCC5 (Ku80) protein is a strong predictor of ageing-relatedness for DNA repair genes; and (e) DNA repair genes with a high expression in T lymphocytes are more likely to be ageing-related. CONCLUSIONS: The above patterns are broadly integrated in an analysis discussing relations between Ku, the non-homologous end joining DNA repair pathway, ageing and lymphocyte development. These patterns and their analysis support non-homologous end joining double strand break repair as central to the ageing-relatedness of DNA repair genes. Our work also showcases the use of protein interaction partners to improve accuracy in data mining methods and our approach could be applied to other ageing-related pathways. |
format | Text |
id | pubmed-3031233 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3031233 2011-02-01 A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related Freitas, Alex A; Vasieva, Olga; de Magalhães, João Pedro BMC Genomics Research Article BioMed Central 2011-01-12 /pmc/articles/PMC3031233/ /pubmed/21226956 http://dx.doi.org/10.1186/1471-2164-12-27 Text en Copyright ©2011 Freitas et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Freitas, Alex A Vasieva, Olga de Magalhães, João Pedro A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related |
title | A data mining approach for classifying DNA repair genes into ageing-related or non-ageing-related |
title_sort | data mining approach for classifying dna repair genes into ageing-related or non-ageing-related |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3031233/ https://www.ncbi.nlm.nih.gov/pubmed/21226956 http://dx.doi.org/10.1186/1471-2164-12-27 |