Using association rule mining to determine promising secondary phenotyping hypotheses
Motivation: Large-scale phenotyping projects such as the Sanger Mouse Genetics project are ongoing efforts to help identify the influences of genes and their modification on phenotypes. Gene–phenotype relations are crucial to the improvement of our understanding of human heritable diseases as well a...
Main authors: | Oellrich, Anika; Jacobsen, Julius; Papatheodorou, Irene; Smedley, Damian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2014 |
Subjects: | Ismb 2014 Proceedings Papers Committee |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4059059/ https://www.ncbi.nlm.nih.gov/pubmed/24932005 http://dx.doi.org/10.1093/bioinformatics/btu260 |
_version_ | 1782321202987008000 |
---|---|
author | Oellrich, Anika Jacobsen, Julius Papatheodorou, Irene Smedley, Damian |
author_sort | Oellrich, Anika |
collection | PubMed |
description | Motivation: Large-scale phenotyping projects such as the Sanger Mouse Genetics project are ongoing efforts to help identify the influences of genes and their modification on phenotypes. Gene–phenotype relations are crucial to the improvement of our understanding of human heritable diseases as well as the development of drugs. However, given that there are ∼20 000 genes in higher vertebrate genomes and the experimental verification of gene–phenotype relations requires a lot of resources, methods are needed that determine good candidates for testing. Results: In this study, we applied an association rule mining approach to the identification of promising secondary phenotype candidates. The predictions rely on a large gene–phenotype annotation set that is used to find occurrence patterns of phenotypes. Applying an association rule mining approach, we could identify 1967 secondary phenotype hypotheses that cover 244 genes and 136 phenotypes. Using two automated and one manual evaluation strategies, we demonstrate that the secondary phenotype candidates possess biological relevance to the genes they are predicted for. From the results we conclude that the predicted secondary phenotypes constitute good candidates to be experimentally tested and confirmed. Availability: The secondary phenotype candidates can be browsed through at http://www.sanger.ac.uk/resources/databases/phenodigm/gene/secondaryphenotype/list. Contact: ao5@sanger.ac.uk or ds5@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-4059059 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-40590592014-06-16 Using association rule mining to determine promising secondary phenotyping hypotheses Oellrich, Anika Jacobsen, Julius Papatheodorou, Irene Smedley, Damian Bioinformatics Ismb 2014 Proceedings Papers Committee Motivation: Large-scale phenotyping projects such as the Sanger Mouse Genetics project are ongoing efforts to help identify the influences of genes and their modification on phenotypes. Gene–phenotype relations are crucial to the improvement of our understanding of human heritable diseases as well as the development of drugs. However, given that there are ∼20 000 genes in higher vertebrate genomes and the experimental verification of gene–phenotype relations requires a lot of resources, methods are needed that determine good candidates for testing. Results: In this study, we applied an association rule mining approach to the identification of promising secondary phenotype candidates. The predictions rely on a large gene–phenotype annotation set that is used to find occurrence patterns of phenotypes. Applying an association rule mining approach, we could identify 1967 secondary phenotype hypotheses that cover 244 genes and 136 phenotypes. Using two automated and one manual evaluation strategies, we demonstrate that the secondary phenotype candidates possess biological relevance to the genes they are predicted for. From the results we conclude that the predicted secondary phenotypes constitute good candidates to be experimentally tested and confirmed. Availability: The secondary phenotype candidates can be browsed through at http://www.sanger.ac.uk/resources/databases/phenodigm/gene/secondaryphenotype/list. Contact: ao5@sanger.ac.uk or ds5@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4059059/ /pubmed/24932005 http://dx.doi.org/10.1093/bioinformatics/btu260 Text en © The Author 2014. 
Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Using association rule mining to determine promising secondary phenotyping hypotheses |
topic | Ismb 2014 Proceedings Papers Committee |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4059059/ https://www.ncbi.nlm.nih.gov/pubmed/24932005 http://dx.doi.org/10.1093/bioinformatics/btu260 |