Using association rule mining to determine promising secondary phenotyping hypotheses

Motivation: Large-scale phenotyping projects such as the Sanger Mouse Genetics project are ongoing efforts to help identify the influences of genes and their modification on phenotypes. Gene–phenotype relations are crucial to the improvement of our understanding of human heritable diseases as well a...

Bibliographic Details
Main authors: Oellrich, Anika, Jacobsen, Julius, Papatheodorou, Irene, Smedley, Damian
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4059059/
https://www.ncbi.nlm.nih.gov/pubmed/24932005
http://dx.doi.org/10.1093/bioinformatics/btu260
author Oellrich, Anika
Jacobsen, Julius
Papatheodorou, Irene
Smedley, Damian
collection PubMed
description Motivation: Large-scale phenotyping projects such as the Sanger Mouse Genetics project are ongoing efforts to help identify the influences of genes and their modification on phenotypes. Gene–phenotype relations are crucial to the improvement of our understanding of human heritable diseases as well as the development of drugs. However, given that there are ∼20 000 genes in higher vertebrate genomes and the experimental verification of gene–phenotype relations requires a lot of resources, methods are needed that determine good candidates for testing. Results: In this study, we applied an association rule mining approach to the identification of promising secondary phenotype candidates. The predictions rely on a large gene–phenotype annotation set that is used to find occurrence patterns of phenotypes. Applying an association rule mining approach, we could identify 1967 secondary phenotype hypotheses that cover 244 genes and 136 phenotypes. Using two automated and one manual evaluation strategies, we demonstrate that the secondary phenotype candidates possess biological relevance to the genes they are predicted for. From the results we conclude that the predicted secondary phenotypes constitute good candidates to be experimentally tested and confirmed. Availability: The secondary phenotype candidates can be browsed through at http://www.sanger.ac.uk/resources/databases/phenodigm/gene/secondaryphenotype/list. Contact: ao5@sanger.ac.uk or ds5@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4059059
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4059059 2014-06-16 Using association rule mining to determine promising secondary phenotyping hypotheses Oellrich, Anika Jacobsen, Julius Papatheodorou, Irene Smedley, Damian Bioinformatics Ismb 2014 Proceedings Papers Committee Oxford University Press 2014-06-15 2014-06-11 /pmc/articles/PMC4059059/ /pubmed/24932005 http://dx.doi.org/10.1093/bioinformatics/btu260 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Using association rule mining to determine promising secondary phenotyping hypotheses
topic Ismb 2014 Proceedings Papers Committee
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4059059/
https://www.ncbi.nlm.nih.gov/pubmed/24932005
http://dx.doi.org/10.1093/bioinformatics/btu260