Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses

Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets in biology, we can leverage statistical and machine learning methods to help us achieve this goal. In this paper, we present two methods fo...

Bibliographic Details
Main authors: Singh-Blom, U. Martin, Natarajan, Nagarajan, Tewari, Ambuj, Woods, John O., Dhillon, Inderjit S., Marcotte, Edward M.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3641094/
https://www.ncbi.nlm.nih.gov/pubmed/23650495
http://dx.doi.org/10.1371/journal.pone.0058977
_version_ 1782267979748081664
author Singh-Blom, U. Martin
Natarajan, Nagarajan
Tewari, Ambuj
Woods, John O.
Dhillon, Inderjit S.
Marcotte, Edward M.
collection PubMed
description Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets in biology, we can leverage statistical and machine learning methods to help us achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated from its success in social network link prediction, and is very closely related to some of the recent methods proposed for gene-disease association inference. The second method, called Catapult (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine where the features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies, on two distinct data sets, namely OMIM phenotypes and drug-target interactions. Finally, by measuring the performance of the methods using two different evaluation strategies, we show that even though both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas Catapult is better suited to correctly identifying gene-trait associations overall. The authors want to thank Jon Laurent and Kris McGary for some of the data used, and Li and Patra for making their code available. Most of Ambuj Tewari's contribution to this work happened while he was a postdoctoral fellow at the University of Texas at Austin.
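The abstract names the Katz measure but this record does not give its formula. As a rough illustration only, the sketch below uses the standard textbook definition from link prediction, a sum over walk lengths of beta^l times the number of walks of length l between two nodes, truncated at a maximum length. The toy three-node graph, the node names, and the parameters `beta` and `max_len` are assumptions made for this example; they are not taken from the paper, and this is not the authors' implementation.

```python
def mat_mul(a, b):
    """Multiply two dense matrices given as lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def truncated_katz(adj, beta=0.1, max_len=3):
    """Truncated Katz scores: sum over l = 1..max_len of beta**l * (adj**l).

    beta and max_len are illustrative defaults, not values from the paper.
    """
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # adj**1: entry (i, j) counts walks of length 1
    for length in range(1, max_len + 1):
        weight = beta ** length
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        power = mat_mul(power, adj)  # advance to walks one step longer
    return scores


# Hypothetical toy network: gene_a -- gene_b -- trait_x (undirected).
nodes = ["gene_a", "gene_b", "trait_x"]
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = truncated_katz(adj)
trait = nodes.index("trait_x")
# Rank genes by their Katz score to the trait, highest first.
ranking = sorted(["gene_a", "gene_b"],
                 key=lambda g: scores[nodes.index(g)][trait],
                 reverse=True)
print(ranking)
```

In this toy graph, gene_b is directly linked to trait_x and so scores higher than gene_a, which reaches the trait only through a walk of length two. Smaller values of `beta` discount long walks more heavily.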
format Online
Article
Text
id pubmed-3641094
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3641094 2013-05-06 PLoS One Research Article Public Library of Science 2013-05-01 /pmc/articles/PMC3641094/ /pubmed/23650495 http://dx.doi.org/10.1371/journal.pone.0058977 Text en © 2013 Singh-Blom et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses
topic Research Article