Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles

The identification of disease-related genes and disease mechanisms is an important research goal; many studies have approached this problem by analysing genetic networks based on gene expression profiles and interaction datasets. To construct a gene network, correlations or associations among pairs...

Bibliographic Details
Main Authors: Park, Chihyun, Kim, JungRim, Kim, Jeongwoo, Park, Sanghyun
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6062065/
https://www.ncbi.nlm.nih.gov/pubmed/30048494
http://dx.doi.org/10.1371/journal.pone.0201056
_version_ 1783342330786349056
author Park, Chihyun
Kim, JungRim
Kim, Jeongwoo
Park, Sanghyun
author_facet Park, Chihyun
Kim, JungRim
Kim, Jeongwoo
Park, Sanghyun
author_sort Park, Chihyun
collection PubMed
description The identification of disease-related genes and disease mechanisms is an important research goal; many studies have approached this problem by analysing genetic networks based on gene expression profiles and interaction datasets. To construct a gene network, correlations or associations among pairs of genes must be obtained. However, when gene expression data are heterogeneous with high levels of noise for samples assigned to the same condition, it is difficult to accurately determine whether a gene pair represents a significant gene–gene interaction (GGI). In order to solve this problem, we proposed a random forest-based method to classify significant GGIs from gene expression data. To train the model, we defined novel feature sets and utilised various high-confidence interactome datasets to deduce the correct answer set from known disease-specific genes. Using Alzheimer’s disease data, the proposed method showed remarkable accuracy, and the GGIs established in the analysis can be used to build a meaningful genetic network that can explain the mechanisms underlying Alzheimer’s disease.
format Online
Article
Text
id pubmed-6062065
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6062065 2018-08-03 Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles Park, Chihyun Kim, JungRim Kim, Jeongwoo Park, Sanghyun PLoS One Research Article The identification of disease-related genes and disease mechanisms is an important research goal; many studies have approached this problem by analysing genetic networks based on gene expression profiles and interaction datasets. To construct a gene network, correlations or associations among pairs of genes must be obtained. However, when gene expression data are heterogeneous with high levels of noise for samples assigned to the same condition, it is difficult to accurately determine whether a gene pair represents a significant gene–gene interaction (GGI). In order to solve this problem, we proposed a random forest-based method to classify significant GGIs from gene expression data. To train the model, we defined novel feature sets and utilised various high-confidence interactome datasets to deduce the correct answer set from known disease-specific genes. Using Alzheimer’s disease data, the proposed method showed remarkable accuracy, and the GGIs established in the analysis can be used to build a meaningful genetic network that can explain the mechanisms underlying Alzheimer’s disease. Public Library of Science 2018-07-26 /pmc/articles/PMC6062065/ /pubmed/30048494 http://dx.doi.org/10.1371/journal.pone.0201056 Text en © 2018 Park et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Park, Chihyun
Kim, JungRim
Kim, Jeongwoo
Park, Sanghyun
Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles
title Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles
title_sort machine learning-based identification of genetic interactions from heterogeneous gene expression profiles
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6062065/
https://www.ncbi.nlm.nih.gov/pubmed/30048494
http://dx.doi.org/10.1371/journal.pone.0201056
work_keys_str_mv AT parkchihyun machinelearningbasedidentificationofgeneticinteractionsfromheterogeneousgeneexpressionprofiles
AT kimjungrim machinelearningbasedidentificationofgeneticinteractionsfromheterogeneousgeneexpressionprofiles
AT kimjeongwoo machinelearningbasedidentificationofgeneticinteractionsfromheterogeneousgeneexpressionprofiles
AT parksanghyun machinelearningbasedidentificationofgeneticinteractionsfromheterogeneousgeneexpressionprofiles
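
The description above notes that building a gene network starts from correlations or associations among pairs of genes. As a rough illustration of that baseline step only, the following sketch computes pairwise Pearson correlations and keeps strongly correlated pairs as candidate edges. It is not the authors' random forest method, which the abstract says was proposed because simple pairwise measures become unreliable on heterogeneous, noisy data. The gene names, expression values, function names, and the 0.8 threshold are all hypothetical choices made for this example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        # A constant profile has no defined correlation; treated as 0 here.
        return 0.0
    return cov / math.sqrt(vx * vy)


def candidate_edges(expression, threshold=0.8):
    """Return (gene_a, gene_b, r) for every gene pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical toy data: gene name -> expression values across five samples.
toy = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with GENE_A
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as GENE_A rises
    "GENE_D": [1.0, 3.0, 2.0, 3.0, 1.0],   # no linear relation to GENE_A
}

edges = candidate_edges(toy)
for gene_a, gene_b, r in edges:
    print(f"{gene_a} - {gene_b}: r = {r:+.2f}")
```

A fixed correlation cut-off like this one treats every sample as equally trustworthy, which is the weakness the description highlights for samples that are noisy within the same condition.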