Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles
The identification of disease-related genes and disease mechanisms is an important research goal; many studies have approached this problem by analysing genetic networks based on gene expression profiles and interaction datasets. To construct a gene network, correlations or associations among pairs...
Main Authors: | Park, Chihyun; Kim, JungRim; Kim, Jeongwoo; Park, Sanghyun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2018 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6062065/ https://www.ncbi.nlm.nih.gov/pubmed/30048494 http://dx.doi.org/10.1371/journal.pone.0201056 |
_version_ | 1783342330786349056 |
---|---|
author | Park, Chihyun Kim, JungRim Kim, Jeongwoo Park, Sanghyun |
author_sort | Park, Chihyun |
collection | PubMed |
description | The identification of disease-related genes and disease mechanisms is an important research goal; many studies have approached this problem by analysing genetic networks based on gene expression profiles and interaction datasets. To construct a gene network, correlations or associations among pairs of genes must be obtained. However, when gene expression data are heterogeneous with high levels of noise for samples assigned to the same condition, it is difficult to accurately determine whether a gene pair represents a significant gene–gene interaction (GGI). In order to solve this problem, we proposed a random forest-based method to classify significant GGIs from gene expression data. To train the model, we defined novel feature sets and utilised various high-confidence interactome datasets to deduce the correct answer set from known disease-specific genes. Using Alzheimer’s disease data, the proposed method showed remarkable accuracy, and the GGIs established in the analysis can be used to build a meaningful genetic network that can explain the mechanisms underlying Alzheimer’s disease. |
format | Online Article Text |
id | pubmed-6062065 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-6062065 2018-08-03 Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles Park, Chihyun Kim, JungRim Kim, Jeongwoo Park, Sanghyun PLoS One Research Article Public Library of Science 2018-07-26 /pmc/articles/PMC6062065/ /pubmed/30048494 http://dx.doi.org/10.1371/journal.pone.0201056 Text en © 2018 Park et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Machine learning-based identification of genetic interactions from heterogeneous gene expression profiles |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6062065/ https://www.ncbi.nlm.nih.gov/pubmed/30048494 http://dx.doi.org/10.1371/journal.pone.0201056 |
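
The description in this record notes that constructing a gene network first requires correlations or associations among pairs of genes. As a rough illustration of that preliminary step only, the sketch below computes a Pearson correlation for every gene pair in a small expression table. It is not the authors' random forest classifier or their feature set; the function names, gene labels, and expression values are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; this sketch reports 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(expression):
    """Return {(gene_a, gene_b): r} for every unordered pair of genes."""
    return {
        (g1, g2): pearson(expression[g1], expression[g2])
        for g1, g2 in combinations(sorted(expression), 2)
    }


# Hypothetical toy data: gene label -> expression level across four samples.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
}

correlations = pairwise_correlations(toy_expression)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

With noisy, heterogeneous samples, a single coefficient like this is exactly the kind of signal the description calls hard to interpret, which is the motivation the record gives for a learned classifier over gene pairs.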