Meta-analytic support vector machine for integrating multiple omics data
BACKGROUND: Recently, high-throughput microarray and sequencing data have been used extensively to monitor biomarkers and biological processes related to many diseases. In this setting, the support vector machine (SVM) has been widely and successfully used for gene selection in many app...
Main authors: | Kim, SungHwan; Jhong, Jae-Hwan; Lee, JungJun; Koo, Ja-Yong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2017 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5270233/ https://www.ncbi.nlm.nih.gov/pubmed/28149325 http://dx.doi.org/10.1186/s13040-017-0126-8 |
_version_ | 1782501149167845376 |
---|---|
author | Kim, SungHwan Jhong, Jae-Hwan Lee, JungJun Koo, Ja-Yong |
author_facet | Kim, SungHwan Jhong, Jae-Hwan Lee, JungJun Koo, Ja-Yong |
author_sort | Kim, SungHwan |
collection | PubMed |
description | BACKGROUND: Recently, high-throughput microarray and sequencing data have been used extensively to monitor biomarkers and biological processes related to many diseases. In this setting, the support vector machine (SVM) has been widely and successfully used for gene selection in many applications. Despite the benefits of SVMs, analysis of a single small or mid-sized data set inevitably runs into problems of low reproducibility and low statistical power. To address this problem, we propose a meta-analytic support vector machine (Meta-SVM) that can accommodate multiple omics data, making it possible to detect consensus genes associated with diseases across studies. RESULTS: Experimental studies show that the Meta-SVM is superior to the existing meta-analysis method in detecting true signal genes. In real data applications, the method was applied to diverse omics data of breast cancer (TCGA) and mRNA expression data of lung disease (idiopathic pulmonary fibrosis; IPF). As a result, we identified gene sets consistently associated with the diseases across studies. In particular, the gene set identified from the TCGA omics data was found to be significantly enriched in the ABC transporters pathways, which are well known to be critical for the breast cancer mechanism. CONCLUSION: The Meta-SVM effectively achieves the purpose of meta-analysis by jointly leveraging multiple omics data, and facilitates identifying potential biomarkers and elucidating the disease process. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13040-017-0126-8) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5270233 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5270233 2017-02-01 Meta-analytic support vector machine for integrating multiple omics data Kim, SungHwan Jhong, Jae-Hwan Lee, JungJun Koo, Ja-Yong BioData Min Methodology 
BioMed Central 2017-01-26 /pmc/articles/PMC5270233/ /pubmed/28149325 http://dx.doi.org/10.1186/s13040-017-0126-8 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Kim, SungHwan Jhong, Jae-Hwan Lee, JungJun Koo, Ja-Yong Meta-analytic support vector machine for integrating multiple omics data |
title | Meta-analytic support vector machine for integrating multiple omics data |
title_full | Meta-analytic support vector machine for integrating multiple omics data |
title_fullStr | Meta-analytic support vector machine for integrating multiple omics data |
title_full_unstemmed | Meta-analytic support vector machine for integrating multiple omics data |
title_short | Meta-analytic support vector machine for integrating multiple omics data |
title_sort | meta-analytic support vector machine for integrating multiple omics data |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5270233/ https://www.ncbi.nlm.nih.gov/pubmed/28149325 http://dx.doi.org/10.1186/s13040-017-0126-8 |
work_keys_str_mv | AT kimsunghwan metaanalyticsupportvectormachineforintegratingmultipleomicsdata AT jhongjaehwan metaanalyticsupportvectormachineforintegratingmultipleomicsdata AT leejungjun metaanalyticsupportvectormachineforintegratingmultipleomicsdata AT koojayong metaanalyticsupportvectormachineforintegratingmultipleomicsdata |