Meta-analytic support vector machine for integrating multiple omics data

Bibliographic Details
Main Authors: Kim, SungHwan, Jhong, Jae-Hwan, Lee, JungJun, Koo, Ja-Yong
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5270233/
https://www.ncbi.nlm.nih.gov/pubmed/28149325
http://dx.doi.org/10.1186/s13040-017-0126-8
author Kim, SungHwan
Jhong, Jae-Hwan
Lee, JungJun
Koo, Ja-Yong
collection PubMed
description BACKGROUND: High-throughput microarray and sequencing data are now used extensively to monitor biomarkers and biological processes related to many diseases. In this setting, the support vector machine (SVM) has been widely and successfully used for gene selection in many applications. Despite the considerable benefits of SVMs, analysis of a single small- or mid-sized dataset inevitably suffers from low reproducibility and low statistical power. To address this problem, we propose a meta-analytic support vector machine (Meta-SVM) that can accommodate multiple omics data, making it possible to detect consensus genes associated with diseases across studies. RESULTS: Experimental studies show that the Meta-SVM is superior to the existing meta-analysis method in detecting true signal genes. In real data applications, the method was applied to diverse omics data of breast cancer (TCGA) and mRNA expression data of lung disease (idiopathic pulmonary fibrosis; IPF). As a result, we identified gene sets consistently associated with the diseases across studies. In particular, the gene set identified from the TCGA omics data was significantly enriched in the ABC transporters pathways, which are well known to be critical to the breast cancer mechanism. CONCLUSION: By jointly leveraging multiple omics data, the Meta-SVM effectively achieves the purpose of meta-analysis and facilitates identifying potential biomarkers and elucidating the disease process. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13040-017-0126-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5270233
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5270233 2017-02-01 Meta-analytic support vector machine for integrating multiple omics data Kim, SungHwan Jhong, Jae-Hwan Lee, JungJun Koo, Ja-Yong BioData Min Methodology
BioMed Central 2017-01-26 /pmc/articles/PMC5270233/ /pubmed/28149325 http://dx.doi.org/10.1186/s13040-017-0126-8 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Meta-analytic support vector machine for integrating multiple omics data
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5270233/
https://www.ncbi.nlm.nih.gov/pubmed/28149325
http://dx.doi.org/10.1186/s13040-017-0126-8