High-dimensional omics data analysis using a variable screening protocol with prior knowledge integration (SKI)

BACKGROUND: High-throughput technology can generate thousands to millions of biomarker measurements in one experiment. However, results from high-throughput analysis are often barely reproducible due to small sample sizes. Different statistical methods have been proposed to tackle this “small n and la...

Bibliographic Details
Main authors:	Liu, Cong; Jiang, Jianping; Gu, Jianlei; Yu, Zhangsheng; Wang, Tao; Lu, Hui
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260139/ https://www.ncbi.nlm.nih.gov/pubmed/28155690 http://dx.doi.org/10.1186/s12918-016-0358-0

_version_	1782499351866638336
author	Liu, Cong; Jiang, Jianping; Gu, Jianlei; Yu, Zhangsheng; Wang, Tao; Lu, Hui
author_sort	Liu, Cong
collection	PubMed
description	BACKGROUND: High-throughput technology can generate thousands to millions of biomarker measurements in one experiment. However, results from high-throughput analysis are often barely reproducible due to small sample sizes. Different statistical methods have been proposed to tackle this “small n and large p” scenario; for example, different datasets can be pooled or integrated to improve reproducibility. However, the raw data are either unavailable or hard to integrate because of differing experimental conditions, so there is an emerging need for a method of “knowledge integration” in high-throughput data analysis. RESULTS: In this study, we propose an integrative prescreening approach, SKI, for high-throughput data analysis. A new rank is generated from two initial ranks: (1) a knowledge-based rank; and (2) a marginal-correlation-based rank. Our simulations show that SKI outperforms methods without knowledge integration, achieving a higher true positive rate for the same number of selected variables. We also applied our method in a drug response study and found that it performed better than regular screening methods. CONCLUSION: The proposed method provides an effective way to integrate knowledge into high-throughput analysis. It can be easily implemented with our provided R package, SKI.
format	Online Article Text
id	pubmed-5260139
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5260139 2017-01-30 High-dimensional omics data analysis using a variable screening protocol with prior knowledge integration (SKI) Liu, Cong; Jiang, Jianping; Gu, Jianlei; Yu, Zhangsheng; Wang, Tao; Lu, Hui BMC Syst Biol Research BACKGROUND: High-throughput technology can generate thousands to millions of biomarker measurements in one experiment. However, results from high-throughput analysis are often barely reproducible due to small sample sizes. Different statistical methods have been proposed to tackle this “small n and large p” scenario; for example, different datasets can be pooled or integrated to improve reproducibility. However, the raw data are either unavailable or hard to integrate because of differing experimental conditions, so there is an emerging need for a method of “knowledge integration” in high-throughput data analysis. RESULTS: In this study, we propose an integrative prescreening approach, SKI, for high-throughput data analysis. A new rank is generated from two initial ranks: (1) a knowledge-based rank; and (2) a marginal-correlation-based rank. Our simulations show that SKI outperforms methods without knowledge integration, achieving a higher true positive rate for the same number of selected variables. We also applied our method in a drug response study and found that it performed better than regular screening methods. CONCLUSION: The proposed method provides an effective way to integrate knowledge into high-throughput analysis. It can be easily implemented with our provided R package, SKI. BioMed Central 2016-12-23 /pmc/articles/PMC5260139/ /pubmed/28155690 http://dx.doi.org/10.1186/s12918-016-0358-0 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	High-dimensional omics data analysis using a variable screening protocol with prior knowledge integration (SKI)
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260139/ https://www.ncbi.nlm.nih.gov/pubmed/28155690 http://dx.doi.org/10.1186/s12918-016-0358-0
