Data integration with high dimensionality
We consider situations where the data consist of a number of responses for each individual, which may include a mix of discrete and continuous variables. The data also include a class of predictors, where the same predictor may have different physical measurements across different experiments depend...
Main authors: | Gao, Xin; Carroll, Raymond J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5532816/ https://www.ncbi.nlm.nih.gov/pubmed/28757650 http://dx.doi.org/10.1093/biomet/asx023 |
_version_ | 1783253525506031616 |
---|---|
author | Gao, Xin; Carroll, Raymond J. |
author_facet | Gao, Xin; Carroll, Raymond J. |
author_sort | Gao, Xin |
collection | PubMed |
description | We consider situations where the data consist of a number of responses for each individual, which may include a mix of discrete and continuous variables. The data also include a class of predictors, where the same predictor may have different physical measurements across different experiments depending on how the predictor is measured. The goal is to select which predictors affect any of the responses, where the number of such informative predictors tends to infinity as the sample size increases. There are marginal likelihoods for each experiment; we specify a pseudolikelihood combining the marginal likelihoods, and propose a pseudolikelihood information criterion. Under regularity conditions, we establish selection consistency for this criterion with unbounded true model size. The proposed method includes a Bayesian information criterion with appropriate penalty term as a special case. Simulations indicate that data integration can dramatically improve upon using only one data source. |
format | Online Article Text |
id | pubmed-5532816 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-55328162017-07-28 Data integration with high dimensionality Gao, Xin Carroll, Raymond J. Biometrika Articles We consider situations where the data consist of a number of responses for each individual, which may include a mix of discrete and continuous variables. The data also include a class of predictors, where the same predictor may have different physical measurements across different experiments depending on how the predictor is measured. The goal is to select which predictors affect any of the responses, where the number of such informative predictors tends to infinity as the sample size increases. There are marginal likelihoods for each experiment; we specify a pseudolikelihood combining the marginal likelihoods, and propose a pseudolikelihood information criterion. Under regularity conditions, we establish selection consistency for this criterion with unbounded true model size. The proposed method includes a Bayesian information criterion with appropriate penalty term as a special case. Simulations indicate that data integration can dramatically improve upon using only one data source. Oxford University Press 2017-06 2017-05-09 /pmc/articles/PMC5532816/ /pubmed/28757650 http://dx.doi.org/10.1093/biomet/asx023 Text en © 2017 Biometrika Trust |
spellingShingle | Articles Gao, Xin Carroll, Raymond J. Data integration with high dimensionality |
title | Data integration with high dimensionality |
title_full | Data integration with high dimensionality |
title_fullStr | Data integration with high dimensionality |
title_full_unstemmed | Data integration with high dimensionality |
title_short | Data integration with high dimensionality |
title_sort | data integration with high dimensionality |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5532816/ https://www.ncbi.nlm.nih.gov/pubmed/28757650 http://dx.doi.org/10.1093/biomet/asx023 |
work_keys_str_mv | AT gaoxin dataintegrationwithhighdimensionality AT carrollraymondj dataintegrationwithhighdimensionality |