
Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression

MOTIVATION: The discovery of relationships between gene expression measurements and phenotypic responses is hampered by both computational and statistical impediments. Conventional statistical methods are less than ideal because they either fail to select relevant genes, predict poorly, ignore the u...


Bibliographic Details
Main Authors: Ding, Lei, McDonald, Daniel J
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870707/
https://www.ncbi.nlm.nih.gov/pubmed/28881997
http://dx.doi.org/10.1093/bioinformatics/btx265
_version_ 1783309537950826496
author Ding, Lei
McDonald, Daniel J
author_facet Ding, Lei
McDonald, Daniel J
author_sort Ding, Lei
collection PubMed
description MOTIVATION: The discovery of relationships between gene expression measurements and phenotypic responses is hampered by both computational and statistical impediments. Conventional statistical methods are less than ideal because they either fail to select relevant genes, predict poorly, ignore the unknown interaction structure between genes, or are computationally intractable. Thus, the creation of new methods which can handle many expression measurements on relatively small numbers of patients while also uncovering gene–gene relationships and predicting well is desirable. RESULTS: We develop a new technique for using the marginal relationship between gene expression measurements and patient survival outcomes to identify a small subset of genes which appear highly relevant for predicting survival, produce a low-dimensional embedding based on this small subset, and amplify this embedding with information from the remaining genes. We motivate our methodology by using gene expression measurements to predict survival time for patients with diffuse large B-cell lymphoma, illustrate the behavior of our methodology on carefully constructed synthetic examples, and test it on a number of other gene expression datasets. Our technique is computationally tractable, generally outperforms other methods, is extensible to other phenotypes, and also identifies different genes (relative to existing methods) for possible future study. AVAILABILITY AND IMPLEMENTATION: All of the code and data are available at http://mypage.iu.edu/~dajmcdon/research/. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.
format Online
Article
Text
id pubmed-5870707
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5870707 2018-04-05 Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression Ding, Lei McDonald, Daniel J Bioinformatics Ismb/Eccb 2017: The 25th Annual Conference Intelligent Systems for Molecular Biology Held Jointly with the 16th Annual European Conference on Computational Biology, Prague, Czech Republic, July 21–25, 2017 Oxford University Press 2017-07-15 2017-07-12 /pmc/articles/PMC5870707/ /pubmed/28881997 http://dx.doi.org/10.1093/bioinformatics/btx265 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Ismb/Eccb 2017: The 25th Annual Conference Intelligent Systems for Molecular Biology Held Jointly with the 16th Annual European Conference on Computational Biology, Prague, Czech Republic, July 21–25, 2017
Ding, Lei
McDonald, Daniel J
Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression
title Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression
title_full Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression
title_fullStr Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression
title_full_unstemmed Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression
title_short Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression
title_sort predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression
topic Ismb/Eccb 2017: The 25th Annual Conference Intelligent Systems for Molecular Biology Held Jointly with the 16th Annual European Conference on Computational Biology, Prague, Czech Republic, July 21–25, 2017
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870707/
https://www.ncbi.nlm.nih.gov/pubmed/28881997
http://dx.doi.org/10.1093/bioinformatics/btx265
work_keys_str_mv AT dinglei predictingphenotypesfrommicroarraysusingamplifiedinitiallymarginaleigenvectorregression
AT mcdonalddanielj predictingphenotypesfrommicroarraysusingamplifiedinitiallymarginaleigenvectorregression