
An individualized predictor of health and disease using paired reference and target samples

BACKGROUND: Consider the problem of designing a panel of complex biomarkers to predict a patient’s health or disease state when one can pair his or her current test sample, called a target sample, with the patient’s previously acquired healthy sample, called a reference sample. As contrasted to a po...


Bibliographic Details
Main authors: Liu, Tzu-Yu; Burke, Thomas; Park, Lawrence P.; Woods, Christopher W.; Zaas, Aimee K.; Ginsburg, Geoffrey S.; Hero, Alfred O.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4722633/
https://www.ncbi.nlm.nih.gov/pubmed/26801061
http://dx.doi.org/10.1186/s12859-016-0889-9
author Liu, Tzu-Yu
Burke, Thomas
Park, Lawrence P.
Woods, Christopher W.
Zaas, Aimee K.
Ginsburg, Geoffrey S.
Hero, Alfred O.
collection PubMed
description BACKGROUND: Consider the problem of designing a panel of complex biomarkers to predict a patient’s health or disease state when one can pair his or her current test sample, called a target sample, with the patient’s previously acquired healthy sample, called a reference sample. In contrast to a population-averaged reference, this reference sample is individualized. Automated predictor algorithms that compare and contrast the paired samples could yield a new generation of test panels that use a person’s healthy reference to enhance predictive accuracy. This paper develops such an individualized predictor and illustrates the added value of including the healthy reference in the design of predictive gene expression panels.
RESULTS: The objective is to predict each subject’s state of infection, e.g., neither exposed nor infected, exposed but not infected, pre-acute phase of infection, acute phase of infection, or post-acute phase of infection. Using gene microarray data collected in a large-scale, serially sampled respiratory virus challenge study, we quantify the diagnostic advantage of pairing a person’s baseline reference with his or her target sample. The full study consists of 2886 microarray chips assaying 12,023 genes of 151 human volunteer subjects under 4 different inoculation regimes (HRV, RSV, H1N1, H3N2). We train (with cross-validation) reference-aided sparse multi-class classifier algorithms on these data to show that inclusion of a subject’s reference sample can improve prediction accuracy by as much as 14% for the H3N2 cohort and by at least 6% for the H1N1 cohort. Remarkably, these gains in accuracy are achieved with smaller panels of genes, e.g., 39% fewer for H3N2 and 31% fewer for H1N1. The biomarkers selected by the predictors fall into two categories: 1) contrasting genes, which tend to be differentially expressed between target and reference samples over the population; 2) reinforcement genes, which remain constant over the two samples and function as housekeeping normalization genes. Many of these genes are common to all 4 viruses, and their roles in the predictor elucidate the function they play in differentiating the states of host immune response.
CONCLUSIONS: If one uses a suitable mathematical prediction algorithm, inclusion of a healthy reference in biomarker diagnostic testing can potentially improve accuracy of disease prediction with fewer biomarkers.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0889-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4722633
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4722633 2016-01-23 BMC Bioinformatics Methodology Article BioMed Central 2016-01-22 /pmc/articles/PMC4722633/ /pubmed/26801061 http://dx.doi.org/10.1186/s12859-016-0889-9 Text en
© Liu et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title An individualized predictor of health and disease using paired reference and target samples
topic Methodology Article