An individualized predictor of health and disease using paired reference and target samples
BACKGROUND: Consider the problem of designing a panel of complex biomarkers to predict a patient’s health or disease state when one can pair his or her current test sample, called a target sample, with the patient’s previously acquired healthy sample, called a reference sample. As contrasted to a po...
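Main Authors: | Liu, Tzu-Yu; Burke, Thomas; Park, Lawrence P.; Woods, Christopher W.; Zaas, Aimee K.; Ginsburg, Geoffrey S.; Hero, Alfred O. |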
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4722633/ https://www.ncbi.nlm.nih.gov/pubmed/26801061 http://dx.doi.org/10.1186/s12859-016-0889-9 |
_version_ | 1782411390759206912 |
---|---|
author | Liu, Tzu-Yu Burke, Thomas Park, Lawrence P. Woods, Christopher W. Zaas, Aimee K. Ginsburg, Geoffrey S. Hero, Alfred O. |
author_facet | Liu, Tzu-Yu Burke, Thomas Park, Lawrence P. Woods, Christopher W. Zaas, Aimee K. Ginsburg, Geoffrey S. Hero, Alfred O. |
author_sort | Liu, Tzu-Yu |
collection | PubMed |
description | BACKGROUND: Consider the problem of designing a panel of complex biomarkers to predict a patient’s health or disease state when one can pair his or her current test sample, called a target sample, with the patient’s previously acquired healthy sample, called a reference sample. In contrast to a population-averaged reference, this reference sample is individualized. Automated predictor algorithms that compare and contrast the paired samples to each other could result in a new generation of test panels that compare to a person’s healthy reference to enhance predictive accuracy. This paper develops such an individualized predictor and illustrates the added value of including the healthy reference for design of predictive gene expression panels. RESULTS: The objective is to predict each subject’s state of infection, e.g., neither exposed nor infected, exposed but not infected, pre-acute phase of infection, acute phase of infection, post-acute phase of infection. Using gene microarray data collected in a large-scale, serially sampled respiratory virus challenge study, we quantify the diagnostic advantage of pairing a person’s baseline reference with his or her target sample. The full study consists of 2886 microarray chips assaying 12,023 genes in 151 human volunteer subjects under 4 different inoculation regimes (HRV, RSV, H1N1, H3N2). We train (with cross-validation) reference-aided sparse multi-class classifier algorithms on these data to show that inclusion of a subject’s reference sample can improve prediction accuracy by as much as 14% for the H3N2 cohort, and by at least 6% for the H1N1 cohort. Remarkably, these gains in accuracy are achieved by using smaller panels of genes, e.g., 39% fewer for H3N2 and 31% fewer for H1N1. The biomarkers selected by the predictors fall into two categories: 1) contrasting genes that tend to differentially express between target and reference samples over the population; 2) reinforcement genes that remain constant over the two samples, which function as housekeeping normalization genes. Many of these genes are common to all 4 viruses and their roles in the predictor elucidate the function that they play in differentiating the different states of host immune response. CONCLUSIONS: If one uses a suitable mathematical prediction algorithm, inclusion of a healthy reference in biomarker diagnostic testing can potentially improve accuracy of disease prediction with fewer biomarkers. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0889-9) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4722633 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4722633 2016-01-23 An individualized predictor of health and disease using paired reference and target samples Liu, Tzu-Yu Burke, Thomas Park, Lawrence P. Woods, Christopher W. Zaas, Aimee K. Ginsburg, Geoffrey S. Hero, Alfred O. BMC Bioinformatics Methodology Article BioMed Central 2016-01-22 /pmc/articles/PMC4722633/ /pubmed/26801061 http://dx.doi.org/10.1186/s12859-016-0889-9 Text en © Liu et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Liu, Tzu-Yu Burke, Thomas Park, Lawrence P. Woods, Christopher W. Zaas, Aimee K. Ginsburg, Geoffrey S. Hero, Alfred O. An individualized predictor of health and disease using paired reference and target samples |
title | An individualized predictor of health and disease using paired reference and target samples |
title_sort | individualized predictor of health and disease using paired reference and target samples |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4722633/ https://www.ncbi.nlm.nih.gov/pubmed/26801061 http://dx.doi.org/10.1186/s12859-016-0889-9 |
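
The description above outlines a reference-aided sparse classifier: each target sample is paired with the same subject's healthy reference, and a sparse model selects a small gene panel. The following is a minimal sketch of that idea, not the authors' algorithm. It swaps in a simplified nearest-shrunken-centroid classifier (soft-thresholded class centroids, without per-feature variance standardization) as a stand-in for the paper's sparse multi-class method. All function names, the two-gene toy data, and the `shrink` value are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: names, toy data, and the shrink value are illustrative
# assumptions, not taken from the paper.

def paired_features(target, reference):
    """Concatenate the target sample with its target-minus-reference contrast."""
    return list(target) + [t - r for t, r in zip(target, reference)]

def soft_threshold(value, amount):
    """Shrink value toward zero by amount; values within amount become 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0

def fit_shrunken_centroids(rows, labels, shrink):
    """Return (overall mean, {label: soft-thresholded centroid offsets})."""
    n_features = len(rows[0])
    overall = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
    offsets = {}
    for label in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroid = [sum(r[j] for r in members) / len(members)
                    for j in range(n_features)]
        offsets[label] = [soft_threshold(c - m, shrink)
                          for c, m in zip(centroid, overall)]
    return overall, offsets

def predict(row, overall, offsets):
    """Assign the label whose shrunken centroid is closest in squared distance."""
    def sq_dist(label):
        return sum((x - (m + d)) ** 2
                   for x, m, d in zip(row, overall, offsets[label]))
    return min(offsets, key=sq_dist)

def selected_features(offsets):
    """Indices of features with a nonzero offset in at least one class."""
    n_features = len(next(iter(offsets.values())))
    return [j for j in range(n_features)
            if any(v[j] != 0.0 for v in offsets.values())]

# Toy data: (reference, target, label) for two genes. Gene 0 rises by about 2
# on infection, gene 1 is unchanged, and baselines differ between subjects.
training = [
    ([2.0, 4.0], [2.1, 4.0], "healthy"),
    ([6.0, 4.2], [5.9, 4.2], "healthy"),
    ([1.0, 3.9], [3.0, 3.9], "infected"),
    ([3.0, 4.1], [5.1, 4.1], "infected"),
]
rows = [paired_features(tgt, ref) for ref, tgt, _ in training]
labels = [label for _, _, label in training]

overall, offsets = fit_shrunken_centroids(rows, labels, shrink=0.5)
panel = selected_features(offsets)
print(panel)
print(predict(paired_features([8.0, 4.0], [6.0, 4.0]), overall, offsets))
print(predict(paired_features([1.0, 4.0], [1.0, 4.0]), overall, offsets))
```

In this toy setting the feature layout is [target gene 0, target gene 1, contrast gene 0, contrast gene 1], and the only feature that survives shrinkage is the gene 0 contrast. Raw target levels are dominated by subject-to-subject baseline differences, so the classifier relies on the change relative to each subject's own reference, which is the intuition behind the "contrasting genes" category in the description.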