Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression
This paper describes an ensemble cluster analysis of bivariate profiles of HIV biomarkers, viral load and CD4 cell counts, which jointly measure disease progression. Data are from a prevalent cohort of HIV positive participants in a clinical trial of vitamin supplementation in Botswana. These indivi...
Main Authors: | Lynch, Miranda L., DeGruttola, Victor |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer International Publishing 2022 |
Subjects: | Applications |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9064718/ https://www.ncbi.nlm.nih.gov/pubmed/35528805 http://dx.doi.org/10.1007/s41060-022-00323-2 |
_version_ | 1784699443669368832 |
---|---|
author | Lynch, Miranda L. DeGruttola, Victor |
author_facet | Lynch, Miranda L. DeGruttola, Victor |
author_sort | Lynch, Miranda L. |
collection | PubMed |
description | This paper describes an ensemble cluster analysis of bivariate profiles of HIV biomarkers, viral load and CD4 cell counts, which jointly measure disease progression. Data are from a prevalent cohort of HIV positive participants in a clinical trial of vitamin supplementation in Botswana. These individuals were HIV positive upon enrollment, but with unknown times of infection. To categorize groups of participants based on their patterns of progression of HIV infection using both biomarkers, we combine univariate shape-based cluster results for multiple biomarkers through the use of ensemble clustering methods. We first describe univariate clustering for each of the individual biomarker profiles, and make use of shape-respecting distances for clustering the longitudinal profile data. In our data, profiles are subject to either missing or irregular measurements as well as unobserved initiation times of the process of interest. Shape-respecting distances that can handle such data issues, preserve time-ordering, and identify similar profile shapes are useful in identifying patterns of disease progression from longitudinal biomarker data. However, their performance with regard to clustering differs by severity of the data issues mentioned above. We provide an empirical investigation of shape-respecting distances (Fréchet and dynamic time warping (DTW)) on benchmark shape data, and use DTW in cluster analysis of biomarker profile observations. These reveal a primary group of ‘typical progressors,’ as well as a smaller group that shows relatively rapid progression. We then refine the analysis using ensemble clustering for both markers to obtain a single classification. The information from joint evaluation of the two biomarkers combined with ensemble clustering reveals subgroups of patients not identifiable through univariate analyses; noteworthy subgroups are those that appear to represent recently and chronically infected subsets. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s41060-022-00323-2. |
format | Online Article Text |
id | pubmed-9064718 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-90647182022-05-04 Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression Lynch, Miranda L. DeGruttola, Victor Int J Data Sci Anal Applications This paper describes an ensemble cluster analysis of bivariate profiles of HIV biomarkers, viral load and CD4 cell counts, which jointly measure disease progression. Data are from a prevalent cohort of HIV positive participants in a clinical trial of vitamin supplementation in Botswana. These individuals were HIV positive upon enrollment, but with unknown times of infection. To categorize groups of participants based on their patterns of progression of HIV infection using both biomarkers, we combine univariate shape-based cluster results for multiple biomarkers through the use of ensemble clustering methods. We first describe univariate clustering for each of the individual biomarker profiles, and make use of shape-respecting distances for clustering the longitudinal profile data. In our data, profiles are subject to either missing or irregular measurements as well as unobserved initiation times of the process of interest. Shape-respecting distances that can handle such data issues, preserve time-ordering, and identify similar profile shapes are useful in identifying patterns of disease progression from longitudinal biomarker data. However, their performance with regard to clustering differs by severity of the data issues mentioned above. We provide an empirical investigation of shape-respecting distances (Fréchet and dynamic time warping (DTW)) on benchmark shape data, and use DTW in cluster analysis of biomarker profile observations. These reveal a primary group of ‘typical progressors,’ as well as a smaller group that shows relatively rapid progression. We then refine the analysis using ensemble clustering for both markers to obtain a single classification. 
The information from joint evaluation of the two biomarkers combined with ensemble clustering reveals subgroups of patients not identifiable through univariate analyses; noteworthy subgroups are those that appear to represent recently and chronically infected subsets. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s41060-022-00323-2. Springer International Publishing 2022-05-04 2022 /pmc/articles/PMC9064718/ /pubmed/35528805 http://dx.doi.org/10.1007/s41060-022-00323-2 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Applications Lynch, Miranda L. DeGruttola, Victor Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression |
title | Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression |
title_full | Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression |
title_fullStr | Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression |
title_full_unstemmed | Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression |
title_short | Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression |
title_sort | ensemble clustering of longitudinal bivariate hiv biomarker profiles to group patients by patterns of disease progression |
topic | Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9064718/ https://www.ncbi.nlm.nih.gov/pubmed/35528805 http://dx.doi.org/10.1007/s41060-022-00323-2 |
work_keys_str_mv | AT lynchmirandal ensembleclusteringoflongitudinalbivariatehivbiomarkerprofilestogrouppatientsbypatternsofdiseaseprogression AT degruttolavictor ensembleclusteringoflongitudinalbivariatehivbiomarkerprofilestogrouppatientsbypatternsofdiseaseprogression |
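
The description above names two building blocks: dynamic time warping (DTW) as a shape-respecting distance, and ensemble clustering to merge per-biomarker cluster results. The sketch below illustrates both in plain Python. It is not the authors' implementation: the record does not say which ensemble method the paper uses, so the co-association matrix shown here is an assumed, commonly used stand-in, and the function names and sample values are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    The local cost is the absolute difference. The sequences may differ
    in length, which is how DTW tolerates irregular measurement times.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor steps,
            # which keeps the alignment monotone in time.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def co_association(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster.

    `labelings` is a list of label lists, one per base clustering
    (for example, one from viral load and one from CD4 counts).
    """
    n = len(labelings[0])
    k = len(labelings)
    return [
        [sum(lab[i] == lab[j] for lab in labelings) / k for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical CD4 profiles: the second has the same shape, one visit late.
    profile_a = [500, 450, 400, 350]
    profile_b = [500, 500, 450, 400, 350]
    print(dtw_distance(profile_a, profile_b))  # 0.0, the delay is warped away

    # Hypothetical cluster labels for four participants, one list per biomarker.
    viral_load_labels = [0, 0, 1, 1]
    cd4_labels = [0, 1, 1, 1]
    consensus = co_association([viral_load_labels, cd4_labels])
    print(consensus[0][1])  # 0.5, the two clusterings disagree on this pair
```

A consensus partition could then be obtained by clustering on `1 - consensus` as a dissimilarity, which is one way subgroups that neither biomarker isolates alone can emerge.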