
Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression

This paper describes an ensemble cluster analysis of bivariate profiles of HIV biomarkers, viral load and CD4 cell counts, which jointly measure disease progression. Data are from a prevalent cohort of HIV positive participants in a clinical trial of vitamin supplementation in Botswana. These indivi...


Bibliographic Details
Main Authors: Lynch, Miranda L., DeGruttola, Victor
Format: Online Article Text
Language: English
Published: Springer International Publishing 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9064718/
https://www.ncbi.nlm.nih.gov/pubmed/35528805
http://dx.doi.org/10.1007/s41060-022-00323-2
author Lynch, Miranda L.
DeGruttola, Victor
collection PubMed
description This paper describes an ensemble cluster analysis of bivariate profiles of HIV biomarkers, viral load and CD4 cell counts, which jointly measure disease progression. Data are from a prevalent cohort of HIV positive participants in a clinical trial of vitamin supplementation in Botswana. These individuals were HIV positive upon enrollment, but with unknown times of infection. To categorize groups of participants based on their patterns of progression of HIV infection using both biomarkers, we combine univariate shape-based cluster results for multiple biomarkers through the use of ensemble clustering methods. We first describe univariate clustering for each of the individual biomarker profiles, and make use of shape-respecting distances for clustering the longitudinal profile data. In our data, profiles are subject to either missing or irregular measurements as well as unobserved initiation times of the process of interest. Shape-respecting distances that can handle such data issues, preserve time-ordering, and identify similar profile shapes are useful in identifying patterns of disease progression from longitudinal biomarker data. However, their performance with regard to clustering differs by severity of the data issues mentioned above. We provide an empirical investigation of shape-respecting distances (Fréchet and dynamic time warping (DTW)) on benchmark shape data, and use DTW in cluster analysis of biomarker profile observations. These reveal a primary group of ‘typical progressors,’ as well as a smaller group that shows relatively rapid progression. We then refine the analysis using ensemble clustering for both markers to obtain a single classification. The information from joint evaluation of the two biomarkers combined with ensemble clustering reveals subgroups of patients not identifiable through univariate analyses; noteworthy subgroups are those that appear to represent recently and chronically infected subsets. 
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s41060-022-00323-2.
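The two building blocks named in the description, a DTW distance between profiles and an ensemble step that combines per-biomarker clusterings, can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say which DTW variant or which consensus method the paper uses, so the absolute-difference local cost, the co-association matrix, the function names, and the toy profiles below are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences.

    Assumption: absolute difference as the local cost, no window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a only, step in b only, step in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def co_association(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster.

    Assumption: co-association is one common ensemble device, used here only
    as an example of combining several clusterings of the same items.
    """
    n = len(labelings[0])
    k = len(labelings)
    return [
        [sum(lab[i] == lab[j] for lab in labelings) / k for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy profiles: same declining shape, one with a repeated first visit.
profile_a = [5.0, 4.0, 3.0, 2.0]
profile_b = [5.0, 5.0, 4.0, 3.0, 2.0]
flat = [2.0, 2.0, 2.0, 2.0]

print(dtw_distance(profile_a, profile_b))  # shapes match after warping
print(dtw_distance(profile_a, flat))       # different shape, larger cost

# Hypothetical cluster labels for three participants from two univariate analyses.
viral_load_labels = [0, 0, 1]
cd4_labels = [0, 1, 1]
print(co_association([viral_load_labels, cd4_labels]))
```

Because DTW aligns sequences of different lengths while preserving time order, the repeated measurement in `profile_b` adds no cost. The co-association matrix can then be passed to any clustering routine that accepts pairwise similarities to obtain a single consensus partition.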
format Online
Article
Text
id pubmed-9064718
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-9064718 2022-05-04 Int J Data Sci Anal Applications Springer International Publishing 2022-05-04 2022 /pmc/articles/PMC9064718/ /pubmed/35528805 http://dx.doi.org/10.1007/s41060-022-00323-2 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Ensemble clustering of longitudinal bivariate HIV biomarker profiles to group patients by patterns of disease progression
topic Applications
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9064718/
https://www.ncbi.nlm.nih.gov/pubmed/35528805
http://dx.doi.org/10.1007/s41060-022-00323-2