Identifying homogeneous subgroups of patients and important features: a topological machine learning approach
Main Authors: | , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8451168/ https://www.ncbi.nlm.nih.gov/pubmed/34544357 http://dx.doi.org/10.1186/s12859-021-04360-9 |
Summary: | BACKGROUND: This paper exploits recent developments in topological data analysis to present a pipeline for clustering based on Mapper, an algorithm that reduces complex data into a one-dimensional graph. RESULTS: We present a pipeline to identify and summarise clusters based on statistically significant topological features from a point cloud using Mapper. CONCLUSIONS: Key strengths of this pipeline include the integration of prior knowledge to inform the clustering process and the selection of optimal clusters; the use of the bootstrap to restrict the search to robust topological features; the use of machine learning to inspect clusters; and the ability to incorporate mixed data types. Our pipeline can be downloaded under the GNU GPLv3 license at https://github.com/kcl-bhi/mapper-pipeline. |
---|