
Exploratory Data Mining for Subgroup Cohort Discoveries and Prioritization


Bibliographic Details
Main authors: Liu, Danlu; Baskett, William; Beversdorf, David; Shyu, Chi-Ren
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9341221/
https://www.ncbi.nlm.nih.gov/pubmed/31494566
http://dx.doi.org/10.1109/JBHI.2019.2939149
Description
Summary: Finding small, homogeneous subgroup cohorts in large, heterogeneous populations is a critical step in hypothesis development for biomedical research. Current computational approaches still lack robust answers to the question “what hypotheses are likely to be novel and to produce clinically relevant results with well thought-out study designs?” We have developed a novel subgroup discovery method that employs a deep exploratory mining process to slice and dice thousands of potential subpopulations and to prioritize potential cohorts based on explainable contrast patterns that may provide interventionable insights. We conducted computational experiments on both synthesized data and a clinical autism data set to assess performance quantitatively, for coverage of pre-defined cohorts, and qualitatively, for novel knowledge discovery, respectively. We also conducted a scaling analysis in a distributed computing environment to estimate the computational resources needed as the number of subpopulations increases. This work will provide a robust data-driven framework to automatically tailor potential interventions for precision health.