Machine Learning to Discern Interactive Clusters of Risk Factors for Late Recurrence of Metastatic Breast Cancer

Bibliographic Details
Main Authors: Gomez Marti, Juan Luis, Brufsky, Adam, Wells, Alan, Jiang, Xia
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8750735/
https://www.ncbi.nlm.nih.gov/pubmed/35008417
http://dx.doi.org/10.3390/cancers14010253
Description
Summary: SIMPLE SUMMARY: Breast cancer is the most frequently diagnosed cancer and the second leading cause of cancer-related death among women worldwide. After initial tumor resection, breast cancer may recur locally and/or in distant organs within several months to years, or even decades. Multiple methods exist to prognosticate disease progression in the early months and years after diagnosis. However, further efforts are needed to identify risk factors that relate to recurrence beyond the initial 5-year window. In this study, we applied machine learning to retrieve single and interactive clinical and pathological risk factors of 5-, 10-, and 15-year metastases.
ABSTRACT: Background: Risk of metastatic recurrence of breast cancer after initial diagnosis and treatment depends on the presence of a number of risk factors. Although most univariate risk factors have been identified using classical methods, machine-learning methods are also being used to tease out non-obvious contributors to a patient’s individual risk of developing late distant metastasis. Bayesian-network algorithms can identify not only risk factors but also interactions among them, which may in turn increase the risk of developing metastatic breast cancer. We proposed to apply a previously developed machine-learning method to discern risk factors of 5-, 10-, and 15-year metastases.
Methods: We applied a previously validated algorithm named the Markov Blanket and Interactive Risk Factor Learner (MBIL) to the electronic health record (EHR)-based Lynn Sage Database (LSDB) from the Lynn Sage Comprehensive Breast Center at Northwestern Memorial Hospital. This algorithm output both single and interactive risk factors of 5-, 10-, and 15-year metastases from the LSDB. We individually examined and interpreted the clinical relevance of these interactions based on years to metastasis and reliance on interactivity between risk factors.
Results: We found that, with lower alpha values (low interactivity score), the prevalence of variables with an independent influence on long-term metastasis was higher (e.g., HER2, TNEG). As the value of alpha increased to 480, stronger interactions were needed to define clusters of factors that increased the risk of metastasis (e.g., ER, smoking, race, alcohol usage).
Conclusion: MBIL identified single and interacting risk factors of metastatic breast cancer, many of which were supported by clinical evidence. These results strongly support further large-scale studies on different databases to validate the degree to which some of these variables affect metastatic breast cancer in the long term.