Predicting breast cancer risk using interacting genetic and demographic factors and machine learning

Breast cancer (BC) is a multifactorial disease and the most common cancer in women worldwide. We describe a machine learning approach to identify a combination of interacting genetic variants (SNPs) and demographic risk factors for BC, especially factors related to both familial history (Group 1) an...

Bibliographic Details
Main authors: Behravan, Hamid; Hartikainen, Jaana M.; Tengström, Maria; Kosma, Veli–Matti; Mannermaa, Arto
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7338351/
https://www.ncbi.nlm.nih.gov/pubmed/32632202
http://dx.doi.org/10.1038/s41598-020-66907-9
_version_ 1783554654448123904
author Behravan, Hamid
Hartikainen, Jaana M.
Tengström, Maria
Kosma, Veli–Matti
Mannermaa, Arto
collection PubMed
description Breast cancer (BC) is a multifactorial disease and the most common cancer in women worldwide. We describe a machine learning approach to identify a combination of interacting genetic variants (SNPs) and demographic risk factors for BC, especially factors related to both familial history (Group 1) and oestrogen metabolism (Group 2), for predicting BC risk. This approach identifies the best combinations of interacting genetic and demographic risk factors that yield the highest BC risk prediction accuracy. In tests on the Kuopio Breast Cancer Project (KBCP) dataset, our approach achieves a mean average precision (mAP) of 77.78 in predicting BC risk by using interacting genetic and Group 1 features, which is better than the mAPs of 74.19 and 73.65 achieved using only Group 1 features and interacting SNPs, respectively. Similarly, using interacting genetic and Group 2 features yields a mAP of 78.00, which outperforms the system based on only Group 2 features, which has a mAP of 72.57. Furthermore, the gene interaction maps built from genes associated with SNPs that interact with demographic risk factors indicate important BC-related biological entities, such as angiogenesis, apoptosis and oestrogen-related networks. The results also show that demographic risk factors are individually more important than genetic variants in predicting BC risk.
format Online
Article
Text
id pubmed-7338351
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7338351 2020-07-07 Sci Rep Article
Nature Publishing Group UK 2020-07-06 /pmc/articles/PMC7338351/ /pubmed/32632202 http://dx.doi.org/10.1038/s41598-020-66907-9 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Predicting breast cancer risk using interacting genetic and demographic factors and machine learning
topic Article