Machine Learning Approaches to Classify Primary and Metastatic Cancers Using Tissue of Origin-Based DNA Methylation Profiles

SIMPLE SUMMARY: Cancer metastasis is considered to be one of the most significant causes of cancer morbidity, accounting for up to 90% of cancer deaths. The accurate identification of a cancer’s origin and the types of cancer cells it comprises is crucial in enabling clinicians to decide better trea...

Bibliographic Details
Main Authors: Modhukur, Vijayachitra; Sharma, Shakshi; Mondal, Mainak; Lawarde, Ankita; Kask, Keiu; Sharma, Rajesh; Salumets, Andres
Format: Online Article Text
Language: English
Published: MDPI 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8345047/
https://www.ncbi.nlm.nih.gov/pubmed/34359669
http://dx.doi.org/10.3390/cancers13153768
_version_ 1783734535557480448
author Modhukur, Vijayachitra
Sharma, Shakshi
Mondal, Mainak
Lawarde, Ankita
Kask, Keiu
Sharma, Rajesh
Salumets, Andres
collection PubMed
description SIMPLE SUMMARY: Cancer metastasis is considered to be one of the most significant causes of cancer morbidity, accounting for up to 90% of cancer deaths. The accurate identification of a cancer’s origin and the types of cancer cells it comprises is crucial in enabling clinicians to decide on better treatment options for patients. DNA methylation changes are increasingly recognized as informative for cancer prediction, especially for the transition to metastasis. Research in the last decade has shown the promise of artificial intelligence (AI) in cancer classification. In this study, we applied several machine learning techniques, a branch of AI, to identify a cancer's tissue of origin and further classified cancer samples as primary or metastatic based on publicly available DNA methylation data. Overall, our analysis resulted in 99% accuracy for predicting cancer subtypes based on the tissue of origin.
ABSTRACT: Metastatic cancers account for up to 90% of cancer-related deaths. The clear differentiation of metastatic cancers from primary cancers is crucial for cancer type identification and for developing targeted treatment for each cancer type. DNA methylation patterns are suggested to be an intriguing target for cancer prediction and are also considered to be an important mediator of the transition to metastatic cancer. In the present study, we used 24 cancer types and 9303 methylome samples downloaded from publicly available data repositories, including The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO). We constructed machine learning classifiers to discriminate metastatic, primary, and non-cancerous methylome samples. We applied support vector machines (SVM), Naive Bayes (NB), extreme gradient boosting (XGBoost), and random forest (RF) machine learning models to classify the cancer types based on their tissue of origin. RF outperformed the other classifiers, with an average accuracy of 99%. Moreover, we applied local interpretable model-agnostic explanations (LIME) to explain which methylation biomarkers are important for classifying cancer types.
format Online
Article
Text
id pubmed-8345047
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8345047 2021-08-07 Cancers (Basel) Article MDPI 2021-07-27 /pmc/articles/PMC8345047/ /pubmed/34359669 http://dx.doi.org/10.3390/cancers13153768 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Machine Learning Approaches to Classify Primary and Metastatic Cancers Using Tissue of Origin-Based DNA Methylation Profiles
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8345047/
https://www.ncbi.nlm.nih.gov/pubmed/34359669
http://dx.doi.org/10.3390/cancers13153768