Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration

Identifying the cell of origin of cancer is important to guide treatment decisions. Machine learning approaches have been proposed to classify the cell of origin based on somatic mutation profiles from solid biopsies. However, solid biopsies can cause complications and certain tumors are not accessi...

Bibliographic Details
Main authors: Danyi, Alexandra; Jager, Myrthe; de Ridder, Jeroen
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8780455/
https://www.ncbi.nlm.nih.gov/pubmed/35054395
http://dx.doi.org/10.3390/life12010001
_version_ 1784637842989776896
author Danyi, Alexandra
Jager, Myrthe
de Ridder, Jeroen
author_facet Danyi, Alexandra
Jager, Myrthe
de Ridder, Jeroen
author_sort Danyi, Alexandra
collection PubMed
description Identifying the cell of origin of cancer is important to guide treatment decisions. Machine learning approaches have been proposed to classify the cell of origin based on somatic mutation profiles from solid biopsies. However, solid biopsies can cause complications and certain tumors are not accessible. Liquid biopsies are promising alternatives but their somatic mutation profile is sparse and current machine learning models fail to perform in this setting. We propose an improved method to deal with sparsity in liquid biopsy data. Firstly, data augmentation is performed on sparse data to enhance model robustness. Secondly, we employ data integration to merge information from: (i) SNV density; (ii) SNVs in driver genes and (iii) trinucleotide motifs. Our adapted method achieves an average accuracy of 0.88 and 0.65 on data where only 70% and 2% of SNVs are retained, compared to 0.83 and 0.41 with the original model, respectively. The method and results presented here open the way for application of machine learning in the detection of the cell of origin of cancer from liquid biopsy data.
format Online
Article
Text
id pubmed-8780455
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8780455 2022-01-22 Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration Danyi, Alexandra Jager, Myrthe de Ridder, Jeroen Life (Basel) Article Identifying the cell of origin of cancer is important to guide treatment decisions. Machine learning approaches have been proposed to classify the cell of origin based on somatic mutation profiles from solid biopsies. However, solid biopsies can cause complications and certain tumors are not accessible. Liquid biopsies are promising alternatives but their somatic mutation profile is sparse and current machine learning models fail to perform in this setting. We propose an improved method to deal with sparsity in liquid biopsy data. Firstly, data augmentation is performed on sparse data to enhance model robustness. Secondly, we employ data integration to merge information from: (i) SNV density; (ii) SNVs in driver genes and (iii) trinucleotide motifs. Our adapted method achieves an average accuracy of 0.88 and 0.65 on data where only 70% and 2% of SNVs are retained, compared to 0.83 and 0.41 with the original model, respectively. The method and results presented here open the way for application of machine learning in the detection of the cell of origin of cancer from liquid biopsy data. MDPI 2021-12-21 /pmc/articles/PMC8780455/ /pubmed/35054395 http://dx.doi.org/10.3390/life12010001 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Danyi, Alexandra
Jager, Myrthe
de Ridder, Jeroen
Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration
title Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration
title_sort cancer type classification in liquid biopsies based on sparse mutational profiles enabled through data augmentation and integration
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8780455/
https://www.ncbi.nlm.nih.gov/pubmed/35054395
http://dx.doi.org/10.3390/life12010001