Predicting liver cancer on epigenomics data using machine learning
Epigenomics is the branch of biology concerned with the phenotype modifications that do not induce any change in the cell DNA sequence. Epigenetic modifications apply changes to the properties of DNA, which ultimately prevents such DNA actions from being executed. These alterations arise in the canc...
Main authors: | Vekariya, Vishalkumar; Passi, Kalpdrum; Jain, Chakresh Kumar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580905/ https://www.ncbi.nlm.nih.gov/pubmed/36304318 http://dx.doi.org/10.3389/fbinf.2022.954529 |
_version_ | 1784812497848500224 |
---|---|
author | Vekariya, Vishalkumar; Passi, Kalpdrum; Jain, Chakresh Kumar |
collection | PubMed |
description | Epigenomics is the branch of biology concerned with the phenotype modifications that do not induce any change in the cell DNA sequence. Epigenetic modifications apply changes to the properties of DNA, which ultimately prevents such DNA actions from being executed. These alterations arise in the cancer cells, which is the only cause of cancer. The liver is the metabolic cleansing center of the human body and the only organ that can regenerate itself, but liver cancer can stop the cleansing of the body. Machine learning techniques are used in this research to predict the gene expression of liver cells for liver hepatocellular carcinoma (LIHC), which is the third leading cause of cancer death and affects five hundred thousand people per year. The data for LIHC include four different types, namely, methylation, histone, the human genome, and RNA sequences. The data were accessed from The Cancer Genome Atlas (TCGA) through open-source tools in the R programming language. The proposed method considers 1,000 features across the four types of data. Nine feature selection methods and eight classification methods were compared to select the best model over 5-fold cross-validation and different training-to-test ratios. The best model, obtained with 140 features selected by ReliefF and the XGBoost classifier, predicted liver cancer with an AUC of 1.0 and an accuracy of 99.67%. |
format | Online Article Text |
id | pubmed-9580905 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9580905 2022-10-26 Predicting liver cancer on epigenomics data using machine learning Vekariya, Vishalkumar; Passi, Kalpdrum; Jain, Chakresh Kumar Front Bioinform Bioinformatics Frontiers Media S.A. 2022-09-27 /pmc/articles/PMC9580905/ /pubmed/36304318 http://dx.doi.org/10.3389/fbinf.2022.954529 Text en Copyright © 2022 Vekariya, Passi and Jain. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Predicting liver cancer on epigenomics data using machine learning |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580905/ https://www.ncbi.nlm.nih.gov/pubmed/36304318 http://dx.doi.org/10.3389/fbinf.2022.954529 |
work_keys_str_mv | AT vekariyavishalkumar predictinglivercanceronepigenomicsdatausingmachinelearning AT passikalpdrum predictinglivercanceronepigenomicsdatausingmachinelearning AT jainchakreshkumar predictinglivercanceronepigenomicsdatausingmachinelearning |
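
The description field of this record reports that ReliefF feature selection reduced 1,000 candidate features to 140 before classification. As a rough illustration of how Relief-family scoring ranks features, the following is a minimal pure-Python sketch for binary labels. It is not the authors' code or pipeline: the function names `relief_scores` and `top_k_features`, the `n_neighbors` parameter, and the toy data are all assumptions made for this example, and the classification step (XGBoost in the article) is omitted.

```python
import random


def relief_scores(X, y, n_neighbors=1):
    """Score features with a basic Relief-style update (binary labels).

    For every sample, a feature's weight goes down by its average difference
    to the nearest same-class samples (hits) and up by its average difference
    to the nearest other-class samples (misses).
    """
    n_samples, n_features = len(X), len(X[0])
    # Scale differences by each feature's range so features are comparable.
    spans = []
    for j in range(n_features):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        return sum(diff(a, b, j) for j in range(n_features))

    weights = [0.0] * n_features
    for i in range(n_samples):
        # All other samples, ordered from nearest to farthest.
        others = sorted((k for k in range(n_samples) if k != i),
                        key=lambda k: distance(X[i], X[k]))
        hits = [k for k in others if y[k] == y[i]][:n_neighbors]
        misses = [k for k in others if y[k] != y[i]][:n_neighbors]
        for j in range(n_features):
            hit_term = sum(diff(X[i], X[k], j) for k in hits) / max(len(hits), 1)
            miss_term = sum(diff(X[i], X[k], j) for k in misses) / max(len(misses), 1)
            weights[j] += (miss_term - hit_term) / n_samples
    return weights


def top_k_features(weights, k):
    """Return the indices of the k highest-scoring features, best first."""
    return sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)[:k]


# Toy data (hypothetical, not TCGA): feature 0 tracks the label,
# features 1-4 are uniform noise.
rng = random.Random(0)
y = [i % 2 for i in range(40)]
X = [[label + rng.gauss(0, 0.1)] + [rng.random() for _ in range(4)] for label in y]

scores = relief_scores(X, y, n_neighbors=3)
selected = top_k_features(scores, k=2)
```

In a setting like the one the abstract describes, the same ranking step would be run with `k=140` over 1,000 features, and the retained columns would then be passed to a classifier evaluated with 5-fold cross-validation. To avoid leaking test information, the scoring would normally be repeated inside each training fold; whether the article did so is not stated in this record.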