Using Classification and K-means Methods to Predict Breast Cancer Recurrence in Gene Expression Data
BACKGROUND: Breast cancer is a type of cancer that starts in the breast tissue and affects about 10% of women at different stages of their lives. In this study, we applied a new method to predict recurrence in biological networks made from gene expression data. METHOD: The method includes the steps...
Main Authors: | Sehhati, Mohammadreza; Tabatabaiefar, Mohammad Amin; Gholami, Ali Haji; Sattari, Mohammad |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Wolters Kluwer - Medknow 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9215834/ https://www.ncbi.nlm.nih.gov/pubmed/35755980 http://dx.doi.org/10.4103/jmss.jmss_117_21 |
_version_ | 1784731295443582976 |
---|---|
author | Sehhati, Mohammadreza Tabatabaiefar, Mohammad Amin Gholami, Ali Haji Sattari, Mohammad |
author_sort | Sehhati, Mohammadreza |
collection | PubMed |
description | BACKGROUND: Breast cancer is a type of cancer that starts in the breast tissue and affects about 10% of women at different stages of their lives. In this study, we applied a new method to predict recurrence in biological networks made from gene expression data. METHOD: The method includes data collection, clustering, determination of differentiating genes, and classification. Eight techniques were implemented on 12,172 genes and 200 samples: random forest, support vector machine, neural network, random forest + k-means, hidden Markov model, joint mutual information, neural network + k-means, and support vector machine + k-means. RESULTS: Thirty genes were identified as differentiating genes and used for classification. Random forest + k-means performed better than the other techniques. Two techniques, neural network + k-means and random forest + k-means, performed better than the others in identifying high-risk cases. CONCLUSION: Thirty of the 12,172 genes were used for classification, and the use of clustering improved the performance of the classification techniques. |
format | Online Article Text |
id | pubmed-9215834 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Wolters Kluwer - Medknow |
record_format | MEDLINE/PubMed |
spelling | pubmed-9215834 2022-06-23 Using Classification and K-means Methods to Predict Breast Cancer Recurrence in Gene Expression Data Sehhati, Mohammadreza Tabatabaiefar, Mohammad Amin Gholami, Ali Haji Sattari, Mohammad J Med Signals Sens Original Article BACKGROUND: Breast cancer is a type of cancer that starts in the breast tissue and affects about 10% of women at different stages of their lives. In this study, we applied a new method to predict recurrence in biological networks made from gene expression data. METHOD: The method includes data collection, clustering, determination of differentiating genes, and classification. Eight techniques were implemented on 12,172 genes and 200 samples: random forest, support vector machine, neural network, random forest + k-means, hidden Markov model, joint mutual information, neural network + k-means, and support vector machine + k-means. RESULTS: Thirty genes were identified as differentiating genes and used for classification. Random forest + k-means performed better than the other techniques. Two techniques, neural network + k-means and random forest + k-means, performed better than the others in identifying high-risk cases. CONCLUSION: Thirty of the 12,172 genes were used for classification, and the use of clustering improved the performance of the classification techniques. Wolters Kluwer - Medknow 2022-05-12 /pmc/articles/PMC9215834/ /pubmed/35755980 http://dx.doi.org/10.4103/jmss.jmss_117_21 Text en Copyright: © 2022 Journal of Medical Signals & Sensors https://creativecommons.org/licenses/by-nc-sa/4.0/ This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms. |
spellingShingle | Original Article Sehhati, Mohammadreza Tabatabaiefar, Mohammad Amin Gholami, Ali Haji Sattari, Mohammad Using Classification and K-means Methods to Predict Breast Cancer Recurrence in Gene Expression Data |
title | Using Classification and K-means Methods to Predict Breast Cancer Recurrence in Gene Expression Data |
title_sort | using classification and k-means methods to predict breast cancer recurrence in gene expression data |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9215834/ https://www.ncbi.nlm.nih.gov/pubmed/35755980 http://dx.doi.org/10.4103/jmss.jmss_117_21 |
work_keys_str_mv | AT sehhatimohammadreza usingclassificationandkmeansmethodstopredictbreastcancerrecurrenceingeneexpressiondata AT tabatabaiefarmohammadamin usingclassificationandkmeansmethodstopredictbreastcancerrecurrenceingeneexpressiondata AT gholamialihaji usingclassificationandkmeansmethodstopredictbreastcancerrecurrenceingeneexpressiondata AT sattarimohammad usingclassificationandkmeansmethodstopredictbreastcancerrecurrenceingeneexpressiondata |
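
The abstract lists the steps (clustering, determining differentiating genes, classification) but does not say how they are combined. The sketch below is one plausible reading, not the authors' implementation: genes are clustered with k-means on their expression profiles, the gene with the largest between-class mean difference is kept from each cluster, and samples are then classified on those genes. Everything here is an assumption made for illustration: the synthetic data, the sizes, the scoring rule, and the nearest-centroid classifier, which stands in for random forest only to keep the example dependency-free.

```python
import random
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = [mean(col) for col in zip(*members)]
    return labels


# Synthetic stand-in for an expression matrix (hypothetical sizes, not the
# paper's 200 x 12,172 data): the first N_INFORMATIVE genes are shifted
# upward in the "recurrence" class (label 1), the rest are pure noise.
rng = random.Random(42)
N_SAMPLES, N_GENES, N_INFORMATIVE, SHIFT = 40, 30, 5, 4.0
y = [i % 2 for i in range(N_SAMPLES)]
X = [[rng.gauss(0, 1) + (SHIFT if y[i] == 1 and g < N_INFORMATIVE else 0.0)
      for g in range(N_GENES)]
     for i in range(N_SAMPLES)]
train, test = range(0, 20), range(20, 40)

# Step 1: cluster genes by their expression profile across training samples.
profiles = [[X[i][g] for i in train] for g in range(N_GENES)]
gene_cluster = kmeans(profiles, k=3)


# Step 2: score each gene by the absolute difference of its class means and
# keep the best-scoring gene from every non-empty cluster.
def class_mean(g, label):
    return mean(X[i][g] for i in train if y[i] == label)


score = [abs(class_mean(g, 1) - class_mean(g, 0)) for g in range(N_GENES)]
selected = []
for c in sorted(set(gene_cluster)):
    genes = [g for g in range(N_GENES) if gene_cluster[g] == c]
    selected.append(max(genes, key=lambda g: score[g]))

# Step 3: classify held-out samples with a nearest-centroid rule on the
# selected genes (a stand-in for the random forest named in the abstract).
centroids = {label: [class_mean(g, label) for g in selected]
             for label in (0, 1)}


def predict(i):
    x = [X[i][g] for g in selected]
    return min(centroids, key=lambda label: sq_dist(x, centroids[label]))


accuracy = mean(1.0 if predict(i) == y[i] else 0.0 for i in test)
print("selected genes:", selected)
print("held-out accuracy:", accuracy)
```

Selecting one gene per cluster is a simple way to avoid keeping several highly correlated genes; a real analysis would replace the stand-in classifier with a proper random forest and evaluate with cross-validation rather than a single split.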