
Using Classification and K-means Methods to Predict Breast Cancer Recurrence in Gene Expression Data

BACKGROUND: Breast cancer is a type of cancer that starts in the breast tissue and affects about 10% of women at different stages of their lives. In this study, we applied a new method to predict recurrence in biological networks made from gene expression data. METHOD: The method includes the steps...


Bibliographic Details
Main Authors: Sehhati, Mohammadreza; Tabatabaiefar, Mohammad Amin; Gholami, Ali Haji; Sattari, Mohammad
Format: Online Article Text
Language: English
Published: Wolters Kluwer - Medknow 2022
Subjects: Original Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9215834/
https://www.ncbi.nlm.nih.gov/pubmed/35755980
http://dx.doi.org/10.4103/jmss.jmss_117_21
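
The abstract in this record names clustering, selection of differentiating genes, and classification as the steps of the method, but it does not say how they are combined. The following is only a minimal sketch of one possible arrangement, under stated assumptions: the data are synthetic, genes are ranked by the absolute difference of class means, k-means groups the samples, and a nearest-centroid classifier stands in for the random forest used in the study. All function names and parameters here are hypothetical and are not taken from the article.

```python
import random
from statistics import mean

def make_data(n_samples=60, n_genes=50, n_informative=5, shift=4.0, seed=0):
    """Synthetic expression matrix (hypothetical); label 1 = recurrence."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):  # the first genes carry the class signal
            row[g] += shift * label
        X.append(row)
        y.append(label)
    return X, y

def select_genes(X, y, n_top):
    """Rank genes by absolute difference of class means; keep the top n_top."""
    def score(g):
        pos = mean(row[g] for row, lab in zip(X, y) if lab == 1)
        neg = mean(row[g] for row, lab in zip(X, y) if lab == 0)
        return abs(pos - neg)
    return sorted(range(len(X[0])), key=score, reverse=True)[:n_top]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def centroid(rows):
    return [mean(col) for col in zip(*rows)]

def nearest(row, centers):
    return min(range(len(centers)), key=lambda j: sq_dist(row, centers[j]))

def class_centroids(rows, labels):
    """Map each class label to the centroid of its rows."""
    by_class = {}
    for row, lab in zip(rows, labels):
        by_class.setdefault(lab, []).append(row)
    return {lab: centroid(group) for lab, group in by_class.items()}

def kmeans(X, k=2, n_iter=20):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [X[0]]
    while len(centers) < k:
        centers.append(max(X, key=lambda r: min(sq_dist(r, c) for c in centers)))
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for row in X:
            groups[nearest(row, centers)].append(row)
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return centers

def fit(X, y, n_top=5, k=2):
    genes = select_genes(X, y, n_top)
    Xs = [[row[g] for g in genes] for row in X]
    centers = kmeans(Xs, k)
    overall = class_centroids(Xs, y)  # fallback for an empty cluster
    models = []
    for j in range(k):
        members = [(r, lab) for r, lab in zip(Xs, y) if nearest(r, centers) == j]
        local = class_centroids([r for r, _ in members], [lab for _, lab in members])
        models.append(local or overall)
    return genes, centers, models

def predict(model, row):
    genes, centers, models = model
    r = [row[g] for g in genes]
    local = models[nearest(r, centers)]  # classifier of the nearest cluster
    return min(local, key=lambda lab: sq_dist(r, local[lab]))

X, y = make_data()
train_X, train_y = X[:40], y[:40]
test_X, test_y = X[40:], y[40:]
model = fit(train_X, train_y)
predictions = [predict(model, row) for row in test_X]
accuracy = mean(int(p == t) for p, t in zip(predictions, test_y))
print("selected genes:", sorted(model[0]))
print("held-out accuracy:", accuracy)
```

Gene selection and clustering are fitted on the training split only, so the held-out accuracy is not inflated by information from the test samples. The study itself reports 30 differentiating genes out of 12,172 across 200 samples; the sizes used here are arbitrary.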
author Sehhati, Mohammadreza
Tabatabaiefar, Mohammad Amin
Gholami, Ali Haji
Sattari, Mohammad
collection PubMed
description BACKGROUND: Breast cancer is a type of cancer that starts in the breast tissue and affects about 10% of women at different stages of their lives. In this study, we applied a new method to predict recurrence in biological networks made from gene expression data. METHOD: The method includes data collection, clustering, determining differentiating genes, and classification. Eight techniques were implemented on 12,172 genes and 200 samples: random forest, support vector machine, neural network, random forest + k-means, hidden Markov model, joint mutual information, neural network + k-means, and support vector machine + k-means. RESULTS: Thirty genes were identified as differentiating genes and used for classification. Random forest + k-means performed better than the other techniques. Neural network + k-means and random forest + k-means performed better than the other techniques in identifying high-risk cases. CONCLUSION: Thirty of the 12,172 genes were used for classification, and the use of clustering improved the performance of the classification techniques.
format Online
Article
Text
id pubmed-9215834
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Wolters Kluwer - Medknow
record_format MEDLINE/PubMed
spelling pubmed-9215834 2022-06-23 Using Classification and K-means Methods to Predict Breast Cancer Recurrence in Gene Expression Data Sehhati, Mohammadreza Tabatabaiefar, Mohammad Amin Gholami, Ali Haji Sattari, Mohammad J Med Signals Sens Original Article BACKGROUND: Breast cancer is a type of cancer that starts in the breast tissue and affects about 10% of women at different stages of their lives. In this study, we applied a new method to predict recurrence in biological networks made from gene expression data. METHOD: The method includes data collection, clustering, determining differentiating genes, and classification. Eight techniques were implemented on 12,172 genes and 200 samples: random forest, support vector machine, neural network, random forest + k-means, hidden Markov model, joint mutual information, neural network + k-means, and support vector machine + k-means. RESULTS: Thirty genes were identified as differentiating genes and used for classification. Random forest + k-means performed better than the other techniques. Neural network + k-means and random forest + k-means performed better than the other techniques in identifying high-risk cases. CONCLUSION: Thirty of the 12,172 genes were used for classification, and the use of clustering improved the performance of the classification techniques. Wolters Kluwer - Medknow 2022-05-12 /pmc/articles/PMC9215834/ /pubmed/35755980 http://dx.doi.org/10.4103/jmss.jmss_117_21 Text en Copyright: © 2022 Journal of Medical Signals & Sensors https://creativecommons.org/licenses/by-nc-sa/4.0/ This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.
title Using Classification and K-means Methods to Predict Breast Cancer Recurrence in Gene Expression Data
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9215834/
https://www.ncbi.nlm.nih.gov/pubmed/35755980
http://dx.doi.org/10.4103/jmss.jmss_117_21