Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood
BACKGROUND: Worldwide, breast cancer is the second most common type of cancer after lung cancer. Traditional mammography and tissue microarrays have been studied for early cancer detection and cancer prediction. However, there is a need for more reliable diagnostic tools for early detection of breast c...
Main authors: | Zhang, Fan; Kaufman, Howard L; Deng, Youping; Drabier, Renee |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3552693/ https://www.ncbi.nlm.nih.gov/pubmed/23369435 http://dx.doi.org/10.1186/1755-8794-6-S1-S4 |
_version_ | 1782256701453369344 |
---|---|
author | Zhang, Fan; Kaufman, Howard L; Deng, Youping; Drabier, Renee |
author_facet | Zhang, Fan; Kaufman, Howard L; Deng, Youping; Drabier, Renee |
author_sort | Zhang, Fan |
collection | PubMed |
description | BACKGROUND: Worldwide, breast cancer is the second most common type of cancer after lung cancer. Traditional mammography and tissue microarrays have been studied for early cancer detection and cancer prediction. However, there is a need for more reliable diagnostic tools for early detection of breast cancer. Meeting this need is challenging for several reasons. First, obtaining tissue biopsies can be difficult. Second, mammography may not detect small tumors and is often unsatisfactory for younger women, who typically have dense breast tissue. Lastly, breast cancer is not a single homogeneous disease but consists of multiple disease states, each arising from a distinct molecular mechanism and having a distinct clinical progression path, which makes the disease difficult to detect and predict in its early stages. RESULTS: In this paper, we present a Support Vector Machine based on Recursive Feature Elimination and Cross Validation (SVM-RFE-CV) algorithm and show how to use it to model the classification and prediction problem of early detection of breast cancer in peripheral blood. The training set, consisting of 32 healthy and 33 cancer samples, and the testing set, consisting of 31 healthy and 34 cancer samples, were randomly separated from a peripheral blood breast cancer dataset downloaded from the Gene Expression Omnibus. First, we identified 42 biomarkers differentially expressed between "normal" and "cancer". Then, with SVM-RFE-CV, we extracted 15 biomarkers that yield a cross-validation score of zero. Lastly, we compared the classification and prediction performance of SVM-RFE-CV with that of SVM and SVM Recursive Feature Elimination (SVM-RFE).
CONCLUSIONS: We found that 1) SVM-RFE-CV is suitable for analyzing noisy high-throughput microarray data, 2) it outperforms SVM-RFE in robustness to noise and in the ability to recover informative features, and 3) it can improve the prediction performance (Area Under Curve) on the testing data set from 0.5826 to 0.7879. Further pathway analysis showed that the biomarkers are associated with Signaling, Hemostasis, Hormones, and Immune System, which is consistent with previous findings. Our prediction model can serve as a general model for biomarker discovery in early detection of other cancers. In the future, Polymerase Chain Reaction (PCR) validation is planned to confirm the ability of these potential biomarkers to detect breast cancer early. |
format | Online Article Text |
id | pubmed-3552693 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3552693 2013-01-28 Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood Zhang, Fan Kaufman, Howard L Deng, Youping Drabier, Renee BMC Med Genomics Research BioMed Central 2013-01-23 /pmc/articles/PMC3552693/ /pubmed/23369435 http://dx.doi.org/10.1186/1755-8794-6-S1-S4 Text en Copyright ©2013 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Zhang, Fan Kaufman, Howard L Deng, Youping Drabier, Renee Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood |
title | Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood |
title_full | Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood |
title_fullStr | Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood |
title_full_unstemmed | Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood |
title_short | Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood |
title_sort | recursive svm biomarker selection for early detection of breast cancer in peripheral blood |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3552693/ https://www.ncbi.nlm.nih.gov/pubmed/23369435 http://dx.doi.org/10.1186/1755-8794-6-S1-S4 |
work_keys_str_mv | AT zhangfan recursivesvmbiomarkerselectionforearlydetectionofbreastcancerinperipheralblood AT kaufmanhowardl recursivesvmbiomarkerselectionforearlydetectionofbreastcancerinperipheralblood AT dengyouping recursivesvmbiomarkerselectionforearlydetectionofbreastcancerinperipheralblood AT drabierrenee recursivesvmbiomarkerselectionforearlydetectionofbreastcancerinperipheralblood |
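
The abstract in this record describes recursive feature elimination driven by a linear SVM, with cross validation used to decide which biomarker subset to keep. The record does not give the authors' exact procedure, so the following is only a minimal sketch of one plausible reading: drop the feature with the smallest squared weight at each step, estimate the k-fold cross-validation error of each surviving subset, and keep the smallest subset with the lowest error. The function names (`train_linear_svm`, `cv_error`, `svm_rfe_cv`), the subgradient-descent trainer, its hyperparameters, and the synthetic data are all assumptions made for illustration; none of them come from the paper.

```python
import random

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Full-batch subgradient descent on lam/2*||w||^2 + mean hinge loss.
    Labels must be +1 / -1. Hyperparameters are illustrative assumptions."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def cv_error(X, y, features, k=5):
    """k-fold cross-validation error rate using only the given feature columns."""
    n = len(X)
    errors = 0
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        test_idx = [i for i in range(n) if i % k == fold]
        X_train = [[X[i][j] for j in features] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        w, b = train_linear_svm(X_train, y_train)
        for i in test_idx:
            score = dot(w, [X[i][j] for j in features]) + b
            predicted = 1 if score >= 0 else -1
            errors += predicted != y[i]
    return errors / n

def svm_rfe_cv(X, y, k=5):
    """Recursive feature elimination; returns the subset with the lowest CV error."""
    features = list(range(len(X[0])))
    best_features, best_err = list(features), cv_error(X, y, features, k)
    while len(features) > 1:
        X_sub = [[row[j] for j in features] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Eliminate the feature whose squared weight is smallest.
        weakest = min(range(len(features)), key=lambda p: w[p] ** 2)
        features.pop(weakest)
        err = cv_error(X, y, features, k)
        if err <= best_err:  # on ties, prefer the smaller subset
            best_features, best_err = list(features), err
    return best_features, best_err

# Synthetic stand-in for expression data: column 0 is informative, the rest is noise.
rng = random.Random(0)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = [[2.0 * label + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(5)]
     for label in y]

selected, err = svm_rfe_cv(X, y)
print("selected feature indices:", selected)
print("cross-validation error:", err)
```

In practice a library implementation (for example, a linear SVM combined with a cross-validated recursive feature elimination routine) would likely replace the hand-written trainer; the pure-Python version is used here only to keep the sketch self-contained.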