Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood

BACKGROUND: Worldwide, breast cancer is the second most common type of cancer after lung cancer. Traditional mammography and tissue microarrays have been studied for early cancer detection and cancer prediction. However, there is a need for more reliable diagnostic tools for early detection of breast c...

Bibliographic Details
Main Authors: Zhang, Fan; Kaufman, Howard L; Deng, Youping; Drabier, Renee
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3552693/
https://www.ncbi.nlm.nih.gov/pubmed/23369435
http://dx.doi.org/10.1186/1755-8794-6-S1-S4
_version_ 1782256701453369344
author Zhang, Fan
Kaufman, Howard L
Deng, Youping
Drabier, Renee
collection PubMed
description BACKGROUND: Worldwide, breast cancer is the second most common type of cancer after lung cancer. Traditional mammography and tissue microarrays have been studied for early cancer detection and cancer prediction. However, there is a need for more reliable diagnostic tools for early detection of breast cancer. Meeting this need is challenging for several reasons. First, obtaining tissue biopsies can be difficult. Second, mammography may not detect small tumors and is often unsatisfactory for younger women, who typically have dense breast tissue. Lastly, breast cancer is not a single homogeneous disease but consists of multiple disease states, each arising from a distinct molecular mechanism and following a distinct clinical progression path, which makes the disease difficult to detect and predict in its early stages.
RESULTS: In this paper, we present a Support Vector Machine based on Recursive Feature Elimination and Cross Validation (SVM-RFE-CV) algorithm for early detection of breast cancer in peripheral blood, and we show how to use SVM-RFE-CV to model the classification and prediction problem of early detection of breast cancer in peripheral blood. The training set, consisting of 32 healthy and 33 cancer samples, and the testing set, consisting of 31 healthy and 34 cancer samples, were randomly separated from a dataset of peripheral blood samples for breast cancer downloaded from the Gene Expression Omnibus. First, we identified 42 biomarkers differentially expressed between "normal" and "cancer". Then, with SVM-RFE-CV, we extracted 15 biomarkers that yield a cross-validation score of zero. Lastly, we compared the classification and prediction performance of SVM-RFE-CV with that of SVM and SVM Recursive Feature Elimination (SVM-RFE).
CONCLUSIONS: We found that 1) SVM-RFE-CV is suitable for analyzing noisy high-throughput microarray data, 2) it outperforms SVM-RFE in robustness to noise and in the ability to recover informative features, and 3) it can improve the prediction performance (Area Under Curve) on the testing data set from 0.5826 to 0.7879. Further pathway analysis showed that the biomarkers are associated with Signaling, Hemostasis, Hormones, and Immune System, which is consistent with previous findings. Our prediction model can serve as a general model for biomarker discovery in early detection of other cancers. In the future, Polymerase Chain Reaction (PCR) is planned to validate the ability of these potential biomarkers to detect breast cancer early.
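The abstract does not give implementation details, so the following is only a generic sketch of the idea it names: train a linear SVM, drop the feature with the smallest absolute weight, and keep the subset with the lowest cross-validation error. Everything here is an assumption made for illustration and not the authors' code: the helper names (`train_linear_svm`, `cv_error`, `svm_rfe_cv`, `make_toy_data`), the subgradient-descent trainer, the hyperparameters, the modulo-based fold assignment, and the synthetic data.

```python
# Illustrative sketch only: a plain-Python recursive feature elimination loop
# scored by k-fold cross-validation. Not the published SVM-RFE-CV implementation.

def train_linear_svm(X, y, epochs=100, lr=0.1, lam=0.01):
    """Full-batch subgradient descent on the L2-regularised hinge loss.
    X is a list of feature lists, y holds labels in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample lies inside the margin or is misclassified
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

def cv_error(X, y, k=5):
    """k-fold cross-validation error rate; sample i goes to fold i % k."""
    errors = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        w, b = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        errors += sum(predict(w, b, X[i]) != y[i] for i in test)
    return errors / len(X)

def svm_rfe_cv(X, y, k=5):
    """Eliminate one feature per round; return the subset with the lowest CV error."""
    active = list(range(len(X[0])))
    best_subset, best_err = list(active), None
    while active:
        Xa = [[row[j] for j in active] for row in X]
        err = cv_error(Xa, y, k)
        if best_err is None or err <= best_err:  # ties favour the smaller subset
            best_subset, best_err = list(active), err
        w, _ = train_linear_svm(Xa, y)
        worst = min(range(len(active)), key=lambda j: abs(w[j]))
        del active[worst]  # drop the feature with the smallest |weight|
    return best_subset, best_err

def make_toy_data(n=20, n_noise=3):
    """Hypothetical data: feature 0 tracks the label, the rest are uninformative."""
    y = [1 if i % 2 == 0 else -1 for i in range(n)]
    X = [[2.0 * y[i]] + [((i * (j + 1)) % 5 - 2) * 0.1 for j in range(n_noise)]
         for i in range(n)]
    return X, y

X, y = make_toy_data()
subset, err = svm_rfe_cv(X, y)
print(subset, err)
```

On this toy data only the single informative feature survives, with a cross-validation error of zero. Real expression data would normally call for a tested SVM library, feature scaling, and stratified folds.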
format Online
Article
Text
id pubmed-3552693
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3552693 2013-01-28 Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood Zhang, Fan; Kaufman, Howard L; Deng, Youping; Drabier, Renee BMC Med Genomics Research BioMed Central 2013-01-23 /pmc/articles/PMC3552693/ /pubmed/23369435 http://dx.doi.org/10.1186/1755-8794-6-S1-S4 Text en Copyright ©2013 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Recursive SVM biomarker selection for early detection of breast cancer in peripheral blood
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3552693/
https://www.ncbi.nlm.nih.gov/pubmed/23369435
http://dx.doi.org/10.1186/1755-8794-6-S1-S4