Robust identification of target genes and outliers in triple-negative breast cancer data
Correct classification of breast cancer subtypes is of high importance as it directly affects the therapeutic options. We focus on triple-negative breast cancer, which has the worst prognosis among breast cancer types. Using cutting-edge methods from the field of robust statistics, we analyze Breast...
Main authors: | Segaert, Pieter; Lopes, Marta B; Casimiro, Sandra; Vinga, Susana; Rousseeuw, Peter J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2018 |
Subjects: | Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6745616/ https://www.ncbi.nlm.nih.gov/pubmed/30146936 http://dx.doi.org/10.1177/0962280218794722 |
_version_ | 1783451570844729344 |
---|---|
author | Segaert, Pieter; Lopes, Marta B; Casimiro, Sandra; Vinga, Susana; Rousseeuw, Peter J |
author_facet | Segaert, Pieter; Lopes, Marta B; Casimiro, Sandra; Vinga, Susana; Rousseeuw, Peter J |
author_sort | Segaert, Pieter |
collection | PubMed |
description | Correct classification of breast cancer subtypes is of high importance as it directly affects the therapeutic options. We focus on triple-negative breast cancer, which has the worst prognosis among breast cancer types. Using cutting-edge methods from the field of robust statistics, we analyze Breast Invasive Carcinoma transcriptomic data publicly available from The Cancer Genome Atlas data portal. Our analysis identifies statistical outliers that may correspond to misdiagnosed patients. Furthermore, we illustrate that classical statistical methods may fail to identify outliers because of the heavy influence those outliers exert, which prompts the need for robust statistics. Using robust sparse logistic regression we obtain 36 relevant genes, of which ca. 60% have been previously reported as biologically relevant to triple-negative breast cancer, reinforcing the validity of the method. The remaining 14 genes identified are new potential biomarkers for triple-negative breast cancer. Of these, JAM3, SFT2D2, and PAPSS1 were previously associated with breast tumors or other types of cancer. The relevance of these genes is confirmed by the new DetectDeviatingCells outlier detection technique. A comparison of gene networks on the selected genes showed significant differences between triple-negative breast cancer and non-triple-negative breast cancer data. The individual role of FOXA1 in triple-negative breast cancer and non-triple-negative breast cancer, and the strong FOXA1-AGR2 connection in triple-negative breast cancer, stand out. The goal of our paper is to contribute to the understanding and management of breast cancer and triple-negative breast cancer. At the same time, it demonstrates that robust regression and outlier detection constitute key strategies to cope with high-dimensional clinical data such as omics data. |
format | Online Article Text |
id | pubmed-6745616 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-6745616 2019-10-03 Robust identification of target genes and outliers in triple-negative breast cancer data Segaert, Pieter Lopes, Marta B Casimiro, Sandra Vinga, Susana Rousseeuw, Peter J Stat Methods Med Res Articles SAGE Publications 2018-08-27 2019-11 /pmc/articles/PMC6745616/ /pubmed/30146936 http://dx.doi.org/10.1177/0962280218794722 Text en © The Author(s) 2018 http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Articles Segaert, Pieter Lopes, Marta B Casimiro, Sandra Vinga, Susana Rousseeuw, Peter J Robust identification of target genes and outliers in triple-negative breast cancer data |
title | Robust identification of target genes and outliers in triple-negative breast cancer data |
title_full | Robust identification of target genes and outliers in triple-negative breast cancer data |
title_fullStr | Robust identification of target genes and outliers in triple-negative breast cancer data |
title_full_unstemmed | Robust identification of target genes and outliers in triple-negative breast cancer data |
title_short | Robust identification of target genes and outliers in triple-negative breast cancer data |
title_sort | robust identification of target genes and outliers in triple-negative breast cancer data |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6745616/ https://www.ncbi.nlm.nih.gov/pubmed/30146936 http://dx.doi.org/10.1177/0962280218794722 |
work_keys_str_mv | AT segaertpieter robustidentificationoftargetgenesandoutliersintriplenegativebreastcancerdata AT lopesmartab robustidentificationoftargetgenesandoutliersintriplenegativebreastcancerdata AT casimirosandra robustidentificationoftargetgenesandoutliersintriplenegativebreastcancerdata AT vingasusana robustidentificationoftargetgenesandoutliersintriplenegativebreastcancerdata AT rousseeuwpeterj robustidentificationoftargetgenesandoutliersintriplenegativebreastcancerdata |