Robust identification of target genes and outliers in triple-negative breast cancer data

Correct classification of breast cancer subtypes is of high importance as it directly affects the therapeutic options. We focus on triple-negative breast cancer, which has the worst prognosis among breast cancer types. Using cutting-edge methods from the field of robust statistics, we analyze Breast...

Bibliographic Details
Main Authors: Segaert, Pieter; Lopes, Marta B; Casimiro, Sandra; Vinga, Susana; Rousseeuw, Peter J
Format: Online Article Text
Language: English
Published: SAGE Publications 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6745616/
https://www.ncbi.nlm.nih.gov/pubmed/30146936
http://dx.doi.org/10.1177/0962280218794722
author Segaert, Pieter
Lopes, Marta B
Casimiro, Sandra
Vinga, Susana
Rousseeuw, Peter J
author_sort Segaert, Pieter
collection PubMed
description Correct classification of breast cancer subtypes is of high importance as it directly affects the therapeutic options. We focus on triple-negative breast cancer, which has the worst prognosis among breast cancer types. Using cutting-edge methods from the field of robust statistics, we analyze Breast Invasive Carcinoma transcriptomic data publicly available from The Cancer Genome Atlas data portal. Our analysis identifies statistical outliers that may correspond to misdiagnosed patients. Furthermore, it is illustrated that classical statistical methods may fail to identify outliers due to their heavy influence, prompting the need for robust statistics. Using robust sparse logistic regression, we obtain 36 relevant genes, of which ca. 60% have been previously reported as biologically relevant to triple-negative breast cancer, reinforcing the validity of the method. The remaining 14 genes identified are new potential biomarkers for triple-negative breast cancer. Out of these, JAM3, SFT2D2, and PAPSS1 were previously associated with breast tumors or other types of cancer. The relevance of these genes is confirmed by the new DetectDeviatingCells outlier detection technique. A comparison of gene networks on the selected genes showed significant differences between triple-negative breast cancer and non-triple-negative breast cancer data. The individual role of FOXA1 in triple-negative breast cancer and non-triple-negative breast cancer, and the strong FOXA1-AGR2 connection in triple-negative breast cancer, stand out. The goal of our paper is to contribute to the understanding and management of breast cancer and triple-negative breast cancer. At the same time, it demonstrates that robust regression and outlier detection constitute key strategies to cope with high-dimensional clinical data such as omics data.
format Online
Article
Text
id pubmed-6745616
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-6745616 2019-10-03 Robust identification of target genes and outliers in triple-negative breast cancer data Segaert, Pieter; Lopes, Marta B; Casimiro, Sandra; Vinga, Susana; Rousseeuw, Peter J Stat Methods Med Res Articles SAGE Publications 2018-08-27 2019-11 /pmc/articles/PMC6745616/ /pubmed/30146936 http://dx.doi.org/10.1177/0962280218794722 Text en © The Author(s) 2018 http://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Robust identification of target genes and outliers in triple-negative breast cancer data
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6745616/
https://www.ncbi.nlm.nih.gov/pubmed/30146936
http://dx.doi.org/10.1177/0962280218794722