
Breast cancer prognosis by combinatorial analysis of gene expression data

INTRODUCTION: The potential of applying data analysis tools to microarray data for diagnosis and prognosis is illustrated on the recent breast cancer dataset of van 't Veer and coworkers. We re-examine that dataset using the novel technique of logical analysis of data (LAD), with the double obj...


Bibliographic Details
Main Authors: Alexe, Gabriela; Alexe, Sorin; Axelrod, David E; Bonates, Tibérius O; Lozina, Irina I; Reiss, Michael; Hammer, Peter L
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1779471/
https://www.ncbi.nlm.nih.gov/pubmed/16859500
http://dx.doi.org/10.1186/bcr1512
_version_ 1782131780437934080
author Alexe, Gabriela
Alexe, Sorin
Axelrod, David E
Bonates, Tibérius O
Lozina, Irina I
Reiss, Michael
Hammer, Peter L
author_facet Alexe, Gabriela
Alexe, Sorin
Axelrod, David E
Bonates, Tibérius O
Lozina, Irina I
Reiss, Michael
Hammer, Peter L
author_sort Alexe, Gabriela
collection PubMed
description INTRODUCTION: The potential of applying data analysis tools to microarray data for diagnosis and prognosis is illustrated on the recent breast cancer dataset of van 't Veer and coworkers. We re-examine that dataset using the novel technique of logical analysis of data (LAD), with the double objective of discovering patterns characteristic for cases with good or poor outcome, using them for accurate and justifiable predictions; and deriving novel information about the role of genes, the existence of special classes of cases, and other factors. METHOD: Data were analyzed using the combinatorics and optimization-based method of LAD, recently shown to provide highly accurate diagnostic and prognostic systems in cardiology, cancer proteomics, hematology, pulmonology, and other disciplines. RESULTS: LAD identified a subset of 17 of the 25,000 genes, capable of fully distinguishing between patients with poor, respectively good prognoses. An extensive list of 'patterns' or 'combinatorial biomarkers' (that is, combinations of genes and limitations on their expression levels) was generated, and 40 patterns were used to create a prognostic system, shown to have 100% and 92.9% weighted accuracy on the training and test sets, respectively. The prognostic system uses fewer genes than other methods, and has similar or better accuracy than those reported in other studies. Out of the 17 genes identified by LAD, three (respectively, five) were shown to play a significant role in determining poor (respectively, good) prognosis. Two new classes of patients (described by similar sets of covering patterns, gene expression ranges, and clinical features) were discovered. As a by-product of the study, it is shown that the training and the test sets of van 't Veer have differing characteristics. 
CONCLUSION: The study shows that LAD provides an accurate and fully explanatory prognostic system for breast cancer using genomic data (that is, a system that, in addition to predicting good or poor prognosis, provides an individualized explanation of the reasons for that prognosis for each patient). Moreover, the LAD model provides valuable insights into the roles of individual and combinatorial biomarkers, allows the discovery of new classes of patients, and generates a vast library of biomedical research hypotheses.
format Text
id pubmed-1779471
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-17794712007-01-19 Breast cancer prognosis by combinatorial analysis of gene expression data Alexe, Gabriela Alexe, Sorin Axelrod, David E Bonates, Tibérius O Lozina, Irina I Reiss, Michael Hammer, Peter L Breast Cancer Res Research Article INTRODUCTION: The potential of applying data analysis tools to microarray data for diagnosis and prognosis is illustrated on the recent breast cancer dataset of van 't Veer and coworkers. We re-examine that dataset using the novel technique of logical analysis of data (LAD), with the double objective of discovering patterns characteristic for cases with good or poor outcome, using them for accurate and justifiable predictions; and deriving novel information about the role of genes, the existence of special classes of cases, and other factors. METHOD: Data were analyzed using the combinatorics and optimization-based method of LAD, recently shown to provide highly accurate diagnostic and prognostic systems in cardiology, cancer proteomics, hematology, pulmonology, and other disciplines. RESULTS: LAD identified a subset of 17 of the 25,000 genes, capable of fully distinguishing between patients with poor, respectively good prognoses. An extensive list of 'patterns' or 'combinatorial biomarkers' (that is, combinations of genes and limitations on their expression levels) was generated, and 40 patterns were used to create a prognostic system, shown to have 100% and 92.9% weighted accuracy on the training and test sets, respectively. The prognostic system uses fewer genes than other methods, and has similar or better accuracy than those reported in other studies. Out of the 17 genes identified by LAD, three (respectively, five) were shown to play a significant role in determining poor (respectively, good) prognosis. Two new classes of patients (described by similar sets of covering patterns, gene expression ranges, and clinical features) were discovered. 
As a by-product of the study, it is shown that the training and the test sets of van 't Veer have differing characteristics. CONCLUSION: The study shows that LAD provides an accurate and fully explanatory prognostic system for breast cancer using genomic data (that is, a system that, in addition to predicting good or poor prognosis, provides an individualized explanation of the reasons for that prognosis for each patient). Moreover, the LAD model provides valuable insights into the roles of individual and combinatorial biomarkers, allows the discovery of new classes of patients, and generates a vast library of biomedical research hypotheses. BioMed Central 2006 2006-07-19 /pmc/articles/PMC1779471/ /pubmed/16859500 http://dx.doi.org/10.1186/bcr1512 Text en Copyright © 2006 Alexe et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Alexe, Gabriela
Alexe, Sorin
Axelrod, David E
Bonates, Tibérius O
Lozina, Irina I
Reiss, Michael
Hammer, Peter L
Breast cancer prognosis by combinatorial analysis of gene expression data
title Breast cancer prognosis by combinatorial analysis of gene expression data
title_full Breast cancer prognosis by combinatorial analysis of gene expression data
title_fullStr Breast cancer prognosis by combinatorial analysis of gene expression data
title_full_unstemmed Breast cancer prognosis by combinatorial analysis of gene expression data
title_short Breast cancer prognosis by combinatorial analysis of gene expression data
title_sort breast cancer prognosis by combinatorial analysis of gene expression data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1779471/
https://www.ncbi.nlm.nih.gov/pubmed/16859500
http://dx.doi.org/10.1186/bcr1512
work_keys_str_mv AT alexegabriela breastcancerprognosisbycombinatorialanalysisofgeneexpressiondata
AT alexesorin breastcancerprognosisbycombinatorialanalysisofgeneexpressiondata
AT axelroddavide breastcancerprognosisbycombinatorialanalysisofgeneexpressiondata
AT bonatestiberiuso breastcancerprognosisbycombinatorialanalysisofgeneexpressiondata
AT lozinairinai breastcancerprognosisbycombinatorialanalysisofgeneexpressiondata
AT reissmichael breastcancerprognosisbycombinatorialanalysisofgeneexpressiondata
AT hammerpeterl breastcancerprognosisbycombinatorialanalysisofgeneexpressiondata