Discovering biomarkers from gene expression data for predicting cancer subgroups using neural networks and relational fuzzy clustering
BACKGROUND: The four heterogeneous childhood cancers, neuroblastoma, non-Hodgkin lymphoma, rhabdomyosarcoma, and Ewing sarcoma, present a similar histology of small round blue cell tumor (SRBCT) and thus often lead to misdiagnosis. Identification of biomarkers for distinguishing these cancers is a w...
Main authors: | Pal, Nikhil R; Aguan, Kripamoy; Sharma, Animesh; Amari, Shun-ichi |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1770936/ https://www.ncbi.nlm.nih.gov/pubmed/17207284 http://dx.doi.org/10.1186/1471-2105-8-5 |
_version_ | 1782131718007816192 |
---|---|
author | Pal, Nikhil R Aguan, Kripamoy Sharma, Animesh Amari, Shun-ichi |
author_facet | Pal, Nikhil R Aguan, Kripamoy Sharma, Animesh Amari, Shun-ichi |
author_sort | Pal, Nikhil R |
collection | PubMed |
description | BACKGROUND: The four heterogeneous childhood cancers, neuroblastoma, non-Hodgkin lymphoma, rhabdomyosarcoma, and Ewing sarcoma, present a similar histology of small round blue cell tumor (SRBCT) and thus often lead to misdiagnosis. Identification of biomarkers for distinguishing these cancers is a well-studied problem. Existing methods typically evaluate each gene separately and do not take into account the nonlinear interaction between genes and the tools that are used to design the diagnostic prediction system. Consequently, more genes are usually identified as necessary for prediction. We propose a general scheme for finding a small set of biomarkers to design a diagnostic system for accurate classification of the cancer subgroups. We use multilayer networks with online gene selection ability and relational fuzzy clustering to identify a small set of biomarkers for accurate classification of the training and blind test cases of a well-studied data set. RESULTS: Our method discerned just seven biomarkers that precisely categorized the four subgroups of cancer in both training and blind samples. For the same problem, others suggested 19–94 genes. These seven biomarkers include three novel genes (NAB2, LSP1 and EHD1, not identified by others) with distinct class-specific signatures and important roles in cancer biology, including cellular proliferation, transendothelial migration and trafficking of MHC class antigens. Interestingly, NAB2 is downregulated in other tumors, including non-Hodgkin lymphoma and neuroblastoma, but we observed moderate to high upregulation in a few cases of Ewing sarcoma and rhabdomyosarcoma, suggesting that NAB2 might be mutated in these tumors. These genes can discover the subgroups correctly with unsupervised learning, can differentiate non-SRBCT samples, and perform equally well with other machine learning tools, including support vector machines. These biomarkers lead to four simple human-interpretable rules for the diagnostic task. CONCLUSION: Although the proposed method is tested on an SRBCT data set, it is quite general and can be applied to other cancer data sets. Our scheme takes into account the interaction between genes as well as that between genes and the tool, and thus is able to find a very small set and can discover novel genes. Our findings suggest the possibility of developing specialized microarray chips, or using real-time qPCR assays or antibody-based methods such as ELISA and western blot analysis, for an easy and low-cost diagnosis of the subgroups. |
format | Text |
id | pubmed-1770936 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1770936 2007-01-22 Discovering biomarkers from gene expression data for predicting cancer subgroups using neural networks and relational fuzzy clustering Pal, Nikhil R Aguan, Kripamoy Sharma, Animesh Amari, Shun-ichi BMC Bioinformatics Research Article BioMed Central 2007-01-06 /pmc/articles/PMC1770936/ /pubmed/17207284 http://dx.doi.org/10.1186/1471-2105-8-5 Text en Copyright © 2007 Pal et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Discovering biomarkers from gene expression data for predicting cancer subgroups using neural networks and relational fuzzy clustering |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1770936/ https://www.ncbi.nlm.nih.gov/pubmed/17207284 http://dx.doi.org/10.1186/1471-2105-8-5 |