Characteristics and Validation Techniques for PCA-Based Gene-Expression Signatures

Background. Many gene-expression signatures exist for describing the biological state of profiled tumors. Principal Component Analysis (PCA) can be used to summarize a gene signature into a single score. Our hypothesis is that gene signatures can be validated when applied to new datasets, using inhe...

Bibliographic Details
Main authors: Berglund, Anders E., Welsh, Eric A., Eschrich, Steven A.
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5317117/
https://www.ncbi.nlm.nih.gov/pubmed/28265563
http://dx.doi.org/10.1155/2017/2354564
author Berglund, Anders E.
Welsh, Eric A.
Eschrich, Steven A.
collection PubMed
description Background. Many gene-expression signatures exist for describing the biological state of profiled tumors. Principal Component Analysis (PCA) can be used to summarize a gene signature into a single score. Our hypothesis is that gene signatures can be validated when applied to new datasets, using inherent properties of PCA. Results. This validation is based on four key concepts. Coherence: elements of a gene signature should be correlated beyond chance. Uniqueness: the general direction of the data being examined can drive most of the observed signal. Robustness: if a gene signature is designed to measure a single biological effect, then this signal should be sufficiently strong and distinct compared to other signals within the signature. Transferability: the derived PCA gene signature score should describe the same biology in the target dataset as it does in the training dataset. Conclusions. The proposed validation procedure ensures that PCA-based gene signatures perform as expected when applied to datasets other than those that the signatures were trained upon. Complex signatures, describing multiple independent biological components, are also easily identified.
format Online
Article
Text
id pubmed-5317117
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-5317117 2017-03-06 Characteristics and Validation Techniques for PCA-Based Gene-Expression Signatures Berglund, Anders E. Welsh, Eric A. Eschrich, Steven A. Int J Genomics Research Article Hindawi Publishing Corporation 2017 2017-02-06 /pmc/articles/PMC5317117/ /pubmed/28265563 http://dx.doi.org/10.1155/2017/2354564 Text en Copyright © 2017 Anders E. Berglund et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Characteristics and Validation Techniques for PCA-Based Gene-Expression Signatures
topic Research Article
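
Illustrative sketch (not part of the catalog record, and not the authors' published procedure): the description above says that PCA can summarize a gene signature into a single score and that, for coherence, the signature's genes "should be correlated beyond chance". One possible reading of those two ideas is shown below with NumPy. The function names `pca_signature_score` and `coherence_p_value`, the use of the first principal component's explained-variance fraction as the coherence statistic, the random-gene-set null, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def pca_signature_score(expr):
    """Summarize a samples x genes matrix into one score per sample (PC1).

    Returns the PC1 scores and the fraction of variance PC1 explains.
    Hypothetical helper; the sign of the scores is arbitrary.
    """
    centered = expr - expr.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[0]


def coherence_p_value(data, signature_idx, n_random=200, seed=0):
    """Compare PC1 explained variance of the signature to random gene sets.

    Assumed null model: gene sets of the same size drawn at random from
    all columns of `data`. Returns the observed statistic and an
    empirical p-value (with the usual +1 correction).
    """
    rng = np.random.default_rng(seed)
    _, observed = pca_signature_score(data[:, signature_idx])
    k = len(signature_idx)
    null = np.empty(n_random)
    for i in range(n_random):
        idx = rng.choice(data.shape[1], size=k, replace=False)
        _, null[i] = pca_signature_score(data[:, idx])
    p_value = (1 + np.sum(null >= observed)) / (n_random + 1)
    return observed, p_value


# Synthetic example: 60 samples, 200 genes; the first 10 genes share
# one latent signal and stand in for a coherent signature.
rng = np.random.default_rng(42)
n_samples, n_genes = 60, 200
data = rng.normal(size=(n_samples, n_genes))
latent = rng.normal(size=n_samples)
signature_idx = np.arange(10)
data[:, signature_idx] += 2.0 * latent[:, None]

scores, _ = pca_signature_score(data[:, signature_idx])
observed, p_value = coherence_p_value(data, signature_idx)
print(f"PC1 explained variance: {observed:.2f}, empirical p-value: {p_value:.3f}")
```

In this toy setup the signature score should track the planted latent signal closely (up to sign), and random gene sets should rarely explain as much variance on their first component. The uniqueness, robustness, and transferability checks named in the description would need further comparisons that this sketch does not attempt.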