Differential gene expression in disease: a comparison between high-throughput studies and the literature
BACKGROUND: Differential gene expression is important for understanding the biological differences between healthy and diseased states. Two common sources of differential gene expression data are microarray studies and the biomedical literature. METHODS: With the aid of text mining and gene expression a...
Main Authors: | Rodriguez-Esteban, Raul; Jiang, Xiaoyu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
BioMed Central
2017
|
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5637346/ https://www.ncbi.nlm.nih.gov/pubmed/29020950 http://dx.doi.org/10.1186/s12920-017-0293-y |
_version_ | 1783270607516860416 |
---|---|
author | Rodriguez-Esteban, Raul Jiang, Xiaoyu |
author_facet | Rodriguez-Esteban, Raul Jiang, Xiaoyu |
author_sort | Rodriguez-Esteban, Raul |
collection | PubMed |
description | BACKGROUND: Differential gene expression is important for understanding the biological differences between healthy and diseased states. Two common sources of differential gene expression data are microarray studies and the biomedical literature. METHODS: With the aid of text mining and gene expression analysis, we have examined the comparative properties of these two sources of differential gene expression data. RESULTS: The literature shows a preference for reporting genes associated with higher fold changes in microarray data, rather than genes that are simply significantly differentially expressed. Thus, the resemblance between the literature and microarray data increases when the fold-change threshold for microarray data is increased. Moreover, the literature has a reporting preference for differentially expressed genes that (1) are overexpressed rather than underexpressed; (2) are overexpressed in multiple diseases; and (3) are popular in the biomedical literature at large. Additionally, the degree to which diseases are similar depends on whether microarray data or the literature is used to compare them. Finally, vaguely qualified reports of differential expression magnitudes in the literature have only a small correlation with microarray fold-change data. CONCLUSIONS: Reporting biases of differential gene expression in the literature may be affecting our appreciation of disease biology and of the degree of similarity that actually exists between different diseases. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12920-017-0293-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5637346 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5637346 2017-10-18 Differential gene expression in disease: a comparison between high-throughput studies and the literature Rodriguez-Esteban, Raul Jiang, Xiaoyu BMC Med Genomics Research Article
BioMed Central 2017-10-11 /pmc/articles/PMC5637346/ /pubmed/29020950 http://dx.doi.org/10.1186/s12920-017-0293-y Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Rodriguez-Esteban, Raul Jiang, Xiaoyu Differential gene expression in disease: a comparison between high-throughput studies and the literature |
title | Differential gene expression in disease: a comparison between high-throughput studies and the literature |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5637346/ https://www.ncbi.nlm.nih.gov/pubmed/29020950 http://dx.doi.org/10.1186/s12920-017-0293-y |