Improved gene co-expression network quality through expression dataset down-sampling and network aggregation
Large-scale gene co-expression networks are an effective methodology to analyze sets of co-expressed genes and discover new gene functions or associations. Distances between genes are estimated according to their expression profiles and are visualized in networks that may be further partitioned to r...
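Main authors: | Liesecke, Franziska; De Craene, Johan-Owen; Besseau, Sébastien; Courdavault, Vincent; Clastre, Marc; Vergès, Valentin; Papon, Nicolas; Giglioli-Guivarc’h, Nathalie; Glévarec, Gaëlle; Pichon, Olivier; Dugé de Bernonville, Thomas |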
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6783424/ https://www.ncbi.nlm.nih.gov/pubmed/31594989 http://dx.doi.org/10.1038/s41598-019-50885-8 |
_version_ | 1783457548140019712 |
---|---|
author | Liesecke, Franziska; De Craene, Johan-Owen; Besseau, Sébastien; Courdavault, Vincent; Clastre, Marc; Vergès, Valentin; Papon, Nicolas; Giglioli-Guivarc’h, Nathalie; Glévarec, Gaëlle; Pichon, Olivier; Dugé de Bernonville, Thomas |
author_facet | Liesecke, Franziska; De Craene, Johan-Owen; Besseau, Sébastien; Courdavault, Vincent; Clastre, Marc; Vergès, Valentin; Papon, Nicolas; Giglioli-Guivarc’h, Nathalie; Glévarec, Gaëlle; Pichon, Olivier; Dugé de Bernonville, Thomas |
author_sort | Liesecke, Franziska |
collection | PubMed |
description | Large-scale gene co-expression networks are an effective methodology to analyze sets of co-expressed genes and discover new gene functions or associations. Distances between genes are estimated according to their expression profiles and are visualized in networks that may be further partitioned to reveal communities of co-expressed genes. Creating expression profiles is now eased by the large amounts of publicly available expression data (microarrays and RNA-seq). Although many distance calculation methods have been intensively compared and reviewed in the past, it is unclear how to proceed when many samples reflecting a wide range of different conditions are available. Should as many samples as possible be integrated into network construction or be partitioned into smaller sets of more related samples? Previous studies have indicated a saturation in network performances to capture known associations once a certain number of samples is included in distance calculations. Here, we examined the influence of sample size on co-expression network construction using microarray and RNA-seq expression data from three plant species. We tested different down-sampling methods and compared network performances in recovering known gene associations to networks obtained from full datasets. We further examined how aggregating networks may help increase this performance by testing six aggregation methods. |
format | Online Article Text |
id | pubmed-6783424 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-67834242019-10-16 Improved gene co-expression network quality through expression dataset down-sampling and network aggregation Liesecke, Franziska De Craene, Johan-Owen Besseau, Sébastien Courdavault, Vincent Clastre, Marc Vergès, Valentin Papon, Nicolas Giglioli-Guivarc’h, Nathalie Glévarec, Gaëlle Pichon, Olivier Dugé de Bernonville, Thomas Sci Rep Article Large-scale gene co-expression networks are an effective methodology to analyze sets of co-expressed genes and discover new gene functions or associations. Distances between genes are estimated according to their expression profiles and are visualized in networks that may be further partitioned to reveal communities of co-expressed genes. Creating expression profiles is now eased by the large amounts of publicly available expression data (microarrays and RNA-seq). Although many distance calculation methods have been intensively compared and reviewed in the past, it is unclear how to proceed when many samples reflecting a wide range of different conditions are available. Should as many samples as possible be integrated into network construction or be partitioned into smaller sets of more related samples? Previous studies have indicated a saturation in network performances to capture known associations once a certain number of samples is included in distance calculations. Here, we examined the influence of sample size on co-expression network construction using microarray and RNA-seq expression data from three plant species. We tested different down-sampling methods and compared network performances in recovering known gene associations to networks obtained from full datasets. We further examined how aggregating networks may help increase this performance by testing six aggregation methods. 
Nature Publishing Group UK 2019-10-08 /pmc/articles/PMC6783424/ /pubmed/31594989 http://dx.doi.org/10.1038/s41598-019-50885-8 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Liesecke, Franziska De Craene, Johan-Owen Besseau, Sébastien Courdavault, Vincent Clastre, Marc Vergès, Valentin Papon, Nicolas Giglioli-Guivarc’h, Nathalie Glévarec, Gaëlle Pichon, Olivier Dugé de Bernonville, Thomas Improved gene co-expression network quality through expression dataset down-sampling and network aggregation |
title | Improved gene co-expression network quality through expression dataset down-sampling and network aggregation |
title_full | Improved gene co-expression network quality through expression dataset down-sampling and network aggregation |
title_fullStr | Improved gene co-expression network quality through expression dataset down-sampling and network aggregation |
title_full_unstemmed | Improved gene co-expression network quality through expression dataset down-sampling and network aggregation |
title_short | Improved gene co-expression network quality through expression dataset down-sampling and network aggregation |
title_sort | improved gene co-expression network quality through expression dataset down-sampling and network aggregation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6783424/ https://www.ncbi.nlm.nih.gov/pubmed/31594989 http://dx.doi.org/10.1038/s41598-019-50885-8 |
work_keys_str_mv | AT lieseckefranziska improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT decraenejohanowen improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT besseausebastien improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT courdavaultvincent improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT clastremarc improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT vergesvalentin improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT paponnicolas improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT giglioliguivarchnathalie improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT glevarecgaelle improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT pichonolivier improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation AT dugedebernonvillethomas improvedgenecoexpressionnetworkqualitythroughexpressiondatasetdownsamplingandnetworkaggregation |
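
The abstract in this record describes building co-expression networks from down-sampled sets of expression samples and then aggregating those networks. The record does not say which down-sampling or aggregation methods the authors used, so the following is only a minimal sketch under stated assumptions: random sample subsets, Pearson correlation as the similarity measure, and element-wise mean or maximum as the aggregation. All function names and the synthetic data are hypothetical and are not taken from the article.

```python
import numpy as np


def pearson_network(expr):
    """Gene-by-gene Pearson correlation matrix; expr is genes x samples."""
    return np.corrcoef(expr)


def downsample_networks(expr, n_subsets, subset_size, rng):
    """Build one correlation network per random subset of samples (columns).

    Random subsetting is an assumption made for illustration only.
    """
    n_samples = expr.shape[1]
    networks = []
    for _ in range(n_subsets):
        cols = rng.choice(n_samples, size=subset_size, replace=False)
        networks.append(pearson_network(expr[:, cols]))
    return np.stack(networks)  # shape: (n_subsets, n_genes, n_genes)


def aggregate(networks, how="mean"):
    """Combine several networks edge by edge (illustrative choices only)."""
    if how == "mean":
        return networks.mean(axis=0)
    if how == "max":
        return networks.max(axis=0)
    raise ValueError(f"unknown aggregation method: {how}")


# Synthetic example: 5 genes, 40 samples, gene 1 co-expressed with gene 0.
rng = np.random.default_rng(0)
expr = rng.normal(size=(5, 40))
expr[1] = expr[0] + 0.1 * rng.normal(size=40)

nets = downsample_networks(expr, n_subsets=10, subset_size=15, rng=rng)
agg = aggregate(nets, how="mean")
```

Here `agg[i, j]` is the aggregated co-expression score between genes `i` and `j`. Comparing such a matrix against one computed from all samples at once is the kind of comparison the abstract describes, although the evaluation against known gene associations is not sketched here.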