Improved gene co-expression network quality through expression dataset down-sampling and network aggregation

Bibliographic Details
Main Authors: Liesecke, Franziska, De Craene, Johan-Owen, Besseau, Sébastien, Courdavault, Vincent, Clastre, Marc, Vergès, Valentin, Papon, Nicolas, Giglioli-Guivarc’h, Nathalie, Glévarec, Gaëlle, Pichon, Olivier, Dugé de Bernonville, Thomas
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6783424/
https://www.ncbi.nlm.nih.gov/pubmed/31594989
http://dx.doi.org/10.1038/s41598-019-50885-8
author Liesecke, Franziska
De Craene, Johan-Owen
Besseau, Sébastien
Courdavault, Vincent
Clastre, Marc
Vergès, Valentin
Papon, Nicolas
Giglioli-Guivarc’h, Nathalie
Glévarec, Gaëlle
Pichon, Olivier
Dugé de Bernonville, Thomas
collection PubMed
description Large-scale gene co-expression networks are an effective methodology to analyze sets of co-expressed genes and discover new gene functions or associations. Distances between genes are estimated according to their expression profiles and are visualized in networks that may be further partitioned to reveal communities of co-expressed genes. Creating expression profiles is now eased by the large amounts of publicly available expression data (microarrays and RNA-seq). Although many distance calculation methods have been intensively compared and reviewed in the past, it is unclear how to proceed when many samples reflecting a wide range of different conditions are available. Should as many samples as possible be integrated into network construction or be partitioned into smaller sets of more related samples? Previous studies have indicated a saturation in network performances to capture known associations once a certain number of samples is included in distance calculations. Here, we examined the influence of sample size on co-expression network construction using microarray and RNA-seq expression data from three plant species. We tested different down-sampling methods and compared network performances in recovering known gene associations to networks obtained from full datasets. We further examined how aggregating networks may help increase this performance by testing six aggregation methods.
format Online
Article
Text
id pubmed-6783424
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6783424 2019-10-16 Sci Rep Article
Nature Publishing Group UK 2019-10-08 /pmc/articles/PMC6783424/ /pubmed/31594989 http://dx.doi.org/10.1038/s41598-019-50885-8 Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Improved gene co-expression network quality through expression dataset down-sampling and network aggregation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6783424/
https://www.ncbi.nlm.nih.gov/pubmed/31594989
http://dx.doi.org/10.1038/s41598-019-50885-8