
Cleaning the Medicago Microarray Database to Improve Gene Function Analysis

Transcriptomics studies have been facilitated by the development of microarray and RNA-Seq technologies, with thousands of expression datasets available for many species. However, the quality of data can be highly variable, making the combined analysis of different datasets difficult and unreliable....


Bibliographic Details
Main authors: Marzorati, Francesca; Wang, Chu; Pavesi, Giulio; Mizzi, Luca; Morandini, Piero
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8234645/
https://www.ncbi.nlm.nih.gov/pubmed/34207216
http://dx.doi.org/10.3390/plants10061240
author Marzorati, Francesca
Wang, Chu
Pavesi, Giulio
Mizzi, Luca
Morandini, Piero
collection PubMed
description Transcriptomics studies have been facilitated by the development of microarray and RNA-Seq technologies, with thousands of expression datasets available for many species. However, the quality of the data can be highly variable, making the combined analysis of different datasets difficult and unreliable. Most of the microarray data for Medicago truncatula, the barrel medic, have been stored and made publicly accessible in the web database Medicago truncatula Gene Expression Atlas (MtGEA). The aim of this work is to improve the quality of the MtGEA database through a general method based on logical and statistical relationships among parameters and conditions. The initial 716 columns available in the dataset were reduced to 607 by evaluating the quality of the data through the sum of the expression levels over all transcriptome probes and the Pearson correlation among hybridizations. The reduced dataset shows great improvements in the consistency of the data, with a reduction in both false positives and false negatives in Pearson correlation and GO enrichment analyses among genes. The approach we used is of general validity, and our intent is to extend the analysis to other plant microarray databases.
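As a rough illustration of the two quality criteria named in the description (the sum of expression levels per hybridization and the Pearson correlation among hybridizations), the following Python sketch flags columns of an expression matrix for removal. The function name `filter_hybridizations`, the thresholds `sum_tol` and `min_corr`, and the rule of comparing each column with its best-correlated other column are assumptions made for this example; the record does not state the criteria the authors actually applied.

```python
import numpy as np


def filter_hybridizations(expr, sum_tol=0.2, min_corr=0.8):
    """Return a boolean mask of the columns (hybridizations) to keep.

    expr     : 2-D array, rows = probes, columns = hybridizations.
    sum_tol  : hypothetical tolerance; a column is kept only if its total
               signal is within this fraction of the median total.
    min_corr : hypothetical threshold; a column is kept only if its Pearson
               correlation with at least one other column reaches it.
    """
    expr = np.asarray(expr, dtype=float)

    # Criterion 1: total expression over all probes, per column.
    totals = expr.sum(axis=0)
    median_total = np.median(totals)
    sum_ok = np.abs(totals - median_total) <= sum_tol * median_total

    # Criterion 2: Pearson correlation between columns.
    corr = np.corrcoef(expr, rowvar=False)
    np.fill_diagonal(corr, -np.inf)  # ignore each column's self-correlation
    best_corr = corr.max(axis=1)
    corr_ok = best_corr >= min_corr

    return sum_ok & corr_ok
```

Calling `expr[:, filter_hybridizations(expr)]` would then give a reduced matrix. In practice the correlation check would more plausibly be restricted to declared replicates of the same condition, which this sketch does not model.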
format Online
Article
Text
id pubmed-8234645
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8234645 2021-06-27 Cleaning the Medicago Microarray Database to Improve Gene Function Analysis Marzorati, Francesca; Wang, Chu; Pavesi, Giulio; Mizzi, Luca; Morandini, Piero Plants (Basel) Article MDPI 2021-06-18 /pmc/articles/PMC8234645/ /pubmed/34207216 http://dx.doi.org/10.3390/plants10061240 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Cleaning the Medicago Microarray Database to Improve Gene Function Analysis
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8234645/
https://www.ncbi.nlm.nih.gov/pubmed/34207216
http://dx.doi.org/10.3390/plants10061240