Cleaning the Medicago Microarray Database to Improve Gene Function Analysis
Transcriptomics studies have been facilitated by the development of microarray and RNA-Seq technologies, with thousands of expression datasets available for many species. However, the quality of data can be highly variable, making the combined analysis of different datasets difficult and unreliable....
Main authors: | Marzorati, Francesca; Wang, Chu; Pavesi, Giulio; Mizzi, Luca; Morandini, Piero |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8234645/ https://www.ncbi.nlm.nih.gov/pubmed/34207216 http://dx.doi.org/10.3390/plants10061240 |
_version_ | 1783714132196851712 |
---|---|
author | Marzorati, Francesca Wang, Chu Pavesi, Giulio Mizzi, Luca Morandini, Piero |
author_facet | Marzorati, Francesca Wang, Chu Pavesi, Giulio Mizzi, Luca Morandini, Piero |
author_sort | Marzorati, Francesca |
collection | PubMed |
description | Transcriptomics studies have been facilitated by the development of microarray and RNA-Seq technologies, with thousands of expression datasets available for many species. However, the quality of data can be highly variable, making the combined analysis of different datasets difficult and unreliable. Most of the microarray data for Medicago truncatula, the barrel medic, have been stored and made publicly accessible on the web database Medicago truncatula Gene Expression atlas (MtGEA). The aim of this work is to ameliorate the quality of the MtGEA database through a general method based on logical and statistical relationships among parameters and conditions. The initial 716 columns available in the dataset were reduced to 607 by evaluating the quality of data through the sum of the expression levels over the entire transcriptome probes and Pearson correlation among hybridizations. The reduced dataset shows great improvements in the consistency of the data, with a reduction in both false positives and false negatives resulting from Pearson correlation and GO enrichment analysis among genes. The approach we used is of general validity and our intent is to extend the analysis to other plant microarray databases. |
format | Online Article Text |
id | pubmed-8234645 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-82346452021-06-27 Cleaning the Medicago Microarray Database to Improve Gene Function Analysis Marzorati, Francesca Wang, Chu Pavesi, Giulio Mizzi, Luca Morandini, Piero Plants (Basel) Article Transcriptomics studies have been facilitated by the development of microarray and RNA-Seq technologies, with thousands of expression datasets available for many species. However, the quality of data can be highly variable, making the combined analysis of different datasets difficult and unreliable. Most of the microarray data for Medicago truncatula, the barrel medic, have been stored and made publicly accessible on the web database Medicago truncatula Gene Expression atlas (MtGEA). The aim of this work is to ameliorate the quality of the MtGEA database through a general method based on logical and statistical relationships among parameters and conditions. The initial 716 columns available in the dataset were reduced to 607 by evaluating the quality of data through the sum of the expression levels over the entire transcriptome probes and Pearson correlation among hybridizations. The reduced dataset shows great improvements in the consistency of the data, with a reduction in both false positives and false negatives resulting from Pearson correlation and GO enrichment analysis among genes. The approach we used is of general validity and our intent is to extend the analysis to other plant microarray databases. MDPI 2021-06-18 /pmc/articles/PMC8234645/ /pubmed/34207216 http://dx.doi.org/10.3390/plants10061240 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Marzorati, Francesca Wang, Chu Pavesi, Giulio Mizzi, Luca Morandini, Piero Cleaning the Medicago Microarray Database to Improve Gene Function Analysis |
title | Cleaning the Medicago Microarray Database to Improve Gene Function Analysis |
title_full | Cleaning the Medicago Microarray Database to Improve Gene Function Analysis |
title_fullStr | Cleaning the Medicago Microarray Database to Improve Gene Function Analysis |
title_full_unstemmed | Cleaning the Medicago Microarray Database to Improve Gene Function Analysis |
title_short | Cleaning the Medicago Microarray Database to Improve Gene Function Analysis |
title_sort | cleaning the medicago microarray database to improve gene function analysis |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8234645/ https://www.ncbi.nlm.nih.gov/pubmed/34207216 http://dx.doi.org/10.3390/plants10061240 |
work_keys_str_mv | AT marzoratifrancesca cleaningthemedicagomicroarraydatabasetoimprovegenefunctionanalysis AT wangchu cleaningthemedicagomicroarraydatabasetoimprovegenefunctionanalysis AT pavesigiulio cleaningthemedicagomicroarraydatabasetoimprovegenefunctionanalysis AT mizziluca cleaningthemedicagomicroarraydatabasetoimprovegenefunctionanalysis AT morandinipiero cleaningthemedicagomicroarraydatabasetoimprovegenefunctionanalysis |