OMICfpp: a fuzzy approach for paired RNA-Seq counts

BACKGROUND: RNA sequencing is a widely used technology for differential expression analysis. However, RNA-Seq does not provide accurate absolute measurements, and the results can differ with each pipeline used. The major problem in the statistical analysis of RNA-Seq data, and of omics data in gener...

Bibliographic Details
Main authors: Berral-Gonzalez, Alberto; Riffo-Campos, Angela L.; Ayala, Guillermo
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444640/
https://www.ncbi.nlm.nih.gov/pubmed/30940089
http://dx.doi.org/10.1186/s12864-019-5496-5
author Berral-Gonzalez, Alberto
Riffo-Campos, Angela L.
Ayala, Guillermo
collection PubMed
description BACKGROUND: RNA sequencing is a widely used technology for differential expression analysis. However, RNA-Seq does not provide accurate absolute measurements, and the results can differ with each pipeline used. The major problem in the statistical analysis of RNA-Seq data, and of omics data in general, is the small sample size relative to the large number of variables. In addition, experimental design must be taken into account, and few tools consider it. RESULTS: We propose OMICfpp, a method for the statistical analysis of RNA-Seq paired design data. First, we obtain a p-value for each case-control pair using a binomial test. These p-values are aggregated using an ordered weighted average (OWA) with a previously chosen orness. The aggregated p-value from the original data is compared with the aggregated p-value obtained using the same method applied to random pairs. These new pairs are generated using between-pairs and complete randomization distributions. This randomization p-value is used as a raw p-value to test the differential expression of each gene. The OMICfpp method is evaluated using public data sets of 68 sample pairs from patients with colorectal cancer. We validate our results through a bibliographic search of the reported genes and with a simulated data set. Furthermore, we compare our results with those obtained by the methods edgeR and DESeq2 for paired samples. Finally, we propose new target genes to be validated as gene expression signatures in colorectal cancer. OMICfpp is available at http://www.uv.es/ayala/software/OMICfpp_0.2.tar.gz. CONCLUSIONS: Our study shows that OMICfpp is an accurate method for differential expression analysis in RNA-Seq data with paired design. In addition, we propose the use of the randomized p-value pattern graphic as a powerful and robust method to select the target genes for experimental validation.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5496-5) contains supplementary material, which is available to authorized users.
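The three steps named in the description (a binomial test per pair, OWA aggregation, and a randomization p-value) can be illustrated with a minimal Python sketch. This is not the OMICfpp implementation. It rests on assumptions that the record does not confirm: a one-sided test with success probability 0.5 and no library-size adjustment, OWA weights supplied directly by the caller, and only one plausible reading of "complete randomization" (pool all counts and re-pair them at random). All function names are hypothetical.

```python
import math
import random


def binom_upper_p(k, n, p=0.5):
    """One-sided exact binomial p-value P(X >= k) for X ~ Binomial(n, p).

    The pmf is evaluated in log space so that large counts do not overflow.
    """
    log_p, log_q = math.log(p), math.log(1.0 - p)
    pmf = [
        math.exp(
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        for i in range(n + 1)
    ]
    return min(1.0, sum(pmf[k:]))


def owa(values, weights):
    """Ordered weighted average: weights apply to the values sorted in decreasing order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def orness(weights):
    """Orness of an OWA weight vector: 1 for the maximum, 0 for the minimum, 0.5 for the mean."""
    n = len(weights)
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)


def aggregated_p(cases, controls, weights):
    """Per-pair p-values (assumed null: case count ~ Binomial(case + control, 0.5)), then OWA."""
    pvals = [binom_upper_p(c, c + d) for c, d in zip(cases, controls)]
    return owa(pvals, weights)


def randomization_p(cases, controls, weights, n_random=200, seed=0):
    """Fraction of random re-pairings whose aggregated p-value is at most the observed one.

    Assumption: 'complete randomization' is read as shuffling all counts of one
    gene and splitting them into new case/control pairs.
    """
    rng = random.Random(seed)
    observed = aggregated_p(cases, controls, weights)
    pooled = list(cases) + list(controls)
    n = len(cases)
    hits = 0
    for _ in range(n_random):
        rng.shuffle(pooled)
        if aggregated_p(pooled[:n], pooled[n:], weights) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_random + 1)
```

In this sketch, calling `randomization_p` with the counts of one gene across all pairs returns a value that would play the role of that gene's raw p-value. A weight vector with orness close to 1 lets the largest per-pair p-values dominate the aggregate, while one with orness close to 0 emphasizes the smallest.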
format Online
Article
Text
id pubmed-6444640
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6444640 2019-04-11 OMICfpp: a fuzzy approach for paired RNA-Seq counts Berral-Gonzalez, Alberto Riffo-Campos, Angela L. Ayala, Guillermo BMC Genomics Methodology Article BioMed Central 2019-04-02 /pmc/articles/PMC6444640/ /pubmed/30940089 http://dx.doi.org/10.1186/s12864-019-5496-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title OMICfpp: a fuzzy approach for paired RNA-Seq counts
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444640/
https://www.ncbi.nlm.nih.gov/pubmed/30940089
http://dx.doi.org/10.1186/s12864-019-5496-5