OMICfpp: a fuzzy approach for paired RNA-Seq counts
BACKGROUND: RNA sequencing is a widely used technology for differential expression analysis. However, RNA-Seq does not provide accurate absolute measurements, and the results can differ with each pipeline used. The major problem in the statistical analysis of RNA-Seq data, and of omics data in gener...
Main authors: | Berral-Gonzalez, Alberto; Riffo-Campos, Angela L.; Ayala, Guillermo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444640/ https://www.ncbi.nlm.nih.gov/pubmed/30940089 http://dx.doi.org/10.1186/s12864-019-5496-5 |
_version_ | 1783408062607917056 |
---|---|
author | Berral-Gonzalez, Alberto Riffo-Campos, Angela L. Ayala, Guillermo |
author_facet | Berral-Gonzalez, Alberto Riffo-Campos, Angela L. Ayala, Guillermo |
author_sort | Berral-Gonzalez, Alberto |
collection | PubMed |
description | BACKGROUND: RNA sequencing is a widely used technology for differential expression analysis. However, RNA-Seq does not provide accurate absolute measurements, and the results can differ with each pipeline used. The major problem in the statistical analysis of RNA-Seq data, and of omics data in general, is the small sample size relative to the large number of variables. In addition, the experimental design must be taken into account, and few tools consider it. RESULTS: We propose OMICfpp, a method for the statistical analysis of RNA-Seq data with a paired design. First, we obtain a p-value for each case-control pair using a binomial test. These p-values are aggregated using an ordered weighted average (OWA) with a previously chosen orness. The aggregated p-value from the original data is compared with the aggregated p-values obtained by applying the same method to random pairs. These new pairs are generated using between-pairs and complete randomization distributions. This randomization p-value is used as a raw p-value to test the differential expression of each gene. The OMICfpp method is evaluated using public data sets of 68 sample pairs from patients with colorectal cancer. We validate our results through a bibliographic search of the reported genes and using a simulated data set. Furthermore, we compared our results with those obtained by the methods edgeR and DESeq2 for paired samples. Finally, we propose new target genes to be validated as gene expression signatures in colorectal cancer. OMICfpp is available at http://www.uv.es/ayala/software/OMICfpp_0.2.tar.gz. CONCLUSIONS: Our study shows that OMICfpp is an accurate method for differential expression analysis of RNA-Seq data with a paired design. In addition, we propose the use of the randomized p-value pattern graphic as a powerful and robust method to select target genes for experimental validation.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5496-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6444640 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6444640 2019-04-11 OMICfpp: a fuzzy approach for paired RNA-Seq counts Berral-Gonzalez, Alberto Riffo-Campos, Angela L. Ayala, Guillermo BMC Genomics Methodology Article BACKGROUND: RNA sequencing is a widely used technology for differential expression analysis. However, RNA-Seq does not provide accurate absolute measurements, and the results can differ with each pipeline used. The major problem in the statistical analysis of RNA-Seq data, and of omics data in general, is the small sample size relative to the large number of variables. In addition, the experimental design must be taken into account, and few tools consider it. RESULTS: We propose OMICfpp, a method for the statistical analysis of RNA-Seq data with a paired design. First, we obtain a p-value for each case-control pair using a binomial test. These p-values are aggregated using an ordered weighted average (OWA) with a previously chosen orness. The aggregated p-value from the original data is compared with the aggregated p-values obtained by applying the same method to random pairs. These new pairs are generated using between-pairs and complete randomization distributions. This randomization p-value is used as a raw p-value to test the differential expression of each gene. The OMICfpp method is evaluated using public data sets of 68 sample pairs from patients with colorectal cancer. We validate our results through a bibliographic search of the reported genes and using a simulated data set. Furthermore, we compared our results with those obtained by the methods edgeR and DESeq2 for paired samples. Finally, we propose new target genes to be validated as gene expression signatures in colorectal cancer. OMICfpp is available at http://www.uv.es/ayala/software/OMICfpp_0.2.tar.gz. CONCLUSIONS: Our study shows that OMICfpp is an accurate method for differential expression analysis of RNA-Seq data with a paired design.
In addition, we propose the use of the randomized p-value pattern graphic as a powerful and robust method to select target genes for experimental validation. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-019-5496-5) contains supplementary material, which is available to authorized users. BioMed Central 2019-04-02 /pmc/articles/PMC6444640/ /pubmed/30940089 http://dx.doi.org/10.1186/s12864-019-5496-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Berral-Gonzalez, Alberto Riffo-Campos, Angela L. Ayala, Guillermo OMICfpp: a fuzzy approach for paired RNA-Seq counts |
title | OMICfpp: a fuzzy approach for paired RNA-Seq counts |
title_full | OMICfpp: a fuzzy approach for paired RNA-Seq counts |
title_fullStr | OMICfpp: a fuzzy approach for paired RNA-Seq counts |
title_full_unstemmed | OMICfpp: a fuzzy approach for paired RNA-Seq counts |
title_short | OMICfpp: a fuzzy approach for paired RNA-Seq counts |
title_sort | omicfpp: a fuzzy approach for paired rna-seq counts |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6444640/ https://www.ncbi.nlm.nih.gov/pubmed/30940089 http://dx.doi.org/10.1186/s12864-019-5496-5 |
work_keys_str_mv | AT berralgonzalezalberto omicfppafuzzyapproachforpairedrnaseqcounts AT riffocamposangelal omicfppafuzzyapproachforpairedrnaseqcounts AT ayalaguillermo omicfppafuzzyapproachforpairedrnaseqcounts |
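
The description field above outlines three steps for a single gene: a binomial test per case-control pair, aggregation of the pair p-values with an ordered weighted average (OWA) of a chosen orness, and a randomization p-value obtained by repeating the procedure on random pairs. The Python sketch below illustrates that flow under several assumptions that are not taken from the record: it is not the OMICfpp package code, all function names are hypothetical, the binomial test uses success probability 0.5 (which presumes comparable library sizes), the OWA weights come from a simple mixture of the mean with the max or min, and only a complete randomization (shuffling all samples into new pairs) is shown.

```python
import random
from math import comb


def binomial_pvalue(case, control):
    """Two-sided exact binomial test with p = 0.5 (assumption: equal library sizes)."""
    n = case + control
    if n == 0:
        return 1.0
    k = min(case, control)
    # Lower tail P(X <= k) for X ~ Binomial(n, 0.5), using exact integer sums.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def owa_weights(n, orness):
    """Hypothetical weight family: uniform weights mixed with 'max' or 'min'.

    Orness is linear in the weights, so mixing the mean (orness 0.5) with
    the max (orness 1) or the min (orness 0) reaches any target exactly.
    """
    lam = abs(2.0 * orness - 1.0)
    weights = [(1.0 - lam) / n] * n
    if orness >= 0.5:
        weights[0] += lam   # extra weight on the largest value
    else:
        weights[-1] += lam  # extra weight on the smallest value
    return weights


def orness_of(weights):
    """Orness of an OWA weight vector: sum((n - i) * w_i) / (n - 1), n >= 2."""
    n = len(weights)
    return sum((n - i) * w for i, w in enumerate(weights, start=1)) / (n - 1)


def owa(values, orness):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(owa_weights(len(ordered), orness), ordered))


def randomization_pvalue(cases, controls, orness, n_random=200, seed=0):
    """Compare the observed aggregated p-value with those from random pairs.

    Assumed 'complete randomization': all samples are shuffled and re-paired.
    """
    rng = random.Random(seed)
    observed = owa([binomial_pvalue(a, b) for a, b in zip(cases, controls)], orness)
    pooled = list(cases) + list(controls)
    n = len(cases)
    hits = 0
    for _ in range(n_random):
        rng.shuffle(pooled)
        pairs = zip(pooled[:n], pooled[n:])
        aggregated = owa([binomial_pvalue(a, b) for a, b in pairs], orness)
        if aggregated <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (1 + hits) / (1 + n_random)


if __name__ == "__main__":
    # Made-up counts for one gene across six case-control pairs.
    cases = [120, 95, 130, 110, 105, 98]
    controls = [20, 15, 25, 18, 22, 17]
    print(randomization_pvalue(cases, controls, orness=0.5))
```

An orness of 1 reduces the aggregation to the maximum pair p-value, 0 to the minimum, and 0.5 to the plain mean, so the parameter controls how many pairs must agree before a gene is flagged. In practice the resulting raw p-values would still need a multiple-testing adjustment across genes.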