Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients
BACKGROUND: Data from RNA-seq experiments provide a wealth of information about the transcriptome of an organism. However, the analysis of such data is very demanding. In this study, we aimed to establish robust analysis procedures that can be used in clinical practice. METHODS: We studied RNA-seq d...
Main authors: | Stupnikov, Alexey; Glazko, Galina V; Emmert-Streib, Frank |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4593382/ https://www.ncbi.nlm.nih.gov/pubmed/26253000 http://dx.doi.org/10.1186/s40880-015-0040-8 |
_version_ | 1782393316890902528 |
---|---|
author | Stupnikov, Alexey Glazko, Galina V Emmert-Streib, Frank |
author_facet | Stupnikov, Alexey Glazko, Galina V Emmert-Streib, Frank |
author_sort | Stupnikov, Alexey |
collection | PubMed |
description | BACKGROUND: Data from RNA-seq experiments provide a wealth of information about the transcriptome of an organism. However, the analysis of such data is very demanding. In this study, we aimed to establish robust analysis procedures that can be used in clinical practice. METHODS: We studied RNA-seq data from triple-negative breast cancer patients. Specifically, we investigated the subsampling of RNA-seq data. RESULTS: The main results of our investigations are as follows: (1) the subsampling of RNA-seq data gave biologically realistic simulations of sequencing experiments with smaller sequencing depth but not direct scaling of count matrices; (2) the saturation of results required an average sequencing depth larger than 32 million reads and an individual sequencing depth larger than 46 million reads; and (3) for an abrogated feature selection, higher moments of the distribution of all expressed genes had a higher sensitivity for signal detection than the corresponding mean values. CONCLUSIONS: Our results reveal important characteristics of RNA-seq data that must be understood before one can apply such an approach to translational medicine. |
format | Online Article Text |
id | pubmed-4593382 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4593382 2015-10-06 Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients Stupnikov, Alexey Glazko, Galina V Emmert-Streib, Frank Chin J Cancer Original Article BACKGROUND: Data from RNA-seq experiments provide a wealth of information about the transcriptome of an organism. However, the analysis of such data is very demanding. In this study, we aimed to establish robust analysis procedures that can be used in clinical practice. METHODS: We studied RNA-seq data from triple-negative breast cancer patients. Specifically, we investigated the subsampling of RNA-seq data. RESULTS: The main results of our investigations are as follows: (1) the subsampling of RNA-seq data gave biologically realistic simulations of sequencing experiments with smaller sequencing depth but not direct scaling of count matrices; (2) the saturation of results required an average sequencing depth larger than 32 million reads and an individual sequencing depth larger than 46 million reads; and (3) for an abrogated feature selection, higher moments of the distribution of all expressed genes had a higher sensitivity for signal detection than the corresponding mean values. CONCLUSIONS: Our results reveal important characteristics of RNA-seq data that must be understood before one can apply such an approach to translational medicine. BioMed Central 2015-08-08 /pmc/articles/PMC4593382/ /pubmed/26253000 http://dx.doi.org/10.1186/s40880-015-0040-8 Text en © Stupnikov et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Original Article Stupnikov, Alexey Glazko, Galina V Emmert-Streib, Frank Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients |
title | Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients |
title_full | Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients |
title_fullStr | Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients |
title_full_unstemmed | Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients |
title_short | Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients |
title_sort | effects of subsampling on characteristics of rna-seq data from triple-negative breast cancer patients |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4593382/ https://www.ncbi.nlm.nih.gov/pubmed/26253000 http://dx.doi.org/10.1186/s40880-015-0040-8 |
work_keys_str_mv | AT stupnikovalexey effectsofsubsamplingoncharacteristicsofrnaseqdatafromtriplenegativebreastcancerpatients AT glazkogalinav effectsofsubsamplingoncharacteristicsofrnaseqdatafromtriplenegativebreastcancerpatients AT emmertstreibfrank effectsofsubsamplingoncharacteristicsofrnaseqdatafromtriplenegativebreastcancerpatients |
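
The abstract in this record contrasts subsampling of RNA-seq data with direct scaling of count matrices. The following is a minimal sketch of that distinction on toy data, not the authors' pipeline: the gene names, counts, sampling fraction, and function names are all hypothetical, and reads are drawn from a per-gene count table rather than from raw sequencing files. Subsampling draws reads at random without replacement and recounts them, so counts stay integers and weakly expressed genes can drop to zero; scaling multiplies every count by the same factor and keeps every gene at a fractional, noise-free value.

```python
import random
from collections import Counter


def subsample_counts(counts, fraction, seed=0):
    """Draw a fraction of reads without replacement and recount per gene."""
    rng = random.Random(seed)
    # One list entry per read, labelled by the gene it maps to.
    reads = [gene for gene, n in counts.items() for _ in range(n)]
    k = round(len(reads) * fraction)
    drawn = Counter(rng.sample(reads, k))
    return {gene: drawn.get(gene, 0) for gene in counts}


def scale_counts(counts, fraction):
    """Direct scaling: multiply every count by the same factor."""
    return {gene: n * fraction for gene, n in counts.items()}


# Hypothetical toy library: four genes with very different expression levels.
toy = {"GENE_A": 5000, "GENE_B": 400, "GENE_C": 12, "GENE_D": 1}
fraction = 0.1  # simulate a library with one tenth of the reads

sub = subsample_counts(toy, fraction)
scaled = scale_counts(toy, fraction)

for gene in toy:
    print(gene, toy[gene], sub[gene], scaled[gene])
```

In this sketch the scaled value of the single-read gene is always 0.1, whereas the subsampled value is either 0 or 1 depending on the random draw, which is the kind of sampling noise a shallower sequencing run would show.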