Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients

BACKGROUND: Data from RNA-seq experiments provide a wealth of information about the transcriptome of an organism. However, the analysis of such data is very demanding. In this study, we aimed to establish robust analysis procedures that can be used in clinical practice. METHODS: We studied RNA-seq d...

Bibliographic Details
Main authors: Stupnikov, Alexey, Glazko, Galina V, Emmert-Streib, Frank
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4593382/
https://www.ncbi.nlm.nih.gov/pubmed/26253000
http://dx.doi.org/10.1186/s40880-015-0040-8
_version_ 1782393316890902528
author Stupnikov, Alexey
Glazko, Galina V
Emmert-Streib, Frank
author_facet Stupnikov, Alexey
Glazko, Galina V
Emmert-Streib, Frank
author_sort Stupnikov, Alexey
collection PubMed
description BACKGROUND: Data from RNA-seq experiments provide a wealth of information about the transcriptome of an organism. However, the analysis of such data is very demanding. In this study, we aimed to establish robust analysis procedures that can be used in clinical practice. METHODS: We studied RNA-seq data from triple-negative breast cancer patients. Specifically, we investigated the subsampling of RNA-seq data. RESULTS: The main results of our investigations are as follows: (1) the subsampling of RNA-seq data gave biologically realistic simulations of sequencing experiments with smaller sequencing depth but not direct scaling of count matrices; (2) the saturation of results required an average sequencing depth larger than 32 million reads and an individual sequencing depth larger than 46 million reads; and (3) for an abrogated feature selection, higher moments of the distribution of all expressed genes had a higher sensitivity for signal detection than the corresponding mean values. CONCLUSIONS: Our results reveal important characteristics of RNA-seq data that must be understood before one can apply such an approach to translational medicine.
format Online
Article
Text
id pubmed-4593382
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4593382 2015-10-06 Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients Stupnikov, Alexey Glazko, Galina V Emmert-Streib, Frank Chin J Cancer Original Article BACKGROUND: Data from RNA-seq experiments provide a wealth of information about the transcriptome of an organism. However, the analysis of such data is very demanding. In this study, we aimed to establish robust analysis procedures that can be used in clinical practice. METHODS: We studied RNA-seq data from triple-negative breast cancer patients. Specifically, we investigated the subsampling of RNA-seq data. RESULTS: The main results of our investigations are as follows: (1) the subsampling of RNA-seq data gave biologically realistic simulations of sequencing experiments with smaller sequencing depth but not direct scaling of count matrices; (2) the saturation of results required an average sequencing depth larger than 32 million reads and an individual sequencing depth larger than 46 million reads; and (3) for an abrogated feature selection, higher moments of the distribution of all expressed genes had a higher sensitivity for signal detection than the corresponding mean values. CONCLUSIONS: Our results reveal important characteristics of RNA-seq data that must be understood before one can apply such an approach to translational medicine. BioMed Central 2015-08-08 /pmc/articles/PMC4593382/ /pubmed/26253000 http://dx.doi.org/10.1186/s40880-015-0040-8 Text en © Stupnikov et al. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Original Article
Stupnikov, Alexey
Glazko, Galina V
Emmert-Streib, Frank
Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients
title Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients
title_full Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients
title_fullStr Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients
title_full_unstemmed Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients
title_short Effects of subsampling on characteristics of RNA-seq data from triple-negative breast cancer patients
title_sort effects of subsampling on characteristics of rna-seq data from triple-negative breast cancer patients
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4593382/
https://www.ncbi.nlm.nih.gov/pubmed/26253000
http://dx.doi.org/10.1186/s40880-015-0040-8
work_keys_str_mv AT stupnikovalexey effectsofsubsamplingoncharacteristicsofrnaseqdatafromtriplenegativebreastcancerpatients
AT glazkogalinav effectsofsubsamplingoncharacteristicsofrnaseqdatafromtriplenegativebreastcancerpatients
AT emmertstreibfrank effectsofsubsamplingoncharacteristicsofrnaseqdatafromtriplenegativebreastcancerpatients