SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints
BACKGROUND: Combined whole-genome sequencing (WGS) and RNA sequencing of cancers offer the opportunity to identify genes with altered expression due to genomic rearrangements. Somatic structural variants (SVs), as identified by WGS, can involve altered gene cis-regulation, gene fusions, copy number...
Main Authors: | Zhang, Yiqun; Chen, Fengju; Creighton, Chad J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2021 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7981925/ https://www.ncbi.nlm.nih.gov/pubmed/33743584 http://dx.doi.org/10.1186/s12859-021-04072-0 |
_version_ | 1783667613044310016 |
---|---|
author | Zhang, Yiqun Chen, Fengju Creighton, Chad J. |
author_facet | Zhang, Yiqun Chen, Fengju Creighton, Chad J. |
author_sort | Zhang, Yiqun |
collection | PubMed |
description | BACKGROUND: Combined whole-genome sequencing (WGS) and RNA sequencing of cancers offer the opportunity to identify genes with altered expression due to genomic rearrangements. Somatic structural variants (SVs), as identified by WGS, can involve altered gene cis-regulation, gene fusions, copy number alterations, or gene disruption. The absence of computational tools to streamline integrative analysis steps may represent a barrier in identifying genes recurrently altered by genomic rearrangement. RESULTS: Here, we introduce SVExpress, a set of tools for carrying out integrative analysis of SV and gene expression data. SVExpress enables systematic cataloging of genes that consistently show increased or decreased expression in conjunction with the presence of nearby SV breakpoints. SVExpress can evaluate breakpoints in proximity to genes for potential enhancer translocation events or disruption of topologically associated domains, two mechanisms by which SVs may deregulate genes. The output from any commonly used SV calling algorithm may be easily adapted for use with SVExpress. SVExpress can readily analyze genomic datasets involving hundreds of cancer sample profiles. Here, we used SVExpress to analyze SV and expression data across 327 cancer cell lines with combined SV and expression data in the Cancer Cell Line Encyclopedia (CCLE). In the CCLE dataset, hundreds of genes showed altered gene expression in relation to nearby SV breakpoints. Altered genes involved TAD disruption, enhancer hijacking, and gene fusions. When comparing the top set of SV-altered genes from cancer cell lines with the top SV-altered genes previously reported for human tumors from The Cancer Genome Atlas and the Pan-Cancer Analysis of Whole Genomes datasets, a significant number of genes overlapped in the same direction for both cell lines and tumors, while some genes were significant for cell lines but not for human tumors and vice versa. CONCLUSION: Our SVExpress tools allow computational biologists with a working knowledge of R to integrate gene expression with SV breakpoint data to identify recurrently altered genes. SVExpress is freely available for academic or commercial use at https://github.com/chadcreighton/SVExpress. SVExpress is implemented as a set of Excel macros and R code. All source code (R and Visual Basic for Applications) is available. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04072-0. |
format | Online Article Text |
id | pubmed-7981925 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7981925 2021-03-22 SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints Zhang, Yiqun Chen, Fengju Creighton, Chad J. BMC Bioinformatics Software BACKGROUND: Combined whole-genome sequencing (WGS) and RNA sequencing of cancers offer the opportunity to identify genes with altered expression due to genomic rearrangements. Somatic structural variants (SVs), as identified by WGS, can involve altered gene cis-regulation, gene fusions, copy number alterations, or gene disruption. The absence of computational tools to streamline integrative analysis steps may represent a barrier in identifying genes recurrently altered by genomic rearrangement. RESULTS: Here, we introduce SVExpress, a set of tools for carrying out integrative analysis of SV and gene expression data. SVExpress enables systematic cataloging of genes that consistently show increased or decreased expression in conjunction with the presence of nearby SV breakpoints. SVExpress can evaluate breakpoints in proximity to genes for potential enhancer translocation events or disruption of topologically associated domains, two mechanisms by which SVs may deregulate genes. The output from any commonly used SV calling algorithm may be easily adapted for use with SVExpress. SVExpress can readily analyze genomic datasets involving hundreds of cancer sample profiles. Here, we used SVExpress to analyze SV and expression data across 327 cancer cell lines with combined SV and expression data in the Cancer Cell Line Encyclopedia (CCLE). In the CCLE dataset, hundreds of genes showed altered gene expression in relation to nearby SV breakpoints. Altered genes involved TAD disruption, enhancer hijacking, and gene fusions. When comparing the top set of SV-altered genes from cancer cell lines with the top SV-altered genes previously reported for human tumors from The Cancer Genome Atlas and the Pan-Cancer Analysis of Whole Genomes datasets, a significant number of genes overlapped in the same direction for both cell lines and tumors, while some genes were significant for cell lines but not for human tumors and vice versa. CONCLUSION: Our SVExpress tools allow computational biologists with a working knowledge of R to integrate gene expression with SV breakpoint data to identify recurrently altered genes. SVExpress is freely available for academic or commercial use at https://github.com/chadcreighton/SVExpress. SVExpress is implemented as a set of Excel macros and R code. All source code (R and Visual Basic for Applications) is available. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04072-0. BioMed Central 2021-03-21 /pmc/articles/PMC7981925/ /pubmed/33743584 http://dx.doi.org/10.1186/s12859-021-04072-0 Text en © The Author(s) 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software Zhang, Yiqun Chen, Fengju Creighton, Chad J. SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints |
title | SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints |
title_full | SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints |
title_fullStr | SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints |
title_full_unstemmed | SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints |
title_short | SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints |
title_sort | svexpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7981925/ https://www.ncbi.nlm.nih.gov/pubmed/33743584 http://dx.doi.org/10.1186/s12859-021-04072-0 |
work_keys_str_mv | AT zhangyiqun svexpressidentifyinggenefeaturesalteredrecurrentlyinexpressionwithnearbystructuralvariantbreakpoints AT chenfengju svexpressidentifyinggenefeaturesalteredrecurrentlyinexpressionwithnearbystructuralvariantbreakpoints AT creightonchadj svexpressidentifyinggenefeaturesalteredrecurrentlyinexpressionwithnearbystructuralvariantbreakpoints |
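
The description above outlines the general idea of relating gene expression to the presence of nearby SV breakpoints. As a rough illustration of that idea only, and not SVExpress's own code (which the record says is written in R and Visual Basic for Applications), the Python sketch below flags samples with a breakpoint near a gene and compares mean expression between flagged and unflagged samples. The function names, the 100 kb window, the simple difference-of-means statistic, and the toy data are all assumptions made for this example.

```python
from statistics import mean

def samples_with_nearby_breakpoint(gene, breakpoints, window=100_000):
    """Return sample IDs with at least one breakpoint within `window` bp of the gene.

    gene: (chromosome, start, end); breakpoints: iterable of (sample, chromosome, position).
    The window size is an illustrative assumption, not a value from the record.
    """
    chrom, start, end = gene
    lo, hi = start - window, end + window
    return {s for s, c, pos in breakpoints if c == chrom and lo <= pos <= hi}

def expression_shift(expr_by_sample, hit_samples):
    """Mean expression in samples with a nearby breakpoint minus the mean in the rest.

    Returns None when either group is empty, since no comparison is possible.
    """
    with_bp = [v for s, v in expr_by_sample.items() if s in hit_samples]
    without = [v for s, v in expr_by_sample.items() if s not in hit_samples]
    if not with_bp or not without:
        return None
    return mean(with_bp) - mean(without)

# Toy data (hypothetical): one gene, four samples.
gene = ("chr8", 1_000_000, 1_005_000)
breakpoints = [
    ("s1", "chr8", 950_000),    # within 100 kb upstream
    ("s2", "chr8", 2_000_000),  # same chromosome, too far away
    ("s3", "chr1", 1_001_000),  # different chromosome
    ("s4", "chr8", 1_100_000),  # within 100 kb downstream
]
expr = {"s1": 8.0, "s2": 2.0, "s3": 3.0, "s4": 6.0}

hits = samples_with_nearby_breakpoint(gene, breakpoints)
shift = expression_shift(expr, hits)
print(sorted(hits), shift)
```

A positive shift suggests higher expression in samples carrying a nearby breakpoint. A real analysis would also need a significance test and adjustment for covariates such as copy number, which this sketch leaves out.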