
SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints

BACKGROUND: Combined whole-genome sequencing (WGS) and RNA sequencing of cancers offer the opportunity to identify genes with altered expression due to genomic rearrangements. Somatic structural variants (SVs), as identified by WGS, can involve altered gene cis-regulation, gene fusions, copy number...


Bibliographic Details
Main Authors: Zhang, Yiqun; Chen, Fengju; Creighton, Chad J.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7981925/
https://www.ncbi.nlm.nih.gov/pubmed/33743584
http://dx.doi.org/10.1186/s12859-021-04072-0
_version_ 1783667613044310016
author Zhang, Yiqun
Chen, Fengju
Creighton, Chad J.
collection PubMed
description BACKGROUND: Combined whole-genome sequencing (WGS) and RNA sequencing of cancers offer the opportunity to identify genes with altered expression due to genomic rearrangements. Somatic structural variants (SVs), as identified by WGS, can involve altered gene cis-regulation, gene fusions, copy number alterations, or gene disruption. The absence of computational tools to streamline integrative analysis steps may represent a barrier in identifying genes recurrently altered by genomic rearrangement. RESULTS: Here, we introduce SVExpress, a set of tools for carrying out integrative analysis of SV and gene expression data. SVExpress enables systematic cataloging of genes that consistently show increased or decreased expression in conjunction with the presence of nearby SV breakpoints. SVExpress can evaluate breakpoints in proximity to genes for potential enhancer translocation events or disruption of topologically associated domains, two mechanisms by which SVs may deregulate genes. The output from any commonly used SV calling algorithm may be easily adapted for use with SVExpress. SVExpress can readily analyze genomic datasets involving hundreds of cancer sample profiles. Here, we used SVExpress to analyze SV and expression data across 327 cancer cell lines with combined SV and expression data in the Cancer Cell Line Encyclopedia (CCLE). In the CCLE dataset, hundreds of genes showed altered gene expression in relation to nearby SV breakpoints. Altered genes involved TAD disruption, enhancer hijacking, and gene fusions. When comparing the top set of SV-altered genes from cancer cell lines with the top SV-altered genes previously reported for human tumors from The Cancer Genome Atlas and the Pan-Cancer Analysis of Whole Genomes datasets, a significant number of genes overlapped in the same direction for both cell lines and tumors, while some genes were significant for cell lines but not for human tumors and vice versa. 
CONCLUSION: Our SVExpress tools allow computational biologists with a working knowledge of R to integrate gene expression with SV breakpoint data to identify recurrently altered genes. SVExpress is freely available for academic or commercial use at https://github.com/chadcreighton/SVExpress. SVExpress is implemented as a set of Excel macros and R code. All source code (R and Visual Basic for Applications) is available. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04072-0.
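The core idea in the description, relating a gene's expression to the presence of nearby SV breakpoints across samples, can be illustrated with a minimal Python sketch. This is not SVExpress code (which is distributed as Excel macros and R): the function names, the 100 kb window, the toy data, and the use of a simple difference in means are all assumptions made for illustration, standing in for whatever statistical model the tool actually applies.

```python
from statistics import mean


def samples_with_nearby_breakpoint(gene, breakpoints, window=100_000):
    """Return the samples with at least one breakpoint within `window` bp of the gene.

    `gene` is (chrom, start, end); `breakpoints` is a list of (sample, chrom, pos).
    The window size is an illustrative assumption, not a value from the source.
    """
    chrom, start, end = gene
    return {
        sample
        for sample, bp_chrom, pos in breakpoints
        if bp_chrom == chrom and start - window <= pos <= end + window
    }


def expression_shift(expression, flagged):
    """Mean expression in flagged samples minus mean in the rest.

    Returns None when either group is empty, since no comparison is possible.
    """
    with_bp = [value for sample, value in expression.items() if sample in flagged]
    without_bp = [value for sample, value in expression.items() if sample not in flagged]
    if not with_bp or not without_bp:
        return None
    return mean(with_bp) - mean(without_bp)


# Hypothetical toy data: one gene, four samples.
gene_a = ("chr1", 1_000_000, 1_050_000)
breakpoints = [
    ("S1", "chr1", 980_000),    # just upstream of the gene, inside the window
    ("S2", "chr1", 5_000_000),  # same chromosome, far away
    ("S3", "chr2", 1_010_000),  # matching position, different chromosome
    ("S4", "chr1", 1_120_000),  # downstream of the gene, inside the window
]
expression_gene_a = {"S1": 9.0, "S2": 5.0, "S3": 5.5, "S4": 8.0}

flagged = samples_with_nearby_breakpoint(gene_a, breakpoints)
shift = expression_shift(expression_gene_a, flagged)
print(sorted(flagged), shift)
```

A positive shift suggests higher expression in samples carrying a nearby breakpoint. A real analysis would repeat this per gene across the cohort and would need a significance test and correction for confounders; none of that is attempted in this sketch.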
format Online
Article
Text
id pubmed-7981925
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7981925 2021-03-22 SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints. Zhang, Yiqun; Chen, Fengju; Creighton, Chad J. BMC Bioinformatics. Software. BioMed Central 2021-03-21 /pmc/articles/PMC7981925/ /pubmed/33743584 http://dx.doi.org/10.1186/s12859-021-04072-0 Text en
© The Author(s) 2021. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7981925/
https://www.ncbi.nlm.nih.gov/pubmed/33743584
http://dx.doi.org/10.1186/s12859-021-04072-0