NIMBus: a negative binomial regression based Integrative Method for mutation Burden Analysis


Bibliographic Details
Main Authors: Zhang, Jing; Liu, Jason; McGillivray, Patrick; Yi, Caroline; Lochovsky, Lucas; Lee, Donghoon; Gerstein, Mark
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7580035/
https://www.ncbi.nlm.nih.gov/pubmed/33092526
http://dx.doi.org/10.1186/s12859-020-03758-1
collection PubMed
description BACKGROUND: Identifying frequently mutated regions is a key approach for discovering DNA elements that influence cancer progression. However, identifying these burdened regions is challenging because mutation rates are heterogeneous across the genome and across individuals. Moreover, this heterogeneity is known to stem partially from genomic confounding factors, such as replication timing and chromatin organization. The increasing availability of cancer whole-genome sequences and functional genomics data from the Encyclopedia of DNA Elements (ENCODE) may help address these issues. RESULTS: We developed a negative binomial regression-based Integrative Method for mutation Burden analysiS (NIMBus). Our approach addresses the over-dispersion of mutation count statistics by (1) using a Gamma–Poisson mixture model to capture the mutation-rate heterogeneity across different individuals and (2) estimating regional background mutation rates by regressing the varying local mutation counts against genomic features extracted from ENCODE. We applied NIMBus to whole-genome cancer sequences from the PanCancer Analysis of Whole Genomes project (PCAWG) and other cohorts. It successfully identified well-known coding and non-coding drivers, such as TP53 and the TERT promoter. To further characterize the burdening of non-coding regions, we used NIMBus to screen transcription factor binding sites in promoter regions that intersect DNase I hypersensitive sites (DHSs). This analysis identified mutational hotspots that potentially disrupt gene regulatory networks in cancer. We also compared this method to other mutation burden analysis methods. CONCLUSION: NIMBus is a powerful tool for identifying mutational hotspots. The NIMBus software and results are available as an online resource at github.gersteinlab.org/nimbus.
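The two modelling steps named in the abstract can be illustrated with a minimal sketch. This is not the NIMBus implementation: it only shows, under stated assumptions, why a Gamma-Poisson mixture (equivalently, a negative binomial) gives over-dispersed counts and how that changes a one-sided burden p-value relative to a plain Poisson test. The function names, the log-linear link, the feature values, the coefficients, and the dispersion value are all hypothetical placeholders chosen for illustration.

```python
import math
import random


def expected_count(features, coefs, intercept):
    """Log-linear background model: expected mutations = exp(b0 + b . x).

    `features`, `coefs` and `intercept` are hypothetical stand-ins for
    regional covariates and fitted regression coefficients.
    """
    return math.exp(intercept + sum(b * x for b, x in zip(coefs, features)))


def nb_log_pmf(k, mean, dispersion):
    """Log P(X = k) for a negative binomial with the given mean and
    variance mean + dispersion * mean**2 (the Gamma-Poisson mixture)."""
    r = 1.0 / dispersion      # shape of the mixing Gamma distribution
    p = r / (r + mean)        # chosen so that r * (1 - p) / p == mean
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))


def nb_upper_tail(observed, mean, dispersion):
    """One-sided burden p-value P(X >= observed) under the negative binomial."""
    below = sum(math.exp(nb_log_pmf(k, mean, dispersion)) for k in range(observed))
    return max(0.0, 1.0 - below)


def poisson_upper_tail(observed, mean):
    """P(X >= observed) under a Poisson with the same mean, for comparison."""
    below = sum(math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
                for k in range(observed))
    return max(0.0, 1.0 - below)


def simulate_counts(mean, dispersion, n, seed=0):
    """Draw n Gamma-Poisson counts: every draw gets its own Gamma-distributed rate."""
    rng = random.Random(seed)
    shape, scale = 1.0 / dispersion, mean * dispersion   # Gamma mean = shape * scale
    counts = []
    for _ in range(n):
        rate = rng.gammavariate(shape, scale)
        # Knuth's multiplication method for one Poisson draw (adequate for small rates).
        threshold, k, prod = math.exp(-rate), 0, rng.random()
        while prod > threshold:
            k += 1
            prod *= rng.random()
        counts.append(k)
    return counts


if __name__ == "__main__":
    # Hypothetical region: two covariates and made-up coefficients.
    mu = expected_count(features=[0.3, -1.2], coefs=[0.5, -0.4], intercept=0.8)
    observed = 12
    print("expected background count:", mu)
    print("Poisson p-value:", poisson_upper_tail(observed, mu))
    print("negative binomial p-value:", nb_upper_tail(observed, mu, dispersion=0.5))
```

With the same expected count, the negative binomial tail is heavier than the Poisson tail, so a region that looks extreme under a Poisson background can be unremarkable once between-sample rate heterogeneity is modelled. In a real analysis the coefficients and the dispersion would be estimated from data rather than fixed by hand.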
format Online
Article
Text
id pubmed-7580035
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7580035 2020-10-22 NIMBus: a negative binomial regression based Integrative Method for mutation Burden Analysis. Zhang, Jing; Liu, Jason; McGillivray, Patrick; Yi, Caroline; Lochovsky, Lucas; Lee, Donghoon; Gerstein, Mark. BMC Bioinformatics. Methodology Article. BioMed Central 2020-10-22 /pmc/articles/PMC7580035/ /pubmed/33092526 http://dx.doi.org/10.1186/s12859-020-03758-1 Text en
© The Author(s) 2020. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
topic Methodology Article