Methodology and software to detect viral integration site hot-spots
BACKGROUND: Modern gene therapy methods have limited control over where a therapeutic viral vector inserts into the host genome. Vector integration can activate local gene expression, which can cause cancer if the vector inserts near an oncogene. Viral integration hot-spots or 'common insertion...
Main Authors: | Presson, Angela P; Kim, Namshin; Xiaofei, Yan; Chen, Irvin SY; Kim, Sanggu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203353/ https://www.ncbi.nlm.nih.gov/pubmed/21914224 http://dx.doi.org/10.1186/1471-2105-12-367 |
_version_ | 1782215109970493440 |
---|---|
author | Presson, Angela P Kim, Namshin Xiaofei, Yan Chen, Irvin SY Kim, Sanggu |
author_facet | Presson, Angela P Kim, Namshin Xiaofei, Yan Chen, Irvin SY Kim, Sanggu |
author_sort | Presson, Angela P |
collection | PubMed |
description | BACKGROUND: Modern gene therapy methods have limited control over where a therapeutic viral vector inserts into the host genome. Vector integration can activate local gene expression, which can cause cancer if the vector inserts near an oncogene. Viral integration hot-spots or 'common insertion sites' (CIS) are scrutinized to evaluate and predict patient safety. CIS are typically defined by a minimum density of insertions (such as 2-4 within a 30-100 kb region), which unfortunately depends on the total number of observed VIS. This is problematic for comparing hot-spot distributions across data sets and patients, where the VIS numbers may vary. RESULTS: We develop two new methods for defining hot-spots that are relatively independent of data set size. Both methods operate on distributions of VIS across consecutive 1 Mb 'bins' of the genome. The first method 'z-threshold' tallies the number of VIS per bin, converts these counts to z-scores, and applies a threshold to define high density bins. The second method 'BCP' applies a Bayesian change-point model to the z-scores to define hot-spots. The novel hot-spot methods are compared with a conventional CIS method using simulated data sets and data sets from five published human studies, including the X-linked ALD (adrenoleukodystrophy), CGD (chronic granulomatous disease) and SCID-X1 (X-linked severe combined immunodeficiency) trials. The BCP analysis of the human X-linked ALD data for two patients separately (774 and 1627 VIS) and combined (2401 VIS) resulted in 5-6 hot-spots covering 0.17-0.251% of the genome and containing 5.56-7.74% of the total VIS. In comparison, the CIS analysis resulted in 12-110 hot-spots covering 0.018-0.246% of the genome and containing 5.81-22.7% of the VIS, corresponding to a greater number of hot-spots as the data set size increased. Our hot-spot methods enable one to evaluate the extent of VIS clustering, and formally compare data sets in terms of hot-spot overlap. Finally, we show that the BCP hot-spots from the repopulating samples coincide with greater gene and CpG island density than the median genome density. CONCLUSIONS: The z-threshold and BCP methods are useful for comparing hot-spot patterns across data sets of disparate sizes. The methodology and software provided here should enable one to study hot-spot conservation across a variety of VIS data sets and evaluate vector safety for gene therapy trials. |
format | Online Article Text |
id | pubmed-3203353 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3203353 2011-10-31 Methodology and software to detect viral integration site hot-spots Presson, Angela P Kim, Namshin Xiaofei, Yan Chen, Irvin SY Kim, Sanggu BMC Bioinformatics Research Article BioMed Central 2011-09-14 /pmc/articles/PMC3203353/ /pubmed/21914224 http://dx.doi.org/10.1186/1471-2105-12-367 Text en Copyright ©2011 Presson et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Methodology and software to detect viral integration site hot-spots |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3203353/ https://www.ncbi.nlm.nih.gov/pubmed/21914224 http://dx.doi.org/10.1186/1471-2105-12-367 |
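
The 'z-threshold' method summarized in the description field (tally VIS per 1 Mb bin, convert the counts to z-scores, apply a threshold) can be illustrated with a minimal Python sketch. This is not the authors' software. The function name `z_threshold_hotspots`, the default threshold of 3.0, the use of the population standard deviation, and the restriction to a single chromosome are all assumptions made for illustration; the record does not state these details.

```python
from collections import Counter
from statistics import mean, pstdev

BIN_SIZE = 1_000_000  # 1 Mb bins, as stated in the abstract


def z_threshold_hotspots(positions, chrom_length, threshold=3.0, bin_size=BIN_SIZE):
    """Return (bin_index, count, z_score) for every high-density bin.

    positions    -- integration site coordinates (0-based) on one chromosome
    chrom_length -- chromosome length in bases
    threshold    -- z-score cutoff (hypothetical default, not from the paper)
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    tallies = Counter(pos // bin_size for pos in positions)
    # Include empty bins so the mean and SD reflect the whole chromosome.
    per_bin = [tallies.get(i, 0) for i in range(n_bins)]

    mu = mean(per_bin)
    sigma = pstdev(per_bin)  # population SD: an assumption of this sketch
    if sigma == 0:
        return []  # all bins equal, so no bin stands out

    hits = []
    for index, count in enumerate(per_bin):
        z = (count - mu) / sigma
        if z > threshold:
            hits.append((index, count, z))
    return hits


# Toy data: one site in each of ten bins, plus 20 extra sites in the last bin.
positions = [i * BIN_SIZE + 500 for i in range(10)]
positions += [9 * BIN_SIZE + k for k in range(20)]
hits = z_threshold_hotspots(positions, 10 * BIN_SIZE, threshold=2.0)
```

In this toy input the last bin holds 21 of the 30 sites, so it is the only bin returned. Because each count is standardized against the data set's own mean and spread, the cutoff scales with the total number of VIS, whereas a fixed rule such as "2-4 insertions within 30-100 kb" does not.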