An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses
BACKGROUND: All retroviruses, including human immunodeficiency virus (HIV), must integrate a DNA copy of their genomes into the genome of the infected host cell to replicate. Although integrated retroviral DNA, known as a provirus, can be found at many sites in the host genome, integration is not ra...
Main authors: | Wells, Daria W.; Guo, Shuang; Shao, Wei; Bale, Michael J.; Coffin, John M.; Hughes, Stephen H.; Wu, Xiaolin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7063773/ https://www.ncbi.nlm.nih.gov/pubmed/32151239 http://dx.doi.org/10.1186/s12864-020-6647-4 |
_version_ | 1783504756097941504 |
---|---|
author | Wells, Daria W. Guo, Shuang Shao, Wei Bale, Michael J. Coffin, John M. Hughes, Stephen H. Wu, Xiaolin |
author_facet | Wells, Daria W. Guo, Shuang Shao, Wei Bale, Michael J. Coffin, John M. Hughes, Stephen H. Wu, Xiaolin |
author_sort | Wells, Daria W. |
collection | PubMed |
description | BACKGROUND: All retroviruses, including human immunodeficiency virus (HIV), must integrate a DNA copy of their genomes into the genome of the infected host cell to replicate. Although integrated retroviral DNA, known as a provirus, can be found at many sites in the host genome, integration is not random. The adaption of linker-mediated PCR (LM-PCR) protocols for high-throughput integration site mapping, using randomly-sheared genomic DNA and Illumina paired-end sequencing, has dramatically increased the number of mapped integration sites. Analysis of samples from human donors has shown that there is clonal expansion of HIV infected cells and that clonal expansion makes an important contribution to HIV persistence. However, analysis of HIV integration sites in samples taken from patients requires extensive PCR amplification and high-throughput sequencing, which makes the methodology prone to certain specific artifacts. RESULTS: To address the problems with artifacts, we use a comprehensive approach involving experimental procedures linked to a bioinformatics analysis pipeline. Using this combined approach, we are able to reduce the number of PCR/sequencing artifacts that arise and identify the ones that remain. Our streamlined workflow combines random cleavage of the DNA in the samples, end repair, and linker ligation in a single step. We provide guidance on primer and linker design that reduces some of the common artifacts. We also discuss how to identify and remove some of the common artifacts, including the products of PCR mispriming and PCR recombination, that have appeared in some published studies. Our improved bioinformatics pipeline rapidly parses the sequencing data and identifies bona fide integration sites in clonally expanded cells, producing an Excel-formatted report that can be used for additional data processing. 
CONCLUSIONS: We provide a detailed protocol that reduces the prevalence of artifacts that arise in the analysis of retroviral integration site data generated from in vivo samples and a bioinformatics pipeline that is able to remove the artifacts that remain. |
format | Online Article Text |
id | pubmed-7063773 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-70637732020-03-13 An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses Wells, Daria W. Guo, Shuang Shao, Wei Bale, Michael J. Coffin, John M. Hughes, Stephen H. Wu, Xiaolin BMC Genomics Methodology Article BACKGROUND: All retroviruses, including human immunodeficiency virus (HIV), must integrate a DNA copy of their genomes into the genome of the infected host cell to replicate. Although integrated retroviral DNA, known as a provirus, can be found at many sites in the host genome, integration is not random. The adaption of linker-mediated PCR (LM-PCR) protocols for high-throughput integration site mapping, using randomly-sheared genomic DNA and Illumina paired-end sequencing, has dramatically increased the number of mapped integration sites. Analysis of samples from human donors has shown that there is clonal expansion of HIV infected cells and that clonal expansion makes an important contribution to HIV persistence. However, analysis of HIV integration sites in samples taken from patients requires extensive PCR amplification and high-throughput sequencing, which makes the methodology prone to certain specific artifacts. RESULTS: To address the problems with artifacts, we use a comprehensive approach involving experimental procedures linked to a bioinformatics analysis pipeline. Using this combined approach, we are able to reduce the number of PCR/sequencing artifacts that arise and identify the ones that remain. Our streamlined workflow combines random cleavage of the DNA in the samples, end repair, and linker ligation in a single step. We provide guidance on primer and linker design that reduces some of the common artifacts. We also discuss how to identify and remove some of the common artifacts, including the products of PCR mispriming and PCR recombination, that have appeared in some published studies. 
Our improved bioinformatics pipeline rapidly parses the sequencing data and identifies bona fide integration sites in clonally expanded cells, producing an Excel-formatted report that can be used for additional data processing. CONCLUSIONS: We provide a detailed protocol that reduces the prevalence of artifacts that arise in the analysis of retroviral integration site data generated from in vivo samples and a bioinformatics pipeline that is able to remove the artifacts that remain. BioMed Central 2020-03-09 /pmc/articles/PMC7063773/ /pubmed/32151239 http://dx.doi.org/10.1186/s12864-020-6647-4 Text en © The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Methodology Article Wells, Daria W. Guo, Shuang Shao, Wei Bale, Michael J. Coffin, John M. Hughes, Stephen H. Wu, Xiaolin An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses |
title | An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses |
title_full | An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses |
title_fullStr | An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses |
title_full_unstemmed | An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses |
title_short | An analytical pipeline for identifying and mapping the integration sites of HIV and other retroviruses |
title_sort | analytical pipeline for identifying and mapping the integration sites of hiv and other retroviruses |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7063773/ https://www.ncbi.nlm.nih.gov/pubmed/32151239 http://dx.doi.org/10.1186/s12864-020-6647-4 |