
Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data

BACKGROUND: Infection with feline immunodeficiency virus (FIV) causes an immunosuppressive disease whose consequences are less severe if cats are co-infected with an attenuated FIV strain (PLV). We use virus diversity measurements, which reflect replication ability and the virus response to various...


Bibliographic Details
Main authors: Liu, Yang; Chiaromonte, Francesca; Ross, Howard; Malhotra, Raunaq; Elleder, Daniel; Poss, Mary
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4486422/
https://www.ncbi.nlm.nih.gov/pubmed/26123018
http://dx.doi.org/10.1186/s12859-015-0607-z
_version_ 1782378888522891264
author Liu, Yang
Chiaromonte, Francesca
Ross, Howard
Malhotra, Raunaq
Elleder, Daniel
Poss, Mary
author_facet Liu, Yang
Chiaromonte, Francesca
Ross, Howard
Malhotra, Raunaq
Elleder, Daniel
Poss, Mary
author_sort Liu, Yang
collection PubMed
description BACKGROUND: Infection with feline immunodeficiency virus (FIV) causes an immunosuppressive disease whose consequences are less severe if cats are co-infected with an attenuated FIV strain (PLV). We use virus diversity measurements, which reflect replication ability and the virus response to various conditions, to test whether diversity of virulent FIV in lymphoid tissues is altered in the presence of PLV. Our data consisted of the 3′ half of the FIV genome from three tissues of animals infected with FIV alone, or with FIV and PLV, sequenced by 454 technology. RESULTS: Since rare variants dominate virus populations, we had to carefully distinguish sequence variation from errors due to experimental protocols and sequencing. We considered an exponential-normal convolution model used for background correction of microarray data, and modified it to formulate an error correction approach for minor allele frequencies derived from high-throughput sequencing. Similar to accounting for over-dispersion in counts, this accounts for error-inflated variability in frequencies, and quite effectively reproduces empirically observed distributions. After obtaining error-corrected minor allele frequencies, we applied ANalysis Of VAriance (ANOVA) based on a linear mixed model and found that conserved sites and transition frequencies in FIV genes differ among tissues of dually and singly infected cats. Furthermore, analysis of minor allele frequencies at individual FIV genome sites revealed 242 sites significantly affected by infection status (dual vs. single) or infection status by tissue interaction. Altogether, our results demonstrated a decrease in FIV diversity in bone marrow in the presence of PLV. Importantly, these effects were weakened or undetectable when error correction was performed with other approaches (thresholding of minor allele frequencies; probabilistic clustering of reads). We also queried the data for cytidine deaminase activity on the viral genome, which causes an asymmetric increase in G to A substitutions, but found no evidence for this host defense strategy. CONCLUSIONS: Our error correction approach for minor allele frequencies (more sensitive and computationally efficient than other algorithms) and our statistical treatment of variation (ANOVA) were critical for effective use of high-throughput sequencing data in understanding viral diversity. We found that co-infection with PLV shifts FIV diversity from bone marrow to lymph node and spleen. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0607-z) contains supplementary material, which is available to authorized users.
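The exponential-normal convolution mentioned in the description can be illustrated with the standard form used for microarray background correction. The sketch below is only that standard form, not the authors' modified version, whose details are not given in this record. It assumes an observed value is the sum of an exponentially distributed signal (rate `alpha`) and normally distributed noise (mean `mu`, standard deviation `sigma`), and returns the expected signal given the observation. The function name and all parameter values are hypothetical, chosen for illustration.

```python
import math


def _normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_signal(observed, mu, sigma, alpha):
    """Expected signal given an observation under the model
    observed = signal + noise, with signal ~ Exponential(rate=alpha)
    and noise ~ Normal(mu, sigma^2).

    This is the textbook microarray background-correction formula;
    mu, sigma and alpha are assumed known here (hypothetical inputs),
    whereas in practice they would be estimated from the data.
    """
    a = observed - mu - sigma * sigma * alpha
    b = sigma
    numerator = _normal_pdf(a / b) - _normal_pdf((observed - a) / b)
    denominator = _normal_cdf(a / b) + _normal_cdf((observed - a) / b) - 1.0
    return a + b * numerator / denominator


if __name__ == "__main__":
    # Hypothetical parameters: noise centred at 0.002 with sd 0.001,
    # signal with mean 1/alpha = 0.02.
    for maf in (0.001, 0.005, 0.01):
        print(maf, expected_signal(maf, mu=0.002, sigma=0.001, alpha=50.0))
```

Under this model the corrected value always lies between zero and the observed value, so low frequencies close to the noise level are shrunk strongly, while larger frequencies are reduced by roughly the noise mean.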
format Online
Article
Text
id pubmed-4486422
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4486422 2015-07-02 Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data Liu, Yang Chiaromonte, Francesca Ross, Howard Malhotra, Raunaq Elleder, Daniel Poss, Mary BMC Bioinformatics Research Article BACKGROUND: Infection with feline immunodeficiency virus (FIV) causes an immunosuppressive disease whose consequences are less severe if cats are co-infected with an attenuated FIV strain (PLV). We use virus diversity measurements, which reflect replication ability and the virus response to various conditions, to test whether diversity of virulent FIV in lymphoid tissues is altered in the presence of PLV. Our data consisted of the 3′ half of the FIV genome from three tissues of animals infected with FIV alone, or with FIV and PLV, sequenced by 454 technology. RESULTS: Since rare variants dominate virus populations, we had to carefully distinguish sequence variation from errors due to experimental protocols and sequencing. We considered an exponential-normal convolution model used for background correction of microarray data, and modified it to formulate an error correction approach for minor allele frequencies derived from high-throughput sequencing. Similar to accounting for over-dispersion in counts, this accounts for error-inflated variability in frequencies, and quite effectively reproduces empirically observed distributions. After obtaining error-corrected minor allele frequencies, we applied ANalysis Of VAriance (ANOVA) based on a linear mixed model and found that conserved sites and transition frequencies in FIV genes differ among tissues of dually and singly infected cats. Furthermore, analysis of minor allele frequencies at individual FIV genome sites revealed 242 sites significantly affected by infection status (dual vs. single) or infection status by tissue interaction. Altogether, our results demonstrated a decrease in FIV diversity in bone marrow in the presence of PLV. Importantly, these effects were weakened or undetectable when error correction was performed with other approaches (thresholding of minor allele frequencies; probabilistic clustering of reads). We also queried the data for cytidine deaminase activity on the viral genome, which causes an asymmetric increase in G to A substitutions, but found no evidence for this host defense strategy. CONCLUSIONS: Our error correction approach for minor allele frequencies (more sensitive and computationally efficient than other algorithms) and our statistical treatment of variation (ANOVA) were critical for effective use of high-throughput sequencing data in understanding viral diversity. We found that co-infection with PLV shifts FIV diversity from bone marrow to lymph node and spleen. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0607-z) contains supplementary material, which is available to authorized users. BioMed Central 2015-06-30 /pmc/articles/PMC4486422/ /pubmed/26123018 http://dx.doi.org/10.1186/s12859-015-0607-z Text en © Liu et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Liu, Yang
Chiaromonte, Francesca
Ross, Howard
Malhotra, Raunaq
Elleder, Daniel
Poss, Mary
Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data
title Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data
title_full Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data
title_fullStr Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data
title_full_unstemmed Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data
title_short Error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data
title_sort error correction and statistical analyses for intra-host comparisons of feline immunodeficiency virus diversity from high-throughput sequencing data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4486422/
https://www.ncbi.nlm.nih.gov/pubmed/26123018
http://dx.doi.org/10.1186/s12859-015-0607-z