Cargando…
“Deep” Sequencing Accuracy and Reproducibility Using Roche/454 Technology for Inferring Co-Receptor Usage in HIV-1
Next generation, “deep”, sequencing has increasing applications both clinically and in disparate fields of research. This study investigates the accuracy and reproducibility of “deep” sequencing as applied to co-receptor prediction using the V3 loop of Human Immunodeficiency Virus-1. Despite increas...
Autores principales: | , , , , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Public Library of Science
2014
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4069016/ https://www.ncbi.nlm.nih.gov/pubmed/24959876 http://dx.doi.org/10.1371/journal.pone.0099508 |
Sumario: | Next generation, “deep”, sequencing has increasing applications both clinically and in disparate fields of research. This study investigates the accuracy and reproducibility of “deep” sequencing as applied to co-receptor prediction using the V3 loop of Human Immunodeficiency Virus-1. Despite increasing use in HIV co-receptor prediction, the accuracy and reproducibility of deep sequencing technology, and the factors which can affect it, have received only a limited level of investigation. To accomplish this, repeated deep sequencing results were generated using the Roche GS-FLX (454) from a number of sources including a non-homogeneous clinical sample (N = 47 replicates over 18 deep sequencing runs), and a large clinical cohort from the MOTIVATE and A400129 studies (N = 1521). For repeated measurements of a non-homogeneous clinical sample, increasing input copy number both decreased variance in the measured proportion of non-R5 using virus (p<<0.001 and 0.02 for single replicates and triplicates respectively) and increased measured viral diversity (p<0.001; multiple measures). Detection of sequences with a mean abundance less than 1% abundance showed a 2 fold increase in median coefficient of variation (CV) in repeated measurements of a non-homogeneous clinical sample, and a 2.7 fold increase in CV in the MOTIVATE/A400129 dataset compared to sequences with ≥1% abundance. An unexpected source of error included read position, with low accuracy reads occurring more frequently towards the edge of sequencing regions (p<<0.001). Overall, the primary source of variability was sampling error caused by low input copy number/minority species prevalence, though other sources of error including sequence intrinsic, temporal, and read-position related errors were detected. |
---|