
Experimental Analysis of Sources of Error in Evolutionary Studies Based on Roche/454 Pyrosequencing of Viral Genomes

Factors affecting the reliability of Roche/454 pyrosequencing for analyzing sequence polymorphism in within-host viral populations were assessed by two experiments: 1) sequencing four clonal simian immunodeficiency virus (SIV) stocks and 2) sequencing mixtures in different proportions of two SIV str...


Bibliographic Details
Main authors: Becker, Ericka A., Burns, Charles M., León, Enrique J., Rajabojan, Saravanan, Friedman, Robert, Friedrich, Thomas C., O'Connor, Shelby L., Hughes, Austin L.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3342875/
https://www.ncbi.nlm.nih.gov/pubmed/22436995
http://dx.doi.org/10.1093/gbe/evs029
author Becker, Ericka A.
Burns, Charles M.
León, Enrique J.
Rajabojan, Saravanan
Friedman, Robert
Friedrich, Thomas C.
O'Connor, Shelby L.
Hughes, Austin L.
author_sort Becker, Ericka A.
collection PubMed
description Factors affecting the reliability of Roche/454 pyrosequencing for analyzing sequence polymorphism in within-host viral populations were assessed by two experiments: 1) sequencing four clonal simian immunodeficiency virus (SIV) stocks and 2) sequencing mixtures in different proportions of two SIV strains with known fixed nucleotide differences. Observed nucleotide diversity and frequency of undetermined nucleotides were increased at sites in homopolymer runs of four or more identical nucleotides, particularly at AT sites. However, in the mixed-strain experiments, the effects on estimated nucleotide diversity of such errors were small in comparison to known strain differences. The results suggest that biologically meaningful variants present at a frequency of around 10% and possibly much lower are easily distinguished from artifacts of the sequencing process. Analysis of the clonal stocks revealed numerous rare variants that showed the signature of purifying selection and that elimination of variants at frequencies of less than 1% reduced estimates of nucleotide diversity by about an order of magnitude. Thus, using a 1% frequency cutoff for accepting a variant as real represents a conservative standard, which may be useful in studies that are focused on the discovery of specific mutations (such as those conferring immune escape or drug resistance). On the other hand, if the goal is to estimate nucleotide diversity, an optimal strategy might be to include all observed variants (even those at less than 1% frequency), while masking out homopolymer runs of four or more nucleotides.
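The strategy suggested at the end of the abstract (keep all observed variants, but mask homopolymer runs of four or more nucleotides before estimating nucleotide diversity) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function names, the per-site count dictionaries standing in for an alignment pileup, and the choice of the pairwise-difference estimator are all assumptions made for the example.

```python
from itertools import groupby


def homopolymer_mask(reference, min_run=4):
    """Return one boolean per site: True where the site lies inside a run
    of `min_run` or more identical nucleotides in the reference."""
    mask = []
    for _, group in groupby(reference.upper()):
        run_length = len(list(group))
        mask.extend([run_length >= min_run] * run_length)
    return mask


def site_diversity(counts, min_freq=0.0):
    """Per-site diversity as the probability that two reads drawn without
    replacement differ. `counts` maps base -> read count (hypothetical
    pileup format). Bases below `min_freq` are dropped first."""
    total = sum(counts.values())
    if total < 2:
        return 0.0
    kept = {b: c for b, c in counts.items() if c / total >= min_freq}
    n = sum(kept.values())
    if n < 2:
        return 0.0
    identical_pairs = sum(c * (c - 1) for c in kept.values())
    return 1.0 - identical_pairs / (n * (n - 1))


def mean_diversity(reference, pileup, min_freq=0.0, min_run=4):
    """Average per-site diversity over sites outside masked homopolymer runs."""
    mask = homopolymer_mask(reference, min_run)
    values = [
        site_diversity(counts, min_freq)
        for counts, masked in zip(pileup, mask)
        if not masked
    ]
    return sum(values) / len(values) if values else 0.0
```

Calling `mean_diversity(reference, pileup)` with the default `min_freq=0.0` corresponds to the "include all variants, mask homopolymers" option, while `min_freq=0.01` corresponds to the conservative 1% cutoff mentioned for mutation discovery.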
format Online
Article
Text
id pubmed-3342875
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3342875 2012-05-04 Experimental Analysis of Sources of Error in Evolutionary Studies Based on Roche/454 Pyrosequencing of Viral Genomes Becker, Ericka A. Burns, Charles M. León, Enrique J. Rajabojan, Saravanan Friedman, Robert Friedrich, Thomas C. O'Connor, Shelby L. Hughes, Austin L. Genome Biol Evol Research Articles
Oxford University Press 2012 2012-03-20 /pmc/articles/PMC3342875/ /pubmed/22436995 http://dx.doi.org/10.1093/gbe/evs029 Text en © The Author(s) 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Experimental Analysis of Sources of Error in Evolutionary Studies Based on Roche/454 Pyrosequencing of Viral Genomes
topic Research Articles