Detecting rare structural variation in evolving microbial populations from new sequence junctions using breseq

New mutations leading to structural variation (SV) in genomes—in the form of mobile element insertions, large deletions, gene duplications, and other chromosomal rearrangements—can play a key role in microbial evolution. Yet, SV is considerably more difficult to predict from short-read genome resequ...

Bibliographic Details
Main authors: Deatherage, Daniel E., Traverse, Charles C., Wolf, Lindsey N., Barrick, Jeffrey E.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4301190/
https://www.ncbi.nlm.nih.gov/pubmed/25653667
http://dx.doi.org/10.3389/fgene.2014.00468
author Deatherage, Daniel E.
Traverse, Charles C.
Wolf, Lindsey N.
Barrick, Jeffrey E.
collection PubMed
description New mutations leading to structural variation (SV) in genomes—in the form of mobile element insertions, large deletions, gene duplications, and other chromosomal rearrangements—can play a key role in microbial evolution. Yet, SV is considerably more difficult to predict from short-read genome resequencing data than single-nucleotide substitutions and indels (SN), so it is not yet routinely identified in studies that profile population-level genetic diversity over time in evolution experiments. We implemented an algorithm for detecting polymorphic SV as part of the breseq computational pipeline. This procedure examines split-read alignments, in which the two ends of a single sequencing read match disjoint locations in the reference genome, in order to detect structural variants and estimate their frequencies within a sample. We tested our algorithm using simulated Escherichia coli data and then applied it to 500- and 1000-generation population samples from the Lenski E. coli long-term evolution experiment (LTEE). Knowledge of genes that are targets of selection in the LTEE and mutations present in previously analyzed clonal isolates allowed us to evaluate the accuracy of our procedure. Overall, SV accounted for ~25% of the genetic diversity found in these samples. By profiling rare SV, we were able to identify many cases where alternative mutations in key genes transiently competed within a single population. We also found, unexpectedly, that mutations in two genes that rose to prominence at these early time points always went extinct in the long term. Because it is not limited by the base-calling error rate of the sequencing technology, our approach for identifying rare SV in whole-population samples may have a lower detection limit than similar predictions of SNs in these data sets. 
We anticipate that this functionality of breseq will be useful for providing a more complete picture of genome dynamics during evolution experiments with haploid microorganisms.
format Online
Article
Text
id pubmed-4301190
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-43011902015-02-04 Detecting rare structural variation in evolving microbial populations from new sequence junctions using breseq Deatherage, Daniel E. Traverse, Charles C. Wolf, Lindsey N. Barrick, Jeffrey E. Front Genet Genetics New mutations leading to structural variation (SV) in genomes—in the form of mobile element insertions, large deletions, gene duplications, and other chromosomal rearrangements—can play a key role in microbial evolution. Yet, SV is considerably more difficult to predict from short-read genome resequencing data than single-nucleotide substitutions and indels (SN), so it is not yet routinely identified in studies that profile population-level genetic diversity over time in evolution experiments. We implemented an algorithm for detecting polymorphic SV as part of the breseq computational pipeline. This procedure examines split-read alignments, in which the two ends of a single sequencing read match disjoint locations in the reference genome, in order to detect structural variants and estimate their frequencies within a sample. We tested our algorithm using simulated Escherichia coli data and then applied it to 500- and 1000-generation population samples from the Lenski E. coli long-term evolution experiment (LTEE). Knowledge of genes that are targets of selection in the LTEE and mutations present in previously analyzed clonal isolates allowed us to evaluate the accuracy of our procedure. Overall, SV accounted for ~25% of the genetic diversity found in these samples. By profiling rare SV, we were able to identify many cases where alternative mutations in key genes transiently competed within a single population. We also found, unexpectedly, that mutations in two genes that rose to prominence at these early time points always went extinct in the long term. 
Because it is not limited by the base-calling error rate of the sequencing technology, our approach for identifying rare SV in whole-population samples may have a lower detection limit than similar predictions of SNs in these data sets. We anticipate that this functionality of breseq will be useful for providing a more complete picture of genome dynamics during evolution experiments with haploid microorganisms. Frontiers Media S.A. 2015-01-21 /pmc/articles/PMC4301190/ /pubmed/25653667 http://dx.doi.org/10.3389/fgene.2014.00468 Text en Copyright © 2015 Deatherage, Traverse, Wolf and Barrick. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Detecting rare structural variation in evolving microbial populations from new sequence junctions using breseq
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4301190/
https://www.ncbi.nlm.nih.gov/pubmed/25653667
http://dx.doi.org/10.3389/fgene.2014.00468