A fast machine-learning-guided primer design pipeline for selective whole genome amplification

Bibliographic Details
Main authors: Dwivedi-Yu, Jane A.; Oppler, Zachary J.; Mitchell, Matthew W.; Song, Yun S.; Brisson, Dustin
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10138271/
https://www.ncbi.nlm.nih.gov/pubmed/37068103
http://dx.doi.org/10.1371/journal.pcbi.1010137
author Dwivedi-Yu, Jane A.
Oppler, Zachary J.
Mitchell, Matthew W.
Song, Yun S.
Brisson, Dustin
collection PubMed
description Addressing many of the major outstanding questions in the fields of microbial evolution and pathogenesis will require analyses of populations of microbial genomes. Although population genomic studies provide the analytical resolution to investigate evolutionary and mechanistic processes at fine spatial and temporal scales (precisely the scales at which these processes occur), microbial population genomic research is currently hindered by the practicalities of obtaining sufficient quantities of the relatively pure microbial genomic DNA necessary for next-generation sequencing. Here we present swga2.0, an optimized and parallelized pipeline to design selective whole genome amplification (SWGA) primer sets. Unlike previous methods, swga2.0 incorporates active and machine learning methods to evaluate the amplification efficacy of individual primers and primer sets. Additionally, swga2.0 optimizes primer set search and evaluation strategies, including parallelization at each stage of the pipeline, to dramatically decrease program runtime. Here we describe the swga2.0 pipeline, including the empirical data used to identify primer and primer set characteristics that improve amplification performance. Additionally, we evaluate the novel swga2.0 pipeline by designing primer sets that successfully amplify Prevotella melaninogenica, an important component of the lung microbiome in cystic fibrosis patients, from samples dominated by human DNA.
format Online
Article
Text
id pubmed-10138271
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10138271 2023-04-28 A fast machine-learning-guided primer design pipeline for selective whole genome amplification. Dwivedi-Yu, Jane A.; Oppler, Zachary J.; Mitchell, Matthew W.; Song, Yun S.; Brisson, Dustin. PLoS Comput Biol. Research Article.
Public Library of Science 2023-04-17 /pmc/articles/PMC10138271/ /pubmed/37068103 http://dx.doi.org/10.1371/journal.pcbi.1010137 Text en © 2023 Dwivedi-Yu et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A fast machine-learning-guided primer design pipeline for selective whole genome amplification
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10138271/
https://www.ncbi.nlm.nih.gov/pubmed/37068103
http://dx.doi.org/10.1371/journal.pcbi.1010137