
Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations

The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer match...
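The exact k-mer matching idea named in the summary can be illustrated with a small sketch. The record's full description speaks of sequences "sufficiently common to a group of target sequences and sufficiently absent from non-targets"; the code below approximates that with two fixed fraction thresholds. This is only an assumption-laden illustration, not Neptune's implementation: the function names `kmers` and `differential_kmers` and both threshold parameters are hypothetical, and the sketch ignores reverse complements, k-mer mismatches, and the probabilistic models the description mentions.

```python
from typing import Iterable, Set


def kmers(sequence: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of a DNA sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def differential_kmers(
    targets: Iterable[str],
    non_targets: Iterable[str],
    k: int,
    min_target_fraction: float = 1.0,      # hypothetical threshold
    max_non_target_fraction: float = 0.0,  # hypothetical threshold
) -> Set[str]:
    """Select k-mers common among targets and rare among non-targets.

    Uses exact set membership only; fixed thresholds stand in for
    the probabilistic models described in the abstract.
    """
    target_sets = [kmers(s, k) for s in targets]
    non_target_sets = [kmers(s, k) for s in non_targets]

    selected = set()
    # Only k-mers seen in at least one target can qualify.
    for kmer in set().union(*target_sets):
        target_fraction = sum(kmer in s for s in target_sets) / len(target_sets)
        if non_target_sets:
            non_target_fraction = (
                sum(kmer in s for s in non_target_sets) / len(non_target_sets)
            )
        else:
            non_target_fraction = 0.0
        if (target_fraction >= min_target_fraction
                and non_target_fraction <= max_non_target_fraction):
            selected.add(kmer)
    return selected


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    result = differential_kmers(
        targets=["ACGTACGGA", "TTACGGATC"],
        non_targets=["ACGTACGTT"],
        k=4,
    )
    print(sorted(result))
```

With the toy input, the k-mers shared by both targets are TACG, ACGG and CGGA; TACG also occurs in the non-target, so only ACGG and CGGA remain. Set lookups keep each membership test constant-time on average, which is the general appeal of exact k-mer matching over alignment.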


Bibliographic Details
Main authors: Marinier, Eric, Zaheer, Rahat, Berry, Chrystal, Weedmark, Kelly A., Domaratzki, Michael, Mabon, Philip, Knox, Natalie C., Reimer, Aleisha R., Graham, Morag R., Chui, Linda, Patterson-Fortin, Laura, Zhang, Jian, Pagotto, Franco, Farber, Jeff, Mahony, Jim, Seyer, Karine, Bekal, Sadjia, Tremblay, Cécile, Isaac-Renton, Judy, Prystajecky, Natalie, Chen, Jessica, Slade, Peter, Van Domselaar, Gary
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects: Methods Online
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737611/
https://www.ncbi.nlm.nih.gov/pubmed/29048594
http://dx.doi.org/10.1093/nar/gkx702
author Marinier, Eric
Zaheer, Rahat
Berry, Chrystal
Weedmark, Kelly A.
Domaratzki, Michael
Mabon, Philip
Knox, Natalie C.
Reimer, Aleisha R.
Graham, Morag R.
Chui, Linda
Patterson-Fortin, Laura
Zhang, Jian
Pagotto, Franco
Farber, Jeff
Mahony, Jim
Seyer, Karine
Bekal, Sadjia
Tremblay, Cécile
Isaac-Renton, Judy
Prystajecky, Natalie
Chen, Jessica
Slade, Peter
Van Domselaar, Gary
collection PubMed
description The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune.
format Online
Article
Text
id pubmed-5737611
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5737611 2018-01-04 Nucleic Acids Res Methods Online Oxford University Press 2017-10-13 2017-08-17 /pmc/articles/PMC5737611/ /pubmed/29048594 http://dx.doi.org/10.1093/nar/gkx702 Text en © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737611/
https://www.ncbi.nlm.nih.gov/pubmed/29048594
http://dx.doi.org/10.1093/nar/gkx702