Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations
The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer match...
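Main authors: | Marinier, Eric; Zaheer, Rahat; Berry, Chrystal; Weedmark, Kelly A.; Domaratzki, Michael; Mabon, Philip; Knox, Natalie C.; Reimer, Aleisha R.; Graham, Morag R.; Chui, Linda; Patterson-Fortin, Laura; Zhang, Jian; Pagotto, Franco; Farber, Jeff; Mahony, Jim; Seyer, Karine; Bekal, Sadjia; Tremblay, Cécile; Isaac-Renton, Judy; Prystajecky, Natalie; Chen, Jessica; Slade, Peter; Van Domselaar, Gary |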
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2017 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737611/ https://www.ncbi.nlm.nih.gov/pubmed/29048594 http://dx.doi.org/10.1093/nar/gkx702 |
_version_ | 1783287546781892608 |
---|---|
author | Marinier, Eric Zaheer, Rahat Berry, Chrystal Weedmark, Kelly A. Domaratzki, Michael Mabon, Philip Knox, Natalie C. Reimer, Aleisha R. Graham, Morag R. Chui, Linda Patterson-Fortin, Laura Zhang, Jian Pagotto, Franco Farber, Jeff Mahony, Jim Seyer, Karine Bekal, Sadjia Tremblay, Cécile Isaac-Renton, Judy Prystajecky, Natalie Chen, Jessica Slade, Peter Van Domselaar, Gary |
author_facet | Marinier, Eric Zaheer, Rahat Berry, Chrystal Weedmark, Kelly A. Domaratzki, Michael Mabon, Philip Knox, Natalie C. Reimer, Aleisha R. Graham, Morag R. Chui, Linda Patterson-Fortin, Laura Zhang, Jian Pagotto, Franco Farber, Jeff Mahony, Jim Seyer, Karine Bekal, Sadjia Tremblay, Cécile Isaac-Renton, Judy Prystajecky, Natalie Chen, Jessica Slade, Peter Van Domselaar, Gary |
author_sort | Marinier, Eric |
collection | PubMed |
description | The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. |
format | Online Article Text |
id | pubmed-5737611 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-57376112018-01-04 Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations Marinier, Eric Zaheer, Rahat Berry, Chrystal Weedmark, Kelly A. Domaratzki, Michael Mabon, Philip Knox, Natalie C. Reimer, Aleisha R. Graham, Morag R. Chui, Linda Patterson-Fortin, Laura Zhang, Jian Pagotto, Franco Farber, Jeff Mahony, Jim Seyer, Karine Bekal, Sadjia Tremblay, Cécile Isaac-Renton, Judy Prystajecky, Natalie Chen, Jessica Slade, Peter Van Domselaar, Gary Nucleic Acids Res Methods Online The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using ‘big data’ approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune’s loci discovery process identifies sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets using probabilistic models. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. Tests on simulated and real datasets showed that Neptune rapidly identifies regions that are both sensitive and specific. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, yet will particularly benefit pathogenomic applications, owing to efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune. Oxford University Press 2017-10-13 2017-08-17 /pmc/articles/PMC5737611/ /pubmed/29048594 http://dx.doi.org/10.1093/nar/gkx702 Text en © The Author(s) 2017. 
Published by Oxford University Press on behalf of Nucleic Acids Research. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Online Marinier, Eric Zaheer, Rahat Berry, Chrystal Weedmark, Kelly A. Domaratzki, Michael Mabon, Philip Knox, Natalie C. Reimer, Aleisha R. Graham, Morag R. Chui, Linda Patterson-Fortin, Laura Zhang, Jian Pagotto, Franco Farber, Jeff Mahony, Jim Seyer, Karine Bekal, Sadjia Tremblay, Cécile Isaac-Renton, Judy Prystajecky, Natalie Chen, Jessica Slade, Peter Van Domselaar, Gary Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations |
title | Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations |
title_full | Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations |
title_fullStr | Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations |
title_full_unstemmed | Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations |
title_short | Neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations |
title_sort | neptune: a bioinformatics tool for rapid discovery of genomic variation in bacterial populations |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5737611/ https://www.ncbi.nlm.nih.gov/pubmed/29048594 http://dx.doi.org/10.1093/nar/gkx702 |
work_keys_str_mv | AT mariniereric neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT zaheerrahat neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT berrychrystal neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT weedmarkkellya neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT domaratzkimichael neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT mabonphilip neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT knoxnataliec neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT reimeraleishar neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT grahammoragr neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT chuilinda neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT pattersonfortinlaura neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT zhangjian neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT pagottofranco neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT farberjeff neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT mahonyjim neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT seyerkarine neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT bekalsadjia neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT tremblaycecile neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT isaacrentonjudy neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT prystajeckynatalie 
neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT chenjessica neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT sladepeter neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations AT vandomselaargary neptuneabioinformaticstoolforrapiddiscoveryofgenomicvariationinbacterialpopulations |
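
The description field above summarizes Neptune as finding sequences that are common to a target group and absent from non-targets through exact k-mer matching. The following is a toy Python sketch of that general idea only, not Neptune's actual implementation: the function names `kmers` and `signature_kmers` and the `min_target_fraction` parameter are hypothetical, and the sketch leaves out the mismatch tolerance, probabilistic models, and parallel execution that the description mentions.

```python
from typing import Dict, Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_kmers(
    targets: List[str],
    non_targets: Iterable[str],
    k: int,
    min_target_fraction: float = 1.0,  # hypothetical threshold, not a Neptune option
) -> Set[str]:
    """Return k-mers present in enough targets and in no non-target."""
    # Count how many target sequences contain each k-mer (exact matches only).
    counts: Dict[str, int] = {}
    for seq in targets:
        for kmer in kmers(seq, k):
            counts[kmer] = counts.get(kmer, 0) + 1

    # Collect every k-mer observed in any non-target sequence.
    excluded: Set[str] = set()
    for seq in non_targets:
        excluded |= kmers(seq, k)

    # Keep k-mers that are sufficiently common in targets and absent elsewhere.
    needed = min_target_fraction * len(targets)
    return {
        kmer for kmer, n in counts.items()
        if n >= needed and kmer not in excluded
    }


if __name__ == "__main__":
    targets = ["ACGTACGGA", "TTACGGAC"]
    non_targets = ["ACGTACGT"]
    print(sorted(signature_kmers(targets, non_targets, k=4)))
```

In this toy input, the two target strings share the 4-mers `TACG`, `ACGG`, and `CGGA`; `TACG` also occurs in the non-target string, so only `ACGG` and `CGGA` remain. A real tool would additionally have to assemble such k-mers into loci and tolerate sequence variation, which this sketch does not attempt.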