
A computational genomics pipeline for prokaryotic sequencing projects

Motivation: New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presen...


Bibliographic Details
Main Authors: Kislyuk, Andrey O.; Katz, Lee S.; Agrawal, Sonia; Hagen, Matthew S.; Conley, Andrew B.; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C.; Sammons, Scott A.; Govil, Dhwani; Mair, Raydel D.; Tatti, Kathleen M.; Tondella, Maria L.; Harcourt, Brian H.; Mayer, Leonard W.; Jordan, I. King
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2905547/
https://www.ncbi.nlm.nih.gov/pubmed/20519285
http://dx.doi.org/10.1093/bioinformatics/btq284
author Kislyuk, Andrey O.
Katz, Lee S.
Agrawal, Sonia
Hagen, Matthew S.
Conley, Andrew B.
Jayaraman, Pushkala
Nelakuditi, Viswateja
Humphrey, Jay C.
Sammons, Scott A.
Govil, Dhwani
Mair, Raydel D.
Tatti, Kathleen M.
Tondella, Maria L.
Harcourt, Brian H.
Mayer, Leonard W.
Jordan, I. King
collection PubMed
description Motivation: New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. Results: We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. Availability and implementation: The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems. Contact: king.jordan@biology.gatech.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format Text
id pubmed-2905547
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2905547 2010-07-19 A computational genomics pipeline for prokaryotic sequencing projects Bioinformatics Original Papers Oxford University Press 2010-08-01 2010-06-02 /pmc/articles/PMC2905547/ /pubmed/20519285 http://dx.doi.org/10.1093/bioinformatics/btq284 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A computational genomics pipeline for prokaryotic sequencing projects
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2905547/
https://www.ncbi.nlm.nih.gov/pubmed/20519285
http://dx.doi.org/10.1093/bioinformatics/btq284