A computational genomics pipeline for prokaryotic sequencing projects
Motivation: New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presen...
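Main authors: | Kislyuk, Andrey O.; Katz, Lee S.; Agrawal, Sonia; Hagen, Matthew S.; Conley, Andrew B.; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C.; Sammons, Scott A.; Govil, Dhwani; Mair, Raydel D.; Tatti, Kathleen M.; Tondella, Maria L.; Harcourt, Brian H.; Mayer, Leonard W.; Jordan, I. King |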
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2905547/ https://www.ncbi.nlm.nih.gov/pubmed/20519285 http://dx.doi.org/10.1093/bioinformatics/btq284 |
_version_ | 1782183972509319168 |
---|---|
author | Kislyuk, Andrey O. Katz, Lee S. Agrawal, Sonia Hagen, Matthew S. Conley, Andrew B. Jayaraman, Pushkala Nelakuditi, Viswateja Humphrey, Jay C. Sammons, Scott A. Govil, Dhwani Mair, Raydel D. Tatti, Kathleen M. Tondella, Maria L. Harcourt, Brian H. Mayer, Leonard W. Jordan, I. King |
author_facet | Kislyuk, Andrey O. Katz, Lee S. Agrawal, Sonia Hagen, Matthew S. Conley, Andrew B. Jayaraman, Pushkala Nelakuditi, Viswateja Humphrey, Jay C. Sammons, Scott A. Govil, Dhwani Mair, Raydel D. Tatti, Kathleen M. Tondella, Maria L. Harcourt, Brian H. Mayer, Leonard W. Jordan, I. King |
author_sort | Kislyuk, Andrey O. |
collection | PubMed |
description | Motivation: New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. Results: We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. Availability and implementation: The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems. Contact: king.jordan@biology.gatech.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Text |
id | pubmed-2905547 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2905547 2010-07-19 A computational genomics pipeline for prokaryotic sequencing projects Kislyuk, Andrey O. Katz, Lee S. Agrawal, Sonia Hagen, Matthew S. Conley, Andrew B. Jayaraman, Pushkala Nelakuditi, Viswateja Humphrey, Jay C. Sammons, Scott A. Govil, Dhwani Mair, Raydel D. Tatti, Kathleen M. Tondella, Maria L. Harcourt, Brian H. Mayer, Leonard W. Jordan, I. King Bioinformatics Original Papers Motivation: New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. Results: We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. Availability and implementation: The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems. Contact: king.jordan@biology.gatech.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2010-08-01 2010-06-02 /pmc/articles/PMC2905547/ /pubmed/20519285 http://dx.doi.org/10.1093/bioinformatics/btq284 Text en © The Author(s) 2010. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Kislyuk, Andrey O. Katz, Lee S. Agrawal, Sonia Hagen, Matthew S. Conley, Andrew B. Jayaraman, Pushkala Nelakuditi, Viswateja Humphrey, Jay C. Sammons, Scott A. Govil, Dhwani Mair, Raydel D. Tatti, Kathleen M. Tondella, Maria L. Harcourt, Brian H. Mayer, Leonard W. Jordan, I. King A computational genomics pipeline for prokaryotic sequencing projects |
title | A computational genomics pipeline for prokaryotic sequencing projects |
title_full | A computational genomics pipeline for prokaryotic sequencing projects |
title_fullStr | A computational genomics pipeline for prokaryotic sequencing projects |
title_full_unstemmed | A computational genomics pipeline for prokaryotic sequencing projects |
title_short | A computational genomics pipeline for prokaryotic sequencing projects |
title_sort | computational genomics pipeline for prokaryotic sequencing projects |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2905547/ https://www.ncbi.nlm.nih.gov/pubmed/20519285 http://dx.doi.org/10.1093/bioinformatics/btq284 |