
Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences

Motivation: Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole...


Bibliographic Details
Main authors: Zhang, Jianwei; Kudrna, Dave; Mu, Ting; Li, Weiming; Copetti, Dario; Yu, Yeisoo; Goicoechea, Jose Luis; Lei, Yang; Wing, Rod A.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5048067/
https://www.ncbi.nlm.nih.gov/pubmed/27318200
http://dx.doi.org/10.1093/bioinformatics/btw370
_version_ 1782457530720452608
author Zhang, Jianwei
Kudrna, Dave
Mu, Ting
Li, Weiming
Copetti, Dario
Yu, Yeisoo
Goicoechea, Jose Luis
Lei, Yang
Wing, Rod A.
author_facet Zhang, Jianwei
Kudrna, Dave
Mu, Ting
Li, Weiming
Copetti, Dario
Yu, Yeisoo
Goicoechea, Jose Luis
Lei, Yang
Wing, Rod A.
author_sort Zhang, Jianwei
collection PubMed
description Motivation: Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool—Genome Puzzle Master (GPM)—that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. Results: With GPM, loaded datasets can be connected to each other via their logical relationships which accomplishes tasks to ‘group,’ ‘merge,’ ‘order and orient’ sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. Availability and Implementation: The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-5048067
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-50480672016-10-05 Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences Zhang, Jianwei Kudrna, Dave Mu, Ting Li, Weiming Copetti, Dario Yu, Yeisoo Goicoechea, Jose Luis Lei, Yang Wing, Rod A. Bioinformatics Original Paper Motivation: Next generation sequencing technologies have revolutionized our ability to rapidly and affordably generate vast quantities of sequence data. Once generated, raw sequences are assembled into contigs or scaffolds. However, these assemblies are mostly fragmented and inaccurate at the whole genome scale, largely due to the inability to integrate additional informative datasets (e.g. physical, optical and genetic maps). To address this problem, we developed a semi-automated software tool—Genome Puzzle Master (GPM)—that enables the integration of additional genomic signposts to edit and build ‘new-gen-assemblies’ that result in high-quality ‘annotation-ready’ pseudomolecules. Results: With GPM, loaded datasets can be connected to each other via their logical relationships which accomplishes tasks to ‘group,’ ‘merge,’ ‘order and orient’ sequences in a draft assembly. Manual editing can also be performed with a user-friendly graphical interface. Final pseudomolecules reflect a user’s total data package and are available for long-term project management. GPM is a web-based pipeline and an important part of a Laboratory Information Management System (LIMS) which can be easily deployed on local servers for any genome research laboratory. Availability and Implementation: The GPM (with LIMS) package is available at https://github.com/Jianwei-Zhang/LIMS Contacts: jzhang@mail.hzau.edu.cn or rwing@mail.arizona.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2016-10-15 2016-06-17 /pmc/articles/PMC5048067/ /pubmed/27318200 http://dx.doi.org/10.1093/bioinformatics/btw370 Text en © The Author 2016. 
Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Original Paper
Zhang, Jianwei
Kudrna, Dave
Mu, Ting
Li, Weiming
Copetti, Dario
Yu, Yeisoo
Goicoechea, Jose Luis
Lei, Yang
Wing, Rod A.
Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences
title Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences
title_full Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences
title_fullStr Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences
title_full_unstemmed Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences
title_short Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences
title_sort genome puzzle master (gpm): an integrated pipeline for building and editing pseudomolecules from fragmented sequences
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5048067/
https://www.ncbi.nlm.nih.gov/pubmed/27318200
http://dx.doi.org/10.1093/bioinformatics/btw370
work_keys_str_mv AT zhangjianwei genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences
AT kudrnadave genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences
AT muting genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences
AT liweiming genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences
AT copettidario genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences
AT yuyeisoo genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences
AT goicoecheajoseluis genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences
AT leiyang genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences
AT wingroda genomepuzzlemastergpmanintegratedpipelineforbuildingandeditingpseudomoleculesfromfragmentedsequences