
A Sequence Distance Graph framework for genome assembly and analysis

The Sequence Distance Graph (SDG) framework works with genome assembly graphs and raw data from paired, linked and long reads. It includes a simple deBruijn graph module, and can import graphs using the graphical fragment assembly (GFA) format. It also maps raw reads onto graphs, and provides a Pyth...


Bibliographic Details
Main authors: Yanes, Luis, Garcia Accinelli, Gonzalo, Wright, Jonathan, Ward, Ben J., Clavijo, Bernardo J.
Format: Online Article Text
Language: English
Published: F1000 Research Limited 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6833988/
https://www.ncbi.nlm.nih.gov/pubmed/31723420
http://dx.doi.org/10.12688/f1000research.20233.1
author Yanes, Luis
Garcia Accinelli, Gonzalo
Wright, Jonathan
Ward, Ben J.
Clavijo, Bernardo J.
collection PubMed
description The Sequence Distance Graph (SDG) framework works with genome assembly graphs and raw data from paired, linked and long reads. It includes a simple deBruijn graph module, and can import graphs using the graphical fragment assembly (GFA) format. It also maps raw reads onto graphs, and provides a Python application programming interface (API) to navigate the graph, access the mapped and raw data and perform interactive or scripted analyses. Its complete workspace can be dumped to and loaded from disk, decoupling mapping from analysis and supporting multi-stage pipelines. We present the design and implementation of the framework, and example analyses scaffolding a short read graph with long reads, and navigating paths in a heterozygous graph for a simulated parent-offspring trio dataset. SDG is freely available under the MIT license at https://github.com/bioinfologics/sdg
format Online
Article
Text
id pubmed-6833988
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher F1000 Research Limited
record_format MEDLINE/PubMed
spelling pubmed-68339882019-11-12 A Sequence Distance Graph framework for genome assembly and analysis Yanes, Luis Garcia Accinelli, Gonzalo Wright, Jonathan Ward, Ben J. Clavijo, Bernardo J. F1000Res Software Tool Article The Sequence Distance Graph (SDG) framework works with genome assembly graphs and raw data from paired, linked and long reads. It includes a simple deBruijn graph module, and can import graphs using the graphical fragment assembly (GFA) format. It also maps raw reads onto graphs, and provides a Python application programming interface (API) to navigate the graph, access the mapped and raw data and perform interactive or scripted analyses. Its complete workspace can be dumped to and loaded from disk, decoupling mapping from analysis and supporting multi-stage pipelines. We present the design and implementation of the framework, and example analyses scaffolding a short read graph with long reads, and navigating paths in a heterozygous graph for a simulated parent-offspring trio dataset. SDG is freely available under the MIT license at https://github.com/bioinfologics/sdg F1000 Research Limited 2019-08-23 /pmc/articles/PMC6833988/ /pubmed/31723420 http://dx.doi.org/10.12688/f1000research.20233.1 Text en Copyright: © 2019 Yanes L et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Sequence Distance Graph framework for genome assembly and analysis
topic Software Tool Article