BlobToolKit – Interactive Quality Assessment of Genome Assemblies

Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the asse...

Bibliographic Details
Main authors: Challis, Richard; Richards, Edward; Rajan, Jeena; Cochrane, Guy; Blaxter, Mark
Format: Online Article Text
Language: English
Published: Genetics Society of America 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7144090/
https://www.ncbi.nlm.nih.gov/pubmed/32071071
http://dx.doi.org/10.1534/g3.119.400908
author Challis, Richard
Richards, Edward
Rajan, Jeena
Cochrane, Guy
Blaxter, Mark
collection PubMed
description Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the assembly process, the final assembled genome may be a mixture of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of underlying problems. We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility. We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Data Collaboration and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then maintain currency with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.
format Online
Article
Text
id pubmed-7144090
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-7144090 2020-04-14 BlobToolKit – Interactive Quality Assessment of Genome Assemblies Challis, Richard Richards, Edward Rajan, Jeena Cochrane, Guy Blaxter, Mark G3 (Bethesda) Investigations
Genetics Society of America 2020-02-18 /pmc/articles/PMC7144090/ /pubmed/32071071 http://dx.doi.org/10.1534/g3.119.400908 Text en Copyright © 2020 Challis et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title BlobToolKit – Interactive Quality Assessment of Genome Assemblies
topic Investigations