
Significantly improving the quality of genome assemblies through curation

Genome sequence assemblies provide the basis for our understanding of biology. Generating error-free assemblies is therefore the ultimate, but sadly still unachieved goal of a multitude of research projects. Despite the ever-advancing improvements in data generation, assembly algorithms and pipeline...


Bibliographic Details
Main authors: Howe, Kerstin; Chow, William; Collins, Joanna; Pelan, Sarah; Pointon, Damon-Lee; Sims, Ying; Torrance, James; Tracey, Alan; Wood, Jonathan
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7794651/
https://www.ncbi.nlm.nih.gov/pubmed/33420778
http://dx.doi.org/10.1093/gigascience/giaa153
_version_ 1783634259863404544
author Howe, Kerstin
Chow, William
Collins, Joanna
Pelan, Sarah
Pointon, Damon-Lee
Sims, Ying
Torrance, James
Tracey, Alan
Wood, Jonathan
author_sort Howe, Kerstin
collection PubMed
description Genome sequence assemblies provide the basis for our understanding of biology. Generating error-free assemblies is therefore the ultimate, but sadly still unachieved goal of a multitude of research projects. Despite the ever-advancing improvements in data generation, assembly algorithms and pipelines, no automated approach has so far reliably generated near error-free genome assemblies for eukaryotes. Whilst working towards improved datasets and fully automated pipelines, assembly evaluation and curation is actively used to bridge this shortcoming and significantly reduce the number of assembly errors. In addition to this increase in product value, the insights gained from assembly curation are fed back into the automated assembly strategy and contribute to notable improvements in genome assembly quality. We describe our tried and tested approach for assembly curation using gEVAL, the genome evaluation browser. We outline the procedures applied to genome curation using gEVAL and also our recommendations for assembly curation in a gEVAL-independent context to facilitate the uptake of genome curation in the wider community.
format Online
Article
Text
id pubmed-7794651
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7794651 2021-01-13 Significantly improving the quality of genome assemblies through curation. Howe, Kerstin; Chow, William; Collins, Joanna; Pelan, Sarah; Pointon, Damon-Lee; Sims, Ying; Torrance, James; Tracey, Alan; Wood, Jonathan. Gigascience. Review. Oxford University Press 2021-01-09 /pmc/articles/PMC7794651/ /pubmed/33420778 http://dx.doi.org/10.1093/gigascience/giaa153 Text en © The Author(s) 2021. Published by Oxford University Press GigaScience. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Significantly improving the quality of genome assemblies through curation
topic Review