
Toward a more holistic method of genome assembly assessment

BACKGROUND: A key use of high throughput sequencing technology is the sequencing and assembly of full genome sequences. These genome assemblies are commonly assessed using statistics relating to contiguity of the assembly. Measures of contiguity are not strongly correlated with information about the...


Bibliographic Details
Main Authors: Thrash, Adam; Hoffmann, Federico; Perkins, Andy
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336394/
https://www.ncbi.nlm.nih.gov/pubmed/32631298
http://dx.doi.org/10.1186/s12859-020-3382-4
author Thrash, Adam
Hoffmann, Federico
Perkins, Andy
collection PubMed
description BACKGROUND: A key use of high throughput sequencing technology is the sequencing and assembly of full genome sequences. These genome assemblies are commonly assessed using statistics relating to contiguity of the assembly. Measures of contiguity are not strongly correlated with information about the biological completion or correctness of the assembly, and a commonly reported metric, N50, can be misleading. Over the years, multiple research groups have rejected the overuse of N50 and sought to develop more informative metrics. RESULTS: This paper presents a review of problems that arise from relying solely on contiguity as a measure of genome assembly quality as well as current alternative methods. Alternative methods are compared on the basis of how informative they are about the biological quality of the assembly and how easy they are to use. A comprehensive method for using multiple metrics of measuring assembly quality is presented. CONCLUSIONS: This study aims to report on the status of assembly assessment methods and compare them, as well as to offer a comprehensive method that incorporates multiple facets of quality assessment. Weaknesses and strengths of varying methods are presented and explained, with recommendations based on speed of analysis and user friendliness.
format Online
Article
Text
id pubmed-7336394
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7336394 2020-07-07 Toward a more holistic method of genome assembly assessment Thrash, Adam Hoffmann, Federico Perkins, Andy BMC Bioinformatics Research
BioMed Central 2020-07-06 /pmc/articles/PMC7336394/ /pubmed/32631298 http://dx.doi.org/10.1186/s12859-020-3382-4 Text en © The Author(s) 2020 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Toward a more holistic method of genome assembly assessment
topic Research