Toward a more holistic method of genome assembly assessment
BACKGROUND: A key use of high throughput sequencing technology is the sequencing and assembly of full genome sequences. These genome assemblies are commonly assessed using statistics relating to contiguity of the assembly. Measures of contiguity are not strongly correlated with information about the...
Main Authors: | Thrash, Adam; Hoffmann, Federico; Perkins, Andy |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336394/ https://www.ncbi.nlm.nih.gov/pubmed/32631298 http://dx.doi.org/10.1186/s12859-020-3382-4 |
_version_ | 1783554309894438912 |
---|---|
author | Thrash, Adam Hoffmann, Federico Perkins, Andy |
author_facet | Thrash, Adam Hoffmann, Federico Perkins, Andy |
author_sort | Thrash, Adam |
collection | PubMed |
description | BACKGROUND: A key use of high throughput sequencing technology is the sequencing and assembly of full genome sequences. These genome assemblies are commonly assessed using statistics relating to contiguity of the assembly. Measures of contiguity are not strongly correlated with information about the biological completion or correctness of the assembly, and a commonly reported metric, N50, can be misleading. Over the years, multiple research groups have rejected the overuse of N50 and sought to develop more informative metrics. RESULTS: This paper presents a review of problems that arise from relying solely on contiguity as a measure of genome assembly quality as well as current alternative methods. Alternative methods are compared on the basis of how informative they are about the biological quality of the assembly and how easy they are to use. A comprehensive method for using multiple metrics of measuring assembly quality is presented. CONCLUSIONS: This study aims to report on the status of assembly assessment methods and compare them, as well as to offer a comprehensive method that incorporates multiple facets of quality assessment. Weaknesses and strengths of varying methods are presented and explained, with recommendations based on speed of analysis and user friendliness. |
format | Online Article Text |
id | pubmed-7336394 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-7336394 2020-07-07 Toward a more holistic method of genome assembly assessment Thrash, Adam Hoffmann, Federico Perkins, Andy BMC Bioinformatics Research
BioMed Central 2020-07-06 /pmc/articles/PMC7336394/ /pubmed/32631298 http://dx.doi.org/10.1186/s12859-020-3382-4 Text en © The Author(s) 2020 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Thrash, Adam Hoffmann, Federico Perkins, Andy Toward a more holistic method of genome assembly assessment |
title | Toward a more holistic method of genome assembly assessment |
title_full | Toward a more holistic method of genome assembly assessment |
title_fullStr | Toward a more holistic method of genome assembly assessment |
title_full_unstemmed | Toward a more holistic method of genome assembly assessment |
title_short | Toward a more holistic method of genome assembly assessment |
title_sort | toward a more holistic method of genome assembly assessment |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7336394/ https://www.ncbi.nlm.nih.gov/pubmed/32631298 http://dx.doi.org/10.1186/s12859-020-3382-4 |
work_keys_str_mv | AT thrashadam towardamoreholisticmethodofgenomeassemblyassessment AT hoffmannfederico towardamoreholisticmethodofgenomeassemblyassessment AT perkinsandy towardamoreholisticmethodofgenomeassemblyassessment |
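
The description above notes that N50 is commonly reported and can be misleading. As a rough illustration only (this is not code from the article, and the function name `n50` and the contig lengths are made up for the example), N50 is usually computed as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The sketch below also shows one way the number can mislead: discarding short contigs raises N50 even though no sequence has been improved.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    Hypothetical helper for illustration: sorts contigs from longest to
    shortest and returns the length at which the running total first
    reaches half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Invented toy assembly: three longer contigs plus eight short ones.
full_assembly = [100, 60, 40] + [10] * 8

# The same assembly after simply discarding every contig shorter than 20.
filtered_assembly = [n for n in full_assembly if n >= 20]

print(n50(full_assembly))      # 60
print(n50(filtered_assembly))  # 100
```

In this toy case the filtered assembly contains strictly less sequence, yet its N50 is higher, which is the kind of behavior that motivates pairing contiguity with other quality metrics.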