
Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal

BACKGROUND: Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the s...


Bibliographic Details
Main Authors: Etherington, Graham J, Heavens, Darren, Baker, David, Lister, Ashleigh, McNelly, Rose, Garcia, Gonzalo, Clavijo, Bernardo, Macaulay, Iain, Haerty, Wilfried, Di Palma, Federica
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7216774/
https://www.ncbi.nlm.nih.gov/pubmed/32396200
http://dx.doi.org/10.1093/gigascience/giaa045
_version_ 1783532479361056768
author Etherington, Graham J
Heavens, Darren
Baker, David
Lister, Ashleigh
McNelly, Rose
Garcia, Gonzalo
Clavijo, Bernardo
Macaulay, Iain
Haerty, Wilfried
Di Palma, Federica
author_facet Etherington, Graham J
Heavens, Darren
Baker, David
Lister, Ashleigh
McNelly, Rose
Garcia, Gonzalo
Clavijo, Bernardo
Macaulay, Iain
Haerty, Wilfried
Di Palma, Federica
author_sort Etherington, Graham J
collection PubMed
description BACKGROUND: Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and of low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on high-quality high molecular weight DNA. However, funding is often insufficient for many independent research groups to use these techniques. Here we use a range of different genomic technologies generated from a roadkill European polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses. RESULTS: Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only). We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies. CONCLUSIONS: The high degree of variability between each de novo assembly method (assessed from the 7 key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. 
Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value for money when sequencing genomes.
format Online
Article
Text
id pubmed-7216774
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-72167742020-05-15 Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal Etherington, Graham J Heavens, Darren Baker, David Lister, Ashleigh McNelly, Rose Garcia, Gonzalo Clavijo, Bernardo Macaulay, Iain Haerty, Wilfried Di Palma, Federica Gigascience Research BACKGROUND: Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and of low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on high-quality high molecular weight DNA. However, funding is often insufficient for many independent research groups to use these techniques. Here we use a range of different genomic technologies generated from a roadkill European polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses. RESULTS: Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only). We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies. 
CONCLUSIONS: The high degree of variability between each de novo assembly method (assessed from the 7 key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value for money when sequencing genomes. Oxford University Press 2020-05-12 /pmc/articles/PMC7216774/ /pubmed/32396200 http://dx.doi.org/10.1093/gigascience/giaa045 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Etherington, Graham J
Heavens, Darren
Baker, David
Lister, Ashleigh
McNelly, Rose
Garcia, Gonzalo
Clavijo, Bernardo
Macaulay, Iain
Haerty, Wilfried
Di Palma, Federica
Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal
title Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal
title_full Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal
title_fullStr Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal
title_full_unstemmed Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal
title_short Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal
title_sort sequencing smart: de novo sequencing and assembly approaches for a non-model mammal
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7216774/
https://www.ncbi.nlm.nih.gov/pubmed/32396200
http://dx.doi.org/10.1093/gigascience/giaa045
work_keys_str_mv AT etheringtongrahamj sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT heavensdarren sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT bakerdavid sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT listerashleigh sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT mcnellyrose sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT garciagonzalo sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT clavijobernardo sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT macaulayiain sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT haertywilfried sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal
AT dipalmafederica sequencingsmartdenovosequencingandassemblyapproachesforanonmodelmammal