Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal
BACKGROUND: Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the s...
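Main authors: | Etherington, Graham J; Heavens, Darren; Baker, David; Lister, Ashleigh; McNelly, Rose; Garcia, Gonzalo; Clavijo, Bernardo; Macaulay, Iain; Haerty, Wilfried; Di Palma, Federica |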
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7216774/ https://www.ncbi.nlm.nih.gov/pubmed/32396200 http://dx.doi.org/10.1093/gigascience/giaa045 |
_version_ | 1783532479361056768 |
---|---|
author | Etherington, Graham J Heavens, Darren Baker, David Lister, Ashleigh McNelly, Rose Garcia, Gonzalo Clavijo, Bernardo Macaulay, Iain Haerty, Wilfried Di Palma, Federica |
author_facet | Etherington, Graham J Heavens, Darren Baker, David Lister, Ashleigh McNelly, Rose Garcia, Gonzalo Clavijo, Bernardo Macaulay, Iain Haerty, Wilfried Di Palma, Federica |
author_sort | Etherington, Graham J |
collection | PubMed |
description | BACKGROUND: Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and of low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on high-quality high molecular weight DNA. However, funding is often insufficient for many independent research groups to use these techniques. Here we use a range of different genomic technologies generated from a roadkill European polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses. RESULTS: Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only). We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies. CONCLUSIONS: The high degree of variability between each de novo assembly method (assessed from the 7 key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value for money when sequencing genomes. |
format | Online Article Text |
id | pubmed-7216774 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7216774 2020-05-15 Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal Etherington, Graham J Heavens, Darren Baker, David Lister, Ashleigh McNelly, Rose Garcia, Gonzalo Clavijo, Bernardo Macaulay, Iain Haerty, Wilfried Di Palma, Federica Gigascience Research BACKGROUND: Whilst much sequencing effort has focused on key mammalian model organisms such as mouse and human, little is known about the relationship between genome sequencing techniques for non-model mammals and genome assembly quality. This is especially relevant to non-model mammals, where the samples to be sequenced are often degraded and of low quality. A key aspect when planning a genome project is the choice of sequencing data to generate. This decision is driven by several factors, including the biological questions being asked, the quality of DNA available, and the availability of funds. Cutting-edge sequencing technologies now make it possible to achieve highly contiguous, chromosome-level genome assemblies, but rely on high-quality high molecular weight DNA. However, funding is often insufficient for many independent research groups to use these techniques. Here we use a range of different genomic technologies generated from a roadkill European polecat (Mustela putorius) to assess various assembly techniques on this low-quality sample. We evaluated different approaches for de novo assemblies and discuss their value in relation to biological analyses. RESULTS: Generally, assemblies containing more data types achieved better scores in our ranking system. However, when accounting for misassemblies, this was not always the case for Bionano and low-coverage 10x Genomics (for scaffolding only). We also find that the extra cost associated with combining multiple data types is not necessarily associated with better genome assemblies. CONCLUSIONS: The high degree of variability between each de novo assembly method (assessed from the 7 key metrics) highlights the importance of carefully devising the sequencing strategy to be able to carry out the desired analysis. Adding more data to genome assemblies does not always result in better assemblies, so it is important to understand the nuances of genomic data integration explained here, in order to obtain cost-effective value for money when sequencing genomes. Oxford University Press 2020-05-12 /pmc/articles/PMC7216774/ /pubmed/32396200 http://dx.doi.org/10.1093/gigascience/giaa045 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Etherington, Graham J Heavens, Darren Baker, David Lister, Ashleigh McNelly, Rose Garcia, Gonzalo Clavijo, Bernardo Macaulay, Iain Haerty, Wilfried Di Palma, Federica Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal |
title | Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal |
title_full | Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal |
title_fullStr | Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal |
title_full_unstemmed | Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal |
title_short | Sequencing smart: De novo sequencing and assembly approaches for a non-model mammal |
title_sort | sequencing smart: de novo sequencing and assembly approaches for a non-model mammal |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7216774/ https://www.ncbi.nlm.nih.gov/pubmed/32396200 http://dx.doi.org/10.1093/gigascience/giaa045 |