
Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data

Accurate reference genome sequences provide the foundation for modern molecular biology and genomics as the interpretation of sequence data to study evolution, gene expression, and epigenetics depends heavily on the quality of the genome assembly used for its alignment. Correctly organising sequence...


Bibliographic Details
Main Authors: Hills, Mark; Falconer, Ester; O’Neill, Kieran; Sanders, Ashley D.; Howe, Kerstin; Guryev, Victor; Lansdorp, Peter M.
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8037727/
https://www.ncbi.nlm.nih.gov/pubmed/33807210
http://dx.doi.org/10.3390/ijms22073617
_version_ 1783677211193114624
author Hills, Mark
Falconer, Ester
O’Neill, Kieran
Sanders, Ashley D.
Howe, Kerstin
Guryev, Victor
Lansdorp, Peter M.
author_facet Hills, Mark
Falconer, Ester
O’Neill, Kieran
Sanders, Ashley D.
Howe, Kerstin
Guryev, Victor
Lansdorp, Peter M.
author_sort Hills, Mark
collection PubMed
description Accurate reference genome sequences provide the foundation for modern molecular biology and genomics as the interpretation of sequence data to study evolution, gene expression, and epigenetics depends heavily on the quality of the genome assembly used for its alignment. Correctly organising sequenced fragments such as contigs and scaffolds in relation to each other is a critical and often challenging step in the construction of robust genome references. We previously identified misoriented regions in the mouse and human reference assemblies using Strand-seq, a single cell sequencing technique that preserves DNA directionality. Here we demonstrate the ability of Strand-seq to build and correct full-length chromosomes by identifying which scaffolds belong to the same chromosome and determining their correct order and orientation, without the need for overlapping sequences. We demonstrate that Strand-seq exquisitely maps assembly fragments into large related groups and chromosome-sized clusters without using new assembly data. Using template strand inheritance as a bi-allelic marker, we employ genetic mapping principles to cluster scaffolds that are derived from the same chromosome and order them within the chromosome based solely on directionality of DNA strand inheritance. We prove the utility of our approach by generating improved genome assemblies for several model organisms including the ferret, pig, Xenopus, zebrafish, Tasmanian devil and the Guinea pig.
format Online
Article
Text
id pubmed-8037727
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8037727 2021-04-12 Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data Hills, Mark Falconer, Ester O’Neill, Kieran Sanders, Ashley D. Howe, Kerstin Guryev, Victor Lansdorp, Peter M. Int J Mol Sci Article Accurate reference genome sequences provide the foundation for modern molecular biology and genomics as the interpretation of sequence data to study evolution, gene expression, and epigenetics depends heavily on the quality of the genome assembly used for its alignment. Correctly organising sequenced fragments such as contigs and scaffolds in relation to each other is a critical and often challenging step in the construction of robust genome references. We previously identified misoriented regions in the mouse and human reference assemblies using Strand-seq, a single cell sequencing technique that preserves DNA directionality. Here we demonstrate the ability of Strand-seq to build and correct full-length chromosomes by identifying which scaffolds belong to the same chromosome and determining their correct order and orientation, without the need for overlapping sequences. We demonstrate that Strand-seq exquisitely maps assembly fragments into large related groups and chromosome-sized clusters without using new assembly data. Using template strand inheritance as a bi-allelic marker, we employ genetic mapping principles to cluster scaffolds that are derived from the same chromosome and order them within the chromosome based solely on directionality of DNA strand inheritance. We prove the utility of our approach by generating improved genome assemblies for several model organisms including the ferret, pig, Xenopus, zebrafish, Tasmanian devil and the Guinea pig. MDPI 2021-03-31 /pmc/articles/PMC8037727/ /pubmed/33807210 http://dx.doi.org/10.3390/ijms22073617 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Hills, Mark
Falconer, Ester
O’Neill, Kieran
Sanders, Ashley D.
Howe, Kerstin
Guryev, Victor
Lansdorp, Peter M.
Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data
title Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data
title_full Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data
title_fullStr Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data
title_full_unstemmed Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data
title_short Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data
title_sort construction of whole genomes from scaffolds using single cell strand-seq data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8037727/
https://www.ncbi.nlm.nih.gov/pubmed/33807210
http://dx.doi.org/10.3390/ijms22073617
work_keys_str_mv AT hillsmark constructionofwholegenomesfromscaffoldsusingsinglecellstrandseqdata
AT falconerester constructionofwholegenomesfromscaffoldsusingsinglecellstrandseqdata
AT oneillkieran constructionofwholegenomesfromscaffoldsusingsinglecellstrandseqdata
AT sandersashleyd constructionofwholegenomesfromscaffoldsusingsinglecellstrandseqdata
AT howekerstin constructionofwholegenomesfromscaffoldsusingsinglecellstrandseqdata
AT guryevvictor constructionofwholegenomesfromscaffoldsusingsinglecellstrandseqdata
AT lansdorppeterm constructionofwholegenomesfromscaffoldsusingsinglecellstrandseqdata