
Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs


Bibliographic Details
Main Authors: Liu, Junyang; Liu, Fang; Pan, Weihua
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10606404/
https://www.ncbi.nlm.nih.gov/pubmed/37895275
http://dx.doi.org/10.3390/genes14101926
_version_ 1785127308209684480
author Liu, Junyang
Liu, Fang
Pan, Weihua
author_facet Liu, Junyang
Liu, Fang
Pan, Weihua
author_sort Liu, Junyang
collection PubMed
description For a long time, the construction of complete reference genomes for complex eukaryotic genomes has been hindered by the limitations of sequencing technologies. Recently, the Pacific Biosciences (PacBio) HiFi data and Oxford Nanopore Technologies (ONT) Ultra-Long data, leveraging their respective advantages in accuracy and length, have provided an opportunity for generating complete chromosome sequences. Nevertheless, for the majority of genomes, the chromosome-level assemblies generated using existing methods still miss a high proportion of sequences due to losing small contigs in the step of assembly and scaffolding. To address this shortcoming, in this paper, we propose a novel method that is able to identify and fill the gaps in the chromosome-level assembly by recalling the sequences in the lost small contigs. Experimental results on both real and simulated datasets demonstrate that this method is able to improve the completeness of the chromosome-level assembly.
format Online
Article
Text
id pubmed-10606404
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10606404 2023-10-28 Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs Liu, Junyang Liu, Fang Pan, Weihua Genes (Basel) Brief Report For a long time, the construction of complete reference genomes for complex eukaryotic genomes has been hindered by the limitations of sequencing technologies. Recently, the Pacific Biosciences (PacBio) HiFi data and Oxford Nanopore Technologies (ONT) Ultra-Long data, leveraging their respective advantages in accuracy and length, have provided an opportunity for generating complete chromosome sequences. Nevertheless, for the majority of genomes, the chromosome-level assemblies generated using existing methods still miss a high proportion of sequences due to losing small contigs in the step of assembly and scaffolding. To address this shortcoming, in this paper, we propose a novel method that is able to identify and fill the gaps in the chromosome-level assembly by recalling the sequences in the lost small contigs. Experimental results on both real and simulated datasets demonstrate that this method is able to improve the completeness of the chromosome-level assembly. MDPI 2023-10-10 /pmc/articles/PMC10606404/ /pubmed/37895275 http://dx.doi.org/10.3390/genes14101926 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Brief Report
Liu, Junyang
Liu, Fang
Pan, Weihua
Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
title Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
title_full Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
title_fullStr Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
title_full_unstemmed Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
title_short Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
title_sort improving the completeness of chromosome-level assembly by recalling sequences from lost contigs
topic Brief Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10606404/
https://www.ncbi.nlm.nih.gov/pubmed/37895275
http://dx.doi.org/10.3390/genes14101926
work_keys_str_mv AT liujunyang improvingthecompletenessofchromosomelevelassemblybyrecallingsequencesfromlostcontigs
AT liufang improvingthecompletenessofchromosomelevelassemblybyrecallingsequencesfromlostcontigs
AT panweihua improvingthecompletenessofchromosomelevelassemblybyrecallingsequencesfromlostcontigs