
How Long Are Long Tandem Repeats? A Challenge for Current Methods of Whole-Genome Sequence Assembly: The Case of Satellites in Caenorhabditis elegans

Repetitive genome regions have been difficult to sequence, mainly because of the comparatively small size of the fragments used in assembly. Satellites or tandem repeats are very abundant in nematodes and offer an excellent playground to evaluate different assembly methods. Here, we compare the stru...


Bibliographic Details
Main Authors: Subirana, Juan A., Messeguer, Xavier
Format: Online Article Text
Language: English
Published: MDPI 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6210790/
https://www.ncbi.nlm.nih.gov/pubmed/30332836
http://dx.doi.org/10.3390/genes9100500
_version_ 1783367197915086848
author Subirana, Juan A.
Messeguer, Xavier
collection PubMed
description Repetitive genome regions have been difficult to sequence, mainly because of the comparatively small size of the fragments used in assembly. Satellites or tandem repeats are very abundant in nematodes and offer an excellent playground to evaluate different assembly methods. Here, we compare the structure of satellites found in three different assemblies of the Caenorhabditis elegans genome: the original sequence obtained by Sanger sequencing, an assembly based on PacBio technology, and an assembly using Nanopore sequencing reads. In general, satellites were found in equivalent genomic regions, but the new long-read methods (PacBio and Nanopore) tended to result in longer assembled satellites. Important differences exist between the assemblies resulting from the two long-read technologies, such as the sizes of long satellites. Our results also suggest that the lengths of some annotated genes with internal repeats which were assembled using Sanger sequencing are likely to be incorrect.
format Online
Article
Text
id pubmed-6210790
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6210790 2018-11-02 How Long Are Long Tandem Repeats? A Challenge for Current Methods of Whole-Genome Sequence Assembly: The Case of Satellites in Caenorhabditis elegans Subirana, Juan A. Messeguer, Xavier Genes (Basel) Article MDPI 2018-10-16 /pmc/articles/PMC6210790/ /pubmed/30332836 http://dx.doi.org/10.3390/genes9100500 Text en © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title How Long Are Long Tandem Repeats? A Challenge for Current Methods of Whole-Genome Sequence Assembly: The Case of Satellites in Caenorhabditis elegans
topic Article