Generation of Physical Map Contig-Specific Sequences Useful for Whole Genome Sequence Scaffolding

With the rapid advance of next-generation sequencing technologies, more and more species are being added to the list of organisms with sequenced whole genomes. However, the assembled draft genomes of many organisms consist of numerous small contigs, due to the short length of the reads generated b...

Bibliographic Details
Main authors: Jiang, Yanliang; Ninwichian, Parichart; Liu, Shikai; Zhang, Jiaren; Kucuktas, Huseyin; Sun, Fanyue; Kaltenboeck, Ludmilla; Sun, Luyang; Bao, Lisui; Liu, Zhanjiang
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3811975/
https://www.ncbi.nlm.nih.gov/pubmed/24205335
http://dx.doi.org/10.1371/journal.pone.0078872
collection PubMed
description With the rapid advance of next-generation sequencing technologies, more and more species are being added to the list of organisms with sequenced whole genomes. However, the assembled draft genomes of many organisms consist of numerous small contigs because of the short length of the reads generated by next-generation sequencing platforms. To improve the assembly and bring the genome contigs together, more genome resources are needed. In this study, we developed a strategy to generate a valuable genome resource, physical map contig-specific sequences, which are randomly distributed genome sequences within each physical contig. A two-dimensional tagging method was used to create specific tags for 1,824 physical contigs, which dramatically reduced the cost. A total of 94,111,841 100-bp reads and 315,277 assembled contigs were identified as containing physical map contig-specific tags. The physical map contig-specific sequences, along with the currently available BAC end sequences, were then used to anchor the catfish draft genome contigs. A total of 156,457 genome contigs (~79% of the whole genome sequencing assembly) were anchored and grouped into 1,824 pools, in which 16,680 unique genes were annotated. The physical map contig-specific sequences are valuable resources for linking the physical map, the genetic linkage map, and the draft whole genome sequence; they can therefore improve whole genome sequence assembly and scaffolding as well as genome-wide comparative analysis. The strategy developed in this study could also be adopted for other species whose whole genome assembly remains a challenge.
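The description does not spell out the tagging layout. As a minimal sketch, assuming "two-dimensional tagging" means placing the physical contigs on a row-by-column grid so that each contig is identified by one row tag plus one column tag, the saving in the number of tags can be illustrated as follows. The function names, the tag labels (`R1`, `C1`), the contig names, and the 38 x 48 grid are hypothetical choices for illustration (38 x 48 happens to equal 1,824); none of them is taken from this record.

```python
def assign_tags(contig_ids, n_rows, n_cols):
    """Give each contig a (row_tag, col_tag) pair on an n_rows x n_cols grid.

    Assumption: this grid layout is an illustrative reading of
    "two-dimensional tagging", not the authors' documented protocol.
    """
    contig_ids = list(contig_ids)
    if len(contig_ids) > n_rows * n_cols:
        raise ValueError("grid is too small for the number of contigs")
    tags = {}
    for index, contig in enumerate(contig_ids):
        row, col = divmod(index, n_cols)  # fill the grid row by row
        tags[contig] = (f"R{row + 1}", f"C{col + 1}")
    return tags


def distinct_tag_count(tags):
    """Count the distinct row tags and column tags actually used."""
    rows = {row for row, _ in tags.values()}
    cols = {col for _, col in tags.values()}
    return len(rows) + len(cols)


# Hypothetical contig names; a 38 x 48 grid has 1,824 positions.
contigs = [f"contig_{i}" for i in range(1, 1825)]
tags = assign_tags(contigs, n_rows=38, n_cols=48)

# Reverse lookup: a (row_tag, col_tag) pair seen on a read -> its contig.
lookup = {pair: contig for contig, pair in tags.items()}
```

Under this assumption, 86 tags (38 row tags plus 48 column tags) are enough to distinguish all 1,824 contigs, compared with 1,824 tags if every contig were tagged individually.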
format Online
Article
Text
id pubmed-3811975
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3811975 2013-11-07 PLoS One Research Article
Public Library of Science 2013-10-24 /pmc/articles/PMC3811975/ /pubmed/24205335 http://dx.doi.org/10.1371/journal.pone.0078872 Text en © 2013 Jiang et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
topic Research Article