Improving the ostrich genome assembly using optical mapping data

BACKGROUND: The ostrich (Struthio camelus) is the tallest and heaviest living bird. Ostrich meat is considered a healthy red meat, with an annual worldwide production ranging from 12,000 to 15,000 tons. As part of the avian phylogenomics project, we sequenced the ostrich genome for phylogenetic and...

Bibliographic Details
Main authors: Zhang, Jilin; Li, Cai; Zhou, Qi; Zhang, Guojie
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4427950/
https://www.ncbi.nlm.nih.gov/pubmed/25969728
http://dx.doi.org/10.1186/s13742-015-0062-9
_version_ 1782370807420289024
author Zhang, Jilin
Li, Cai
Zhou, Qi
Zhang, Guojie
author_facet Zhang, Jilin
Li, Cai
Zhou, Qi
Zhang, Guojie
author_sort Zhang, Jilin
collection PubMed
description BACKGROUND: The ostrich (Struthio camelus) is the tallest and heaviest living bird. Ostrich meat is considered a healthy red meat, with an annual worldwide production ranging from 12,000 to 15,000 tons. As part of the avian phylogenomics project, we sequenced the ostrich genome for phylogenetic and comparative genomics analyses. The initial Illumina-based assembly of this genome had a scaffold N50 of 3.59 Mb and a total size of 1.23 Gb. Since longer scaffolds are critical for many genomic analyses, particularly for chromosome-level comparative analysis, we generated optical mapping (OM) data to obtain an improved assembly. The OM technique is a non-PCR-based method for generating genome-wide restriction enzyme maps, which improve the quality of de novo genome assemblies. FINDINGS: To generate OM data, we digested the ostrich genome with KpnI, which yielded 1.99 million DNA molecules (>250 kb) and covered the genome at least 500×. The restriction patterns of these molecules were then assembled and aligned to the Illumina-based assembly to extend its sequences. This resulted in an OM assembly with a scaffold N50 of 17.71 Mb, about 5 times that of the initial assembly. The number of scaffolds covering 90% of the genome was reduced from 414 to 75, which means an average of ~3 super-scaffolds for each chromosome. Upon integrating the OM data with previously published FISH (fluorescence in situ hybridization) markers, we recovered the full PAR (pseudoautosomal region) on the ostrich Z chromosome with 4 super-scaffolds, as well as most of the degenerated regions. CONCLUSIONS: The OM data significantly improved the assembled scaffolds of the ostrich genome and facilitated chromosome evolution studies in birds. Similar strategies can be applied to other genome sequencing projects to obtain better assemblies.
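The abstract reports its gains as a scaffold N50 and as the number of scaffolds covering 90% of the genome. The sketch below shows how such statistics are conventionally computed from a list of scaffold lengths. It is an illustration only, not the authors' pipeline: the function name `nx_stats` and the toy lengths are hypothetical, and only the four figures taken from the abstract (3.59 Mb, 17.71 Mb, 414, 75) come from the record.

```python
def nx_stats(lengths, fraction):
    """Return (NX, LX) for a list of scaffold lengths.

    Scaffolds are sorted longest-first. NX is the length of the scaffold
    at which the running total first reaches `fraction` of the assembly
    size; LX is how many scaffolds were needed to get there.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    # Guard against floating-point rounding of the threshold.
    return ordered[-1], len(ordered)


# Hypothetical toy assembly (arbitrary units), not ostrich data.
toy_lengths = [90, 70, 50, 40, 30, 15, 5]
n50, l50 = nx_stats(toy_lengths, 0.5)   # N50 and scaffold count at 50%
n90, l90 = nx_stats(toy_lengths, 0.9)   # N90 and scaffold count at 90%

# Figures quoted in the abstract.
n50_fold = 17.71 / 3.59   # OM assembly N50 over initial N50
l90_fold = 414 / 75       # scaffold count at 90% coverage, before over after

print(f"toy N50={n50} (L50={l50}), toy N90={n90} (L90={l90})")
print(f"N50 improvement: {n50_fold:.2f}x, L90 reduction: {l90_fold:.2f}x")
```

With the abstract's figures, the N50 ratio comes out slightly under 5, consistent with the stated fivefold improvement, and the count of scaffolds covering 90% of the genome drops by a factor of about 5.5.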
format Online
Article
Text
id pubmed-4427950
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4427950 2015-05-13 Improving the ostrich genome assembly using optical mapping data Zhang, Jilin Li, Cai Zhou, Qi Zhang, Guojie Gigascience Data Note BioMed Central 2015-05-12 /pmc/articles/PMC4427950/ /pubmed/25969728 http://dx.doi.org/10.1186/s13742-015-0062-9 Text en © Zhang et al.; licensee BioMed Central. 2015 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Data Note
Zhang, Jilin
Li, Cai
Zhou, Qi
Zhang, Guojie
Improving the ostrich genome assembly using optical mapping data
title Improving the ostrich genome assembly using optical mapping data
title_sort improving the ostrich genome assembly using optical mapping data
topic Data Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4427950/
https://www.ncbi.nlm.nih.gov/pubmed/25969728
http://dx.doi.org/10.1186/s13742-015-0062-9
work_keys_str_mv AT zhangjilin improvingtheostrichgenomeassemblyusingopticalmappingdata
AT licai improvingtheostrichgenomeassemblyusingopticalmappingdata
AT zhouqi improvingtheostrichgenomeassemblyusingopticalmappingdata
AT zhangguojie improvingtheostrichgenomeassemblyusingopticalmappingdata