Comparative genomic data of the Avian Phylogenomics Project
BACKGROUND: The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognat...
Main authors: | Zhang, Guojie; Li, Bo; Li, Cai; Gilbert, M Thomas P; Jarvis, Erich D; Wang, Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Data Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4322804/ https://www.ncbi.nlm.nih.gov/pubmed/25671091 http://dx.doi.org/10.1186/2047-217X-3-26 |
_version_ | 1782356445426089984 |
---|---|
author | Zhang, Guojie Li, Bo Li, Cai Gilbert, M Thomas P Jarvis, Erich D Wang, Jun |
author_facet | Zhang, Guojie Li, Bo Li, Cai Gilbert, M Thomas P Jarvis, Erich D Wang, Jun |
author_sort | Zhang, Guojie |
collection | PubMed |
description | BACKGROUND: The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. FINDINGS: The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at a low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000 to 17,000 protein coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses. CONCLUSIONS: Here we release full genome assemblies of 38 newly sequenced avian species, link genome assembly downloads for 7 of the remaining 10 species, and provide a guideline of genomic data that has been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here is expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, developmental biology, and other related areas. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2047-217X-3-26) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4322804 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-43228042015-02-11 Comparative genomic data of the Avian Phylogenomics Project Zhang, Guojie Li, Bo Li, Cai Gilbert, M Thomas P Jarvis, Erich D Wang, Jun Gigascience Data Note BACKGROUND: The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. FINDINGS: The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at a low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. 
The assembled scaffolds allowed the homology-based annotation of 13,000 to 17,000 protein coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses. CONCLUSIONS: Here we release full genome assemblies of 38 newly sequenced avian species, link genome assembly downloads for 7 of the remaining 10 species, and provide a guideline of genomic data that has been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here is expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, developmental biology, and other related areas. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2047-217X-3-26) contains supplementary material, which is available to authorized users. BioMed Central 2014-12-11 /pmc/articles/PMC4322804/ /pubmed/25671091 http://dx.doi.org/10.1186/2047-217X-3-26 Text en © Zhang et al.; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Data Note Zhang, Guojie Li, Bo Li, Cai Gilbert, M Thomas P Jarvis, Erich D Wang, Jun Comparative genomic data of the Avian Phylogenomics Project |
title | Comparative genomic data of the Avian Phylogenomics Project |
title_full | Comparative genomic data of the Avian Phylogenomics Project |
title_fullStr | Comparative genomic data of the Avian Phylogenomics Project |
title_full_unstemmed | Comparative genomic data of the Avian Phylogenomics Project |
title_short | Comparative genomic data of the Avian Phylogenomics Project |
title_sort | comparative genomic data of the avian phylogenomics project |
topic | Data Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4322804/ https://www.ncbi.nlm.nih.gov/pubmed/25671091 http://dx.doi.org/10.1186/2047-217X-3-26 |
work_keys_str_mv | AT zhangguojie comparativegenomicdataoftheavianphylogenomicsproject AT libo comparativegenomicdataoftheavianphylogenomicsproject AT licai comparativegenomicdataoftheavianphylogenomicsproject AT gilbertmthomasp comparativegenomicdataoftheavianphylogenomicsproject AT jarviserichd comparativegenomicdataoftheavianphylogenomicsproject AT wangjun comparativegenomicdataoftheavianphylogenomicsproject AT comparativegenomicdataoftheavianphylogenomicsproject |
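
The description in this record groups the assemblies by N50 scaffold size. As a reading aid, the sketch below shows one common way to compute N50 from a list of scaffold lengths: sort the scaffolds from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are hypothetical and are not taken from the article; the authors' own tooling is not described in this record.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is taken here as the length of the scaffold at which the
    cumulative sum, counted from the longest scaffold down, first
    reaches at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")

    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (lengths in bp), for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
example_n50 = n50(example_lengths)
print(example_n50)
```

For the toy input the total is 290 bp, so the running sum passes the halfway point (145 bp) at the second scaffold and the reported N50 is 70. Applied to a real assembly, the lengths would come from the scaffold FASTA file rather than a hand-written list.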