Revisiting avian ‘missing’ genes from de novo assembled transcripts

Bibliographic Details
Main authors: Yin, Zhong-Tao, Zhu, Feng, Lin, Fang-Bin, Jia, Ting, Wang, Zhen, Sun, Dong-Ting, Li, Guang-Shen, Zhang, Cheng-Lin, Smith, Jacqueline, Yang, Ning, Hou, Zhuo-Cheng
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6321700/
https://www.ncbi.nlm.nih.gov/pubmed/30611188
http://dx.doi.org/10.1186/s12864-018-5407-1
Description
Summary: BACKGROUND: It remains debated whether birds have lost genes relative to mammals and non-avian vertebrates during speciation. High-quality reference gene sets are necessary for precisely evaluating gene gain and loss. It is essential to explore new reference transcripts from large-scale de novo assembled transcriptomes to recover potential hidden genes in avian genomes. RESULTS: We explored 196 high-quality transcriptomic datasets from five bird species to reconstruct transcripts for the purpose of discovering potential hidden genes in avian genomes. We constructed a relatively complete, high-quality bird transcript database (1,623,045 transcripts after quality control in five birds) from a large amount of avian transcriptomic data, and found that most of the presumed missing genes (83.2%) could be recovered in at least one bird species. Most of these genes have been identified for the first time in birds. Our results demonstrate that 67.94% of the genes have GC content over 50%, while 2.91% of the genes are AT-rich (AT% > 60%). In our results, 239 (53.59%) genes had a tissue-specific expression index of more than 0.9 in chicken. The missing genes also have lower Ka/Ks values than average (genome-wide: Ka/Ks = 0.99; missing genes: Ka/Ks = 0.90; t-test P = 1.25E-14). Among all presumed missing genes, there were 135 for which we did not find any meaningful orthologues in any of the five species studied. CONCLUSION: Insufficient reference genome quality is the major reason for wrongly inferring missing genes in birds. These presumed missing genes often have a very strong tissue-specific expression pattern. We show that multi-tissue transcriptomic data from various species are necessary for inferring gene family evolution in species with only draft reference genomes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-5407-1) contains supplementary material, which is available to authorized users.
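
Note: The summary reports two per-gene statistics, base composition (GC content over 50%, AT% > 60%) and a tissue-specific expression index (values above 0.9). The sketch below illustrates how such quantities might be computed; it is not the authors' pipeline. The summary does not name the specificity index, so the widely used tau index is assumed here, and the function names `gc_fraction`, `composition_class` and `tau_index` are hypothetical.

```python
from typing import Sequence


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases of a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / total


def composition_class(seq: str) -> str:
    """Label a sequence using the two thresholds quoted in the summary."""
    gc = gc_fraction(seq)
    seq = seq.upper()
    n_at = seq.count("A") + seq.count("T")
    n_gc = seq.count("G") + seq.count("C")
    at = n_at / (n_at + n_gc)
    if gc > 0.5:
        return "GC>50%"
    if at > 0.6:
        return "AT-rich"
    return "intermediate"


def tau_index(expression: Sequence[float]) -> float:
    """Tau tissue-specificity index (an assumed choice of index).

    tau = sum(1 - x_i / x_max) / (n - 1), where x_i is the expression level
    in tissue i. 0 means uniform expression; 1 means a single tissue.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        return 0.0  # gene not expressed anywhere: treated as non-specific
    return sum(1 - x / x_max for x in expression) / (n - 1)
```

Under these assumptions, a gene expressed in only one of several tissues has a tau of 1, which would exceed the 0.9 threshold quoted in the summary, while a gene expressed equally everywhere has a tau of 0.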