
Ab initio identification of transcription start sites in the Rhesus macaque genome by histone modification and RNA-Seq

Rhesus macaque is a widely used primate model organism. Its genome annotations are however still largely comparative computational predictions derived mainly from human genes, which precludes studies on the macaque-specific genes, gene isoforms or their regulations. Here we took advantage of histone...


Bibliographic Details
Main Authors: Liu, Yi; Han, Dali; Han, Yixing; Yan, Zheng; Xie, Bin; Li, Jing; Qiao, Nan; Hu, Haiyang; Khaitovich, Philipp; Gao, Yuan; Han, Jing-Dong J.
Format: Text
Language: English
Published: Oxford University Press 2011
Subjects: Genomics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3045608/
https://www.ncbi.nlm.nih.gov/pubmed/20952408
http://dx.doi.org/10.1093/nar/gkq956
_version_ 1782198855480115200
author Liu, Yi
Han, Dali
Han, Yixing
Yan, Zheng
Xie, Bin
Li, Jing
Qiao, Nan
Hu, Haiyang
Khaitovich, Philipp
Gao, Yuan
Han, Jing-Dong J.
author_sort Liu, Yi
collection PubMed
description Rhesus macaque is a widely used primate model organism. Its genome annotations are however still largely comparative computational predictions derived mainly from human genes, which precludes studies on the macaque-specific genes, gene isoforms or their regulations. Here we took advantage of histone H3 lysine 4 trimethylation (H3K4me3)’s ability to mark transcription start sites (TSSs) and the recently developed ChIP-Seq and RNA-Seq technology to survey the transcript structures. We generated 14 013 757 sequence tags by H3K4me3 ChIP-Seq and obtained 17 322 358 paired end reads for mRNA, and 10 698 419 short reads for sRNA from the macaque brain. By integrating these data with genomic sequence features and extending and improving a state-of-the-art TSS prediction algorithm, we ab initio predicted and verified 17 933 of previously electronically annotated TSSs at 500-bp resolution. We also predicted approximately 10 000 novel TSSs. These provide an important rich resource for close examination of the species-specific transcript structures and transcription regulations in the Rhesus macaque genome. Our approach exemplifies a relatively inexpensive way to generate a reasonably reliable TSS map for a large genome. It may serve as a guiding example for similar genome annotation efforts targeted at other model organisms.
format Text
id pubmed-3045608
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3045608 2011-02-28 Nucleic Acids Res Genomics Oxford University Press 2011-03 2010-10-14 /pmc/articles/PMC3045608/ /pubmed/20952408 http://dx.doi.org/10.1093/nar/gkq956 Text en © The Author(s) 2010. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Ab initio identification of transcription start sites in the Rhesus macaque genome by histone modification and RNA-Seq
topic Genomics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3045608/
https://www.ncbi.nlm.nih.gov/pubmed/20952408
http://dx.doi.org/10.1093/nar/gkq956