Refined annotation and assembly of the Tetrahymena thermophila genome sequence through EST analysis, comparative genomic hybridization, and targeted gap closure

BACKGROUND: Tetrahymena thermophila, a widely studied model for cellular and molecular biology, is a binucleated single-celled organism with a germline micronucleus (MIC) and somatic macronucleus (MAC). The recent draft MAC genome assembly revealed low sequence repetitiveness, a result of the epigen...

Bibliographic Details
Main Authors: Coyne, Robert S, Thiagarajan, Mathangi, Jones, Kristie M, Wortman, Jennifer R, Tallon, Luke J, Haas, Brian J, Cassidy-Hanley, Donna M, Wiley, Emily A, Smith, Joshua J, Collins, Kathleen, Lee, Suzanne R, Couvillion, Mary T, Liu, Yifan, Garg, Jyoti, Pearlman, Ronald E, Hamilton, Eileen P, Orias, Eduardo, Eisen, Jonathan A, Methé, Barbara A
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2612030/
https://www.ncbi.nlm.nih.gov/pubmed/19036158
http://dx.doi.org/10.1186/1471-2164-9-562
_version_ 1782163118212775936
author Coyne, Robert S
Thiagarajan, Mathangi
Jones, Kristie M
Wortman, Jennifer R
Tallon, Luke J
Haas, Brian J
Cassidy-Hanley, Donna M
Wiley, Emily A
Smith, Joshua J
Collins, Kathleen
Lee, Suzanne R
Couvillion, Mary T
Liu, Yifan
Garg, Jyoti
Pearlman, Ronald E
Hamilton, Eileen P
Orias, Eduardo
Eisen, Jonathan A
Methé, Barbara A
author_sort Coyne, Robert S
collection PubMed
description BACKGROUND: Tetrahymena thermophila, a widely studied model for cellular and molecular biology, is a binucleated single-celled organism with a germline micronucleus (MIC) and somatic macronucleus (MAC). The recent draft MAC genome assembly revealed low sequence repetitiveness, a result of the epigenetic removal of invasive DNA elements found only in the MIC genome. Such low repetitiveness makes complete closure of the MAC genome a feasible goal, which to achieve would require standard closure methods as well as removal of minor MIC contamination of the MAC genome assembly. Highly accurate preliminary annotation of Tetrahymena's coding potential was hindered by the lack of both comparative genomic sequence information from close relatives and significant amounts of cDNA evidence, thus limiting the value of the genomic information and also leaving unanswered certain questions, such as the frequency of alternative splicing.
RESULTS: We addressed the problem of MIC contamination using comparative genomic hybridization with purified MIC and MAC DNA probes against a whole genome oligonucleotide microarray, allowing the identification of 763 genome scaffolds likely to contain MIC-limited DNA sequences. We also employed standard genome closure methods to essentially finish over 60% of the MAC genome. For the improvement of annotation, we have sequenced and analyzed over 60,000 verified EST reads from a variety of cellular growth and development conditions. Using this EST evidence, a combination of automated and manual reannotation efforts led to updates that affect 16% of the current protein-coding gene models. By comparing EST abundance, many genes showing apparent differential expression between these conditions were identified. Rare instances of alternative splicing and uses of the non-standard amino acid selenocysteine were also identified.
CONCLUSION: We report here significant progress in genome closure and reannotation of Tetrahymena thermophila. Our experience to date suggests that complete closure of the MAC genome is attainable. Using the new EST evidence, automated and manual curation has resulted in substantial improvements to the over 24,000 gene models, which will be valuable to researchers studying this model organism as well as for comparative genomics purposes.
format Text
id pubmed-2612030
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2612030 2008-12-30 BMC Genomics Research Article BioMed Central 2008-11-26 /pmc/articles/PMC2612030/ /pubmed/19036158 http://dx.doi.org/10.1186/1471-2164-9-562 Text en Copyright © 2008 Coyne et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Refined annotation and assembly of the Tetrahymena thermophila genome sequence through EST analysis, comparative genomic hybridization, and targeted gap closure
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2612030/
https://www.ncbi.nlm.nih.gov/pubmed/19036158
http://dx.doi.org/10.1186/1471-2164-9-562