
A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica

BACKGROUND: Microsatellites or simple sequence repeats (SSRs) in expressed sequence tags (ESTs) are useful resources for genome analysis because of their abundance, functionality and polymorphism. The advent of commercial second generation sequencing machines has led to new strategies for developin...


Bibliographic Details
Main authors: Ueno, Saneyoshi, Moriguchi, Yoshinari, Uchiyama, Kentaro, Ujino-Ihara, Tokuko, Futamura, Norihiro, Sakurai, Tetsuya, Shinohara, Kenji, Tsumura, Yoshihiko
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424129/
https://www.ncbi.nlm.nih.gov/pubmed/22507374
http://dx.doi.org/10.1186/1471-2164-13-136
_version_ 1782241180710338560
author Ueno, Saneyoshi
Moriguchi, Yoshinari
Uchiyama, Kentaro
Ujino-Ihara, Tokuko
Futamura, Norihiro
Sakurai, Tetsuya
Shinohara, Kenji
Tsumura, Yoshihiko
author_sort Ueno, Saneyoshi
collection PubMed
description BACKGROUND: Microsatellites or simple sequence repeats (SSRs) in expressed sequence tags (ESTs) are useful resources for genome analysis because of their abundance, functionality and polymorphism. The advent of commercial second generation sequencing machines has led to new strategies for developing EST-SSR markers, necessitating the development of a bioinformatic framework that can keep pace with the increasing quality and quantity of sequence data produced. We describe an open scheme for analyzing ESTs and developing EST-SSR markers from reads collected by Sanger sequencing and pyrosequencing of sugi (Cryptomeria japonica). RESULTS: We collected 141,097 sequence reads by Sanger sequencing and 1,333,444 by pyrosequencing. After trimming contaminant and low-quality sequences, 118,319 Sanger and 1,201,150 pyrosequencing reads were passed to the MIRA assembler, generating 81,284 contigs that were analysed for SSRs. In total, 4,059 SSRs were found in 3,694 (4.54%) contigs, giving an SSR frequency lower than that in seven other plant species with gene indices (5.4–21.9%). The average GC content of the SSR-containing contigs was 41.55%, compared to 40.23% for all contigs. Tri-SSRs were the most common SSRs; the most common motif was AT, which was found in 655 (46.3%) di-SSRs, followed by the AAG motif, found in 342 (25.9%) tri-SSRs. Most (72.8%) tri-SSRs were in coding regions, but 55.6% of the di-SSRs were in non-coding regions; the AT motif was most abundant in 3′ untranslated regions. Gene ontology (GO) annotations showed that six GO terms were significantly overrepresented within SSR-containing contigs. Forty-four EST-SSR markers were developed from 192 primer pairs using two pipelines: read2Marker and the newly developed CMiB, which combines several open tools. Markers resulting from both pipelines showed no differences in PCR success rate and polymorphisms, but PCR success and polymorphism were significantly affected by the expected PCR product size and number of SSR repeats, respectively. EST-SSR markers exhibited less polymorphism than genomic SSRs. CONCLUSIONS: We have created a new open pipeline for developing EST-SSR markers and applied it in a comprehensive analysis of EST-SSRs and EST-SSR markers in C. japonica. The results will be useful in genomic analyses of conifers and other non-model species.
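The SSR screening and summary statistics reported in the abstract can be illustrated with a minimal Python sketch. This is not the authors' CMiB or read2Marker pipeline: the function names, the regular-expression approach, and the minimum repeat counts (6 for di-SSRs, 5 for tri-SSRs) are assumptions chosen for illustration, since the abstract does not state the thresholds used. Only the contig counts in the final line come from the abstract.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of repeats.
# The abstract does not give the values actually used by the authors.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect di-/tri-SSR in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA...
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return hits


def gc_content(seq):
    """Percentage of G and C bases in seq."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def ssr_contig_frequency(ssr_contigs, total_contigs):
    """Percentage of contigs that contain at least one SSR."""
    return round(100.0 * ssr_contigs / total_contigs, 2)


# Toy contig (invented for illustration): an (AT)8 run and an (AAG)6 run.
toy = "GGC" + "AT" * 8 + "CCGTA" + "AAG" * 6 + "TTC"
print(find_ssrs(toy))
# Contig counts taken from the abstract: 3,694 of 81,284 contigs carry SSRs.
print(ssr_contig_frequency(3694, 81284))
```

With the counts from the abstract, the last call reproduces the reported 4.54% SSR-containing contig frequency. A production screen would typically also handle longer motifs, compound SSRs and ambiguous bases, which this sketch ignores.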
format Online
Article
Text
id pubmed-3424129
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3424129 2012-08-22 A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica Ueno, Saneyoshi Moriguchi, Yoshinari Uchiyama, Kentaro Ujino-Ihara, Tokuko Futamura, Norihiro Sakurai, Tetsuya Shinohara, Kenji Tsumura, Yoshihiko BMC Genomics Research Article BioMed Central 2012-04-16 /pmc/articles/PMC3424129/ /pubmed/22507374 http://dx.doi.org/10.1186/1471-2164-13-136 Text en Copyright ©2012 Ueno et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica
topic Research Article