A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica
BACKGROUND: Microsatellites or simple sequence repeats (SSRs) in expressed sequence tags (ESTs) are useful resources for genome analysis because of their abundance, functionality and polymorphism. The advent of commercial second generation sequencing machines has led to new strategies for developin...
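Main Authors: | Ueno, Saneyoshi; Moriguchi, Yoshinari; Uchiyama, Kentaro; Ujino-Ihara, Tokuko; Futamura, Norihiro; Sakurai, Tetsuya; Shinohara, Kenji; Tsumura, Yoshihiko |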
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424129/ https://www.ncbi.nlm.nih.gov/pubmed/22507374 http://dx.doi.org/10.1186/1471-2164-13-136 |
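
The abstract reports counts of di- and tri-nucleotide SSRs in assembled contigs and compares GC content between contig sets. As a rough illustration of what such a scan involves, the following minimal Python sketch finds perfect di- and tri-nucleotide repeats with a regular expression and computes GC content. It is not the read2Marker or CMiB pipeline described in the record; the function names and the minimum repeat thresholds (`MIN_REPEATS`) are assumptions chosen for illustration, since the record does not state the criteria used in the study.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumption: the
# record does not give the thresholds used by the authors).
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures a motif; \1{n,} requires at least n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip homopolymer runs such as AAAAAA matched as (AA)n or (AAA)n.
            if len(set(motif)) == 1:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


def gc_content(seq):
    """Return the percentage of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
```

Calling `find_ssrs` on each contig and tallying the motifs would give per-motif counts of the kind summarized in the abstract, although the totals depend heavily on the thresholds chosen.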
_version_ | 1782241180710338560 |
---|---|
author | Ueno, Saneyoshi Moriguchi, Yoshinari Uchiyama, Kentaro Ujino-Ihara, Tokuko Futamura, Norihiro Sakurai, Tetsuya Shinohara, Kenji Tsumura, Yoshihiko |
author_facet | Ueno, Saneyoshi Moriguchi, Yoshinari Uchiyama, Kentaro Ujino-Ihara, Tokuko Futamura, Norihiro Sakurai, Tetsuya Shinohara, Kenji Tsumura, Yoshihiko |
author_sort | Ueno, Saneyoshi |
collection | PubMed |
description | BACKGROUND: Microsatellites or simple sequence repeats (SSRs) in expressed sequence tags (ESTs) are useful resources for genome analysis because of their abundance, functionality and polymorphism. The advent of commercial second generation sequencing machines has led to new strategies for developing EST-SSR markers, necessitating the development of a bioinformatic framework that can keep pace with the increasing quality and quantity of sequence data produced. We describe an open scheme for analyzing ESTs and developing EST-SSR markers from reads collected by Sanger sequencing and pyrosequencing of sugi (Cryptomeria japonica). RESULTS: We collected 141,097 sequence reads by Sanger sequencing and 1,333,444 by pyrosequencing. After trimming contaminant and low-quality sequences, 118,319 Sanger and 1,201,150 pyrosequencing reads were passed to the MIRA assembler, generating 81,284 contigs that were analysed for SSRs. 4,059 SSRs were found in 3,694 (4.54%) contigs, giving an SSR frequency lower than that in seven other plant species with gene indices (5.4–21.9%). The average GC content of the SSR-containing contigs was 41.55%, compared to 40.23% for all contigs. Tri-SSRs were the most common SSRs; the most common motif was AT, which was found in 655 (46.3%) di-SSRs, followed by the AAG motif, found in 342 (25.9%) tri-SSRs. Most (72.8%) tri-SSRs were in coding regions, but 55.6% of the di-SSRs were in non-coding regions; the AT motif was most abundant in 3′ untranslated regions. Gene ontology (GO) annotations showed that six GO terms were significantly overrepresented within SSR-containing contigs. Forty-four EST-SSR markers were developed from 192 primer pairs using two pipelines: read2Marker and the newly-developed CMiB, which combines several open tools. Markers resulting from both pipelines showed no differences in PCR success rate and polymorphisms, but PCR success and polymorphism were significantly affected by the expected PCR product size and number of SSR repeats, respectively. EST-SSR markers exhibited less polymorphism than genomic SSRs. CONCLUSIONS: We have created a new open pipeline for developing EST-SSR markers and applied it in a comprehensive analysis of EST-SSRs and EST-SSR markers in C. japonica. The results will be useful in genomic analyses of conifers and other non-model species. |
format | Online Article Text |
id | pubmed-3424129 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3424129 2012-08-22 A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica Ueno, Saneyoshi Moriguchi, Yoshinari Uchiyama, Kentaro Ujino-Ihara, Tokuko Futamura, Norihiro Sakurai, Tetsuya Shinohara, Kenji Tsumura, Yoshihiko BMC Genomics Research Article BACKGROUND: Microsatellites or simple sequence repeats (SSRs) in expressed sequence tags (ESTs) are useful resources for genome analysis because of their abundance, functionality and polymorphism. The advent of commercial second generation sequencing machines has led to new strategies for developing EST-SSR markers, necessitating the development of a bioinformatic framework that can keep pace with the increasing quality and quantity of sequence data produced. We describe an open scheme for analyzing ESTs and developing EST-SSR markers from reads collected by Sanger sequencing and pyrosequencing of sugi (Cryptomeria japonica). RESULTS: We collected 141,097 sequence reads by Sanger sequencing and 1,333,444 by pyrosequencing. After trimming contaminant and low-quality sequences, 118,319 Sanger and 1,201,150 pyrosequencing reads were passed to the MIRA assembler, generating 81,284 contigs that were analysed for SSRs. 4,059 SSRs were found in 3,694 (4.54%) contigs, giving an SSR frequency lower than that in seven other plant species with gene indices (5.4–21.9%). The average GC content of the SSR-containing contigs was 41.55%, compared to 40.23% for all contigs. Tri-SSRs were the most common SSRs; the most common motif was AT, which was found in 655 (46.3%) di-SSRs, followed by the AAG motif, found in 342 (25.9%) tri-SSRs. Most (72.8%) tri-SSRs were in coding regions, but 55.6% of the di-SSRs were in non-coding regions; the AT motif was most abundant in 3′ untranslated regions. Gene ontology (GO) annotations showed that six GO terms were significantly overrepresented within SSR-containing contigs. Forty-four EST-SSR markers were developed from 192 primer pairs using two pipelines: read2Marker and the newly-developed CMiB, which combines several open tools. Markers resulting from both pipelines showed no differences in PCR success rate and polymorphisms, but PCR success and polymorphism were significantly affected by the expected PCR product size and number of SSR repeats, respectively. EST-SSR markers exhibited less polymorphism than genomic SSRs. CONCLUSIONS: We have created a new open pipeline for developing EST-SSR markers and applied it in a comprehensive analysis of EST-SSRs and EST-SSR markers in C. japonica. The results will be useful in genomic analyses of conifers and other non-model species. BioMed Central 2012-04-16 /pmc/articles/PMC3424129/ /pubmed/22507374 http://dx.doi.org/10.1186/1471-2164-13-136 Text en Copyright ©2012 Ueno et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Ueno, Saneyoshi Moriguchi, Yoshinari Uchiyama, Kentaro Ujino-Ihara, Tokuko Futamura, Norihiro Sakurai, Tetsuya Shinohara, Kenji Tsumura, Yoshihiko A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica |
title | A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica |
title_full | A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica |
title_fullStr | A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica |
title_full_unstemmed | A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica |
title_short | A second generation framework for the analysis of microsatellites in expressed sequence tags and the development of EST-SSR markers for a conifer, Cryptomeria japonica |
title_sort | second generation framework for the analysis of microsatellites in expressed sequence tags and the development of est-ssr markers for a conifer, cryptomeria japonica |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424129/ https://www.ncbi.nlm.nih.gov/pubmed/22507374 http://dx.doi.org/10.1186/1471-2164-13-136 |