
Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery

BACKGROUND: Massively parallel sequencing of cDNA is now an efficient route for generating enormous sequence collections that represent expressed genes. This approach provides a valuable starting point for characterizing functional genetic variation in non-model organisms, especially where whole gen...


Bibliographic Details
Main authors: Parchman, Thomas L; Geist, Katherine S; Grahnen, Johan A; Benkman, Craig W; Buerkle, C Alex
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2851599/
https://www.ncbi.nlm.nih.gov/pubmed/20233449
http://dx.doi.org/10.1186/1471-2164-11-180
author Parchman, Thomas L
Geist, Katherine S
Grahnen, Johan A
Benkman, Craig W
Buerkle, C Alex
collection PubMed
description BACKGROUND: Massively parallel sequencing of cDNA is now an efficient route for generating enormous sequence collections that represent expressed genes. This approach provides a valuable starting point for characterizing functional genetic variation in non-model organisms, especially where whole genome sequencing efforts are currently cost and time prohibitive. The large and complex genomes of pines (Pinus spp.) have hindered the development of genomic resources, despite the ecological and economical importance of the group. While most genomic studies have focused on a single species (P. taeda), genomic level resources for other pines are insufficiently developed to facilitate ecological genomic research. Lodgepole pine (P. contorta) is an ecologically important foundation species of montane forest ecosystems and exhibits substantial adaptive variation across its range in western North America. Here we describe a sequencing study of expressed genes from P. contorta, including their assembly and annotation, and their potential for molecular marker development to support population and association genetic studies. RESULTS: We obtained 586,732 sequencing reads from a 454 GS XLR70 Titanium pyrosequencer (mean length: 306 base pairs). A combination of reference-based and de novo assemblies yielded 63,657 contigs, with 239,793 reads remaining as singletons. Based on sequence similarity with known proteins, these sequences represent approximately 17,000 unique genes, many of which are well covered by contig sequences. This sequence collection also included a surprisingly large number of retrotransposon sequences, suggesting that they are highly transcriptionally active in the tissues we sampled. We located and characterized thousands of simple sequence repeats and single nucleotide polymorphisms as potential molecular markers in our assembled and annotated sequences. 
High quality PCR primers were designed for a substantial number of the SSR loci, and a large number of these were amplified successfully in initial screening. CONCLUSIONS: This sequence collection represents a major genomic resource for P. contorta, and the large number of genetic markers characterized should contribute to future research in this and other pines. Our results illustrate the utility of next generation sequencing as a basis for marker development and population genomics in non-model species.
format Text
id pubmed-2851599
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2851599 2010-04-09 Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery. Parchman, Thomas L; Geist, Katherine S; Grahnen, Johan A; Benkman, Craig W; Buerkle, C Alex. BMC Genomics. Research Article. BioMed Central 2010-03-16 /pmc/articles/PMC2851599/ /pubmed/20233449 http://dx.doi.org/10.1186/1471-2164-11-180 Text en Copyright ©2010 Parchman et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2851599/
https://www.ncbi.nlm.nih.gov/pubmed/20233449
http://dx.doi.org/10.1186/1471-2164-11-180