Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication
BACKGROUND: The common ancestor of salmonid fishes, including rainbow trout (Oncorhynchus mykiss), experienced a whole genome duplication between 20 and 100 million years ago, and many of the duplicated genes have been retained in the trout genome. This retention complicates efforts to detect alleli...
Main Authors: | Christensen, Kris A; Brunelli, Joseph P; Lambert, Matthew J; DeKoning, Jenefer; Phillips, Ruth B; Thorgaard, Gary H |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Methodology Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3840595/ https://www.ncbi.nlm.nih.gov/pubmed/24237905 http://dx.doi.org/10.1186/1471-2105-14-325 |
_version_ | 1782478531012329472 |
---|---|
author | Christensen, Kris A Brunelli, Joseph P Lambert, Matthew J DeKoning, Jenefer Phillips, Ruth B Thorgaard, Gary H |
author_facet | Christensen, Kris A Brunelli, Joseph P Lambert, Matthew J DeKoning, Jenefer Phillips, Ruth B Thorgaard, Gary H |
author_sort | Christensen, Kris A |
collection | PubMed |
description | BACKGROUND: The common ancestor of salmonid fishes, including rainbow trout (Oncorhynchus mykiss), experienced a whole genome duplication between 20 and 100 million years ago, and many of the duplicated genes have been retained in the trout genome. This retention complicates efforts to detect allelic variation in salmonid fishes. Specifically, single nucleotide polymorphism (SNP) detection is problematic because nucleotide variation can be found between the duplicate copies (paralogs) of a gene as well as between alleles. RESULTS: We present a method of differentiating between allelic and paralogous (gene copy) sequence variants, allowing identification of SNPs in organisms with multiple copies of a gene or set of genes. The basic strategy is to: 1) identify windows of unique cDNA sequences with homology to each other, 2) compare these unique cDNAs if they are not shared between individuals (i.e. the cDNA is homozygous in one individual and homozygous for another cDNA in the other individual), and 3) give a “SNP score” value between zero and one to each candidate sequence variant based on six criteria. Using this strategy we were able to detect about seven thousand potential SNPs from the transcriptomes of several clonal lines of rainbow trout. When directly compared to a pre-validated set of SNPs in polyploid wheat, we were also able to estimate the false-positive rate of this strategy as 0 to 28% depending on parameters used. CONCLUSIONS: This strategy has an advantage over traditional techniques of SNP identification because another dimension of sequencing information is utilized. This method is especially well suited for identifying SNPs in polyploids, both outbred and inbred, but would tend to be conservative for diploid organisms. |
format | Online Article Text |
id | pubmed-3840595 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-38405952013-11-27 Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication Christensen, Kris A Brunelli, Joseph P Lambert, Matthew J DeKoning, Jenefer Phillips, Ruth B Thorgaard, Gary H BMC Bioinformatics Methodology Article BACKGROUND: The common ancestor of salmonid fishes, including rainbow trout (Oncorhynchus mykiss), experienced a whole genome duplication between 20 and 100 million years ago, and many of the duplicated genes have been retained in the trout genome. This retention complicates efforts to detect allelic variation in salmonid fishes. Specifically, single nucleotide polymorphism (SNP) detection is problematic because nucleotide variation can be found between the duplicate copies (paralogs) of a gene as well as between alleles. RESULTS: We present a method of differentiating between allelic and paralogous (gene copy) sequence variants, allowing identification of SNPs in organisms with multiple copies of a gene or set of genes. The basic strategy is to: 1) identify windows of unique cDNA sequences with homology to each other, 2) compare these unique cDNAs if they are not shared between individuals (i.e. the cDNA is homozygous in one individual and homozygous for another cDNA in the other individual), and 3) give a “SNP score” value between zero and one to each candidate sequence variant based on six criteria. Using this strategy we were able to detect about seven thousand potential SNPs from the transcriptomes of several clonal lines of rainbow trout. When directly compared to a pre-validated set of SNPs in polyploid wheat, we were also able to estimate the false-positive rate of this strategy as 0 to 28% depending on parameters used. CONCLUSIONS: This strategy has an advantage over traditional techniques of SNP identification because another dimension of sequencing information is utilized. This method is especially well suited for identifying SNPs in polyploids, both outbred and inbred, but would tend to be conservative for diploid organisms. BioMed Central 2013-11-16 /pmc/articles/PMC3840595/ /pubmed/24237905 http://dx.doi.org/10.1186/1471-2105-14-325 Text en Copyright © 2013 Christensen et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Christensen, Kris A Brunelli, Joseph P Lambert, Matthew J DeKoning, Jenefer Phillips, Ruth B Thorgaard, Gary H Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication |
title | Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication |
title_full | Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication |
title_fullStr | Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication |
title_full_unstemmed | Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication |
title_short | Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication |
title_sort | identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3840595/ https://www.ncbi.nlm.nih.gov/pubmed/24237905 http://dx.doi.org/10.1186/1471-2105-14-325 |
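
Illustrative sketch (not part of the catalog record): the second step of the strategy in the description field, comparing unique cDNAs that are not shared between individuals, could look roughly like the Python below. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes each individual is a homozygous clonal line, so that two different sequences seen inside one individual point to paralogs rather than alleles. The function names, the input layout, and the category labels are all hypothetical. The six criteria behind the "SNP score" are not listed in the record, so no scoring is attempted.

```python
from typing import Dict, List, Set


def mismatch_positions(seq_a: str, seq_b: str) -> List[int]:
    """Return 0-based positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def classify_window(observed: Dict[str, Set[str]]) -> str:
    """Classify one window of homologous cDNA sequences.

    `observed` maps an individual (assumed here to be a homozygous clonal
    line) to the set of distinct sequences seen for that window.
    The labels returned are hypothetical names chosen for this sketch.
    """
    all_seqs: Set[str] = set().union(*observed.values())
    if len(all_seqs) < 2:
        return "no_variation"
    # Two or more sequences inside a single homozygous individual cannot be
    # alleles of one locus, so treat the variation as paralogous.
    if any(len(seqs) > 1 for seqs in observed.values()):
        return "paralog_candidate"
    # Each individual carries exactly one sequence and the individuals
    # differ from each other: consistent with an allelic variant.
    if all(len(seqs) == 1 for seqs in observed.values()):
        return "snp_candidate"
    # At least one individual has no sequence for this window.
    return "undetermined"


if __name__ == "__main__":
    window = {"lineA": {"ACGT"}, "lineB": {"ACTT"}}
    print(classify_window(window))
    print(mismatch_positions("ACGT", "ACTT"))
```

In this sketch a window only counts as a SNP candidate when every individual carries exactly one sequence. That is stricter than the record requires, since the description states that the method also suits outbred polyploids.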