Universal seeds for cDNA-to-genome comparison
BACKGROUND: To meet the needs of gene annotation for newly sequenced organisms, optimized spaced seeds can be implemented into cross-species sequence alignment programs to accurately align gene sequences to the genome of a related species. So far, seed performance has been tested for comparisons bet...
Main authors: | Zhou, Leming; Stanton, Jonathan; Florea, Liliana |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2375135/ https://www.ncbi.nlm.nih.gov/pubmed/18215286 http://dx.doi.org/10.1186/1471-2105-9-36 |
_version_ | 1782154585731760128 |
---|---|
author | Zhou, Leming Stanton, Jonathan Florea, Liliana |
author_facet | Zhou, Leming Stanton, Jonathan Florea, Liliana |
author_sort | Zhou, Leming |
collection | PubMed |
description | BACKGROUND: To meet the needs of gene annotation for newly sequenced organisms, optimized spaced seeds can be implemented into cross-species sequence alignment programs to accurately align gene sequences to the genome of a related species. So far, seed performance has been tested for comparisons between closely related species, such as human and mouse, or on simulated data. As the number and variety of genomes increases, it becomes desirable to identify a small set of universal seeds that perform optimally or near-optimally on a large range of comparisons. RESULTS: Using statistical regression methods, we investigate the sensitivity of seeds, in particular good seeds, between four cDNA-to-genome comparisons at different evolutionary distances (human-dog, human-mouse, human-chicken and human-zebrafish), and identify classes of comparisons that show similar seed behavior and therefore can employ the same seed. In addition, we find that with high confidence good seeds for more distant comparisons perform well on closer comparisons, within 98–99% of the optimal seeds, and thus represent universal good seeds. CONCLUSION: We show for the first time that optimal and near-optimal seeds for distant species-to-species comparisons are more generally applicable to a wide range of comparisons. This finding will be instrumental in developing practical and user-friendly cDNA-to-genome alignment applications, to aid in the annotation of new model organisms. |
format | Text |
id | pubmed-2375135 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-23751352008-05-09 Universal seeds for cDNA-to-genome comparison Zhou, Leming Stanton, Jonathan Florea, Liliana BMC Bioinformatics Research Article BACKGROUND: To meet the needs of gene annotation for newly sequenced organisms, optimized spaced seeds can be implemented into cross-species sequence alignment programs to accurately align gene sequences to the genome of a related species. So far, seed performance has been tested for comparisons between closely related species, such as human and mouse, or on simulated data. As the number and variety of genomes increases, it becomes desirable to identify a small set of universal seeds that perform optimally or near-optimally on a large range of comparisons. RESULTS: Using statistical regression methods, we investigate the sensitivity of seeds, in particular good seeds, between four cDNA-to-genome comparisons at different evolutionary distances (human-dog, human-mouse, human-chicken and human-zebrafish), and identify classes of comparisons that show similar seed behavior and therefore can employ the same seed. In addition, we find that with high confidence good seeds for more distant comparisons perform well on closer comparisons, within 98–99% of the optimal seeds, and thus represent universal good seeds. CONCLUSION: We show for the first time that optimal and near-optimal seeds for distant species-to-species comparisons are more generally applicable to a wide range of comparisons. This finding will be instrumental in developing practical and user-friendly cDNA-to-genome alignment applications, to aid in the annotation of new model organisms. BioMed Central 2008-01-23 /pmc/articles/PMC2375135/ /pubmed/18215286 http://dx.doi.org/10.1186/1471-2105-9-36 Text en Copyright © 2008 Zhou et al; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Zhou, Leming Stanton, Jonathan Florea, Liliana Universal seeds for cDNA-to-genome comparison |
title | Universal seeds for cDNA-to-genome comparison |
title_full | Universal seeds for cDNA-to-genome comparison |
title_fullStr | Universal seeds for cDNA-to-genome comparison |
title_full_unstemmed | Universal seeds for cDNA-to-genome comparison |
title_short | Universal seeds for cDNA-to-genome comparison |
title_sort | universal seeds for cdna-to-genome comparison |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2375135/ https://www.ncbi.nlm.nih.gov/pubmed/18215286 http://dx.doi.org/10.1186/1471-2105-9-36 |
work_keys_str_mv | AT zhouleming universalseedsforcdnatogenomecomparison AT stantonjonathan universalseedsforcdnatogenomecomparison AT florealiliana universalseedsforcdnatogenomecomparison |