FRAGS: estimation of coding sequence substitution rates from fragmentary data
BACKGROUND: Rates of substitution in protein-coding sequences can provide important insights into evolutionary processes that are of biomedical and theoretical interest. Increased availability of coding sequence data has enabled researchers to estimate more accurately the coding sequence divergence...
Main authors: | Swart, Estienne C; Hide, Winston A; Seoighe, Cathal |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2004 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC344743/ https://www.ncbi.nlm.nih.gov/pubmed/15005802 http://dx.doi.org/10.1186/1471-2105-5-8 |
_version_ | 1782121238546612224 |
---|---|
author | Swart, Estienne C; Hide, Winston A; Seoighe, Cathal |
author_facet | Swart, Estienne C; Hide, Winston A; Seoighe, Cathal |
author_sort | Swart, Estienne C |
collection | PubMed |
description | BACKGROUND: Rates of substitution in protein-coding sequences can provide important insights into evolutionary processes that are of biomedical and theoretical interest. Increased availability of coding sequence data has enabled researchers to estimate more accurately the coding sequence divergence of pairs of organisms. However, the use of different data sources, alignment protocols and methods to estimate substitution rates leads to widely varying estimates of key parameters that define the coding sequence divergence of orthologous genes. Although complete genome sequence data are not available for all organisms, fragmentary sequence data can provide accurate estimates of substitution rates, provided that an appropriate and consistent methodology is used and that differences in the estimates obtainable from different data sources are taken into account. RESULTS: We have developed FRAGS, an application framework that uses existing, freely available software components to construct in-frame alignments and estimate coding substitution rates from fragmentary sequence data. Coding sequence substitution estimates for human and chimpanzee sequences, generated by FRAGS, reveal that methodological differences can give rise to significantly different estimates of important substitution parameters. The estimated substitution rates were also used to infer upper bounds on the amount of sequencing error in the datasets that we have analysed. CONCLUSION: We have developed a system that performs robust estimation of substitution rates for orthologous sequences from a pair of organisms. Our system can be used when fragmentary genomic or transcript data are available from one of the organisms and the other is a completely sequenced genome within the Ensembl database. As well as estimating substitution statistics, our system enables the user to manage and query alignment and substitution data. |
format | Text |
id | pubmed-344743 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2004 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-344743 2004-02-25 FRAGS: estimation of coding sequence substitution rates from fragmentary data Swart, Estienne C; Hide, Winston A; Seoighe, Cathal BMC Bioinformatics Software BioMed Central 2004-01-29 /pmc/articles/PMC344743/ /pubmed/15005802 http://dx.doi.org/10.1186/1471-2105-5-8 Text en Copyright © 2004 Swart et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL. |
spellingShingle | Software Swart, Estienne C Hide, Winston A Seoighe, Cathal FRAGS: estimation of coding sequence substitution rates from fragmentary data |
title | FRAGS: estimation of coding sequence substitution rates from fragmentary data |
title_full | FRAGS: estimation of coding sequence substitution rates from fragmentary data |
title_fullStr | FRAGS: estimation of coding sequence substitution rates from fragmentary data |
title_full_unstemmed | FRAGS: estimation of coding sequence substitution rates from fragmentary data |
title_short | FRAGS: estimation of coding sequence substitution rates from fragmentary data |
title_sort | frags: estimation of coding sequence substitution rates from fragmentary data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC344743/ https://www.ncbi.nlm.nih.gov/pubmed/15005802 http://dx.doi.org/10.1186/1471-2105-5-8 |
work_keys_str_mv | AT swartestiennec fragsestimationofcodingsequencesubstitutionratesfromfragmentarydata AT hidewinstona fragsestimationofcodingsequencesubstitutionratesfromfragmentarydata AT seoighecathal fragsestimationofcodingsequencesubstitutionratesfromfragmentarydata |