CAFTAN: a tool for fast mapping, and quality assessment of cDNAs
BACKGROUND: The German cDNA Consortium has been cloning full length cDNAs and continued with their exploitation in protein localization experiments and cellular assays. However, the efficient use of large cDNA resources requires the development of strategies that are capable of a speedy selection of...
Main authors: | del Val, Coral; Kuryshev, Vladimir Yurjevich; Glatting, Karl-Heinz; Ernst, Peter; Hotz-Wagenblatt, Agnes; Poustka, Annemarie; Suhai, Sandor; Wiemann, Stefan |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1636072/ https://www.ncbi.nlm.nih.gov/pubmed/17064411 http://dx.doi.org/10.1186/1471-2105-7-473 |
_version_ | 1782130730565894144 |
---|---|
author | del Val, Coral Kuryshev, Vladimir Yurjevich Glatting, Karl-Heinz Ernst, Peter Hotz-Wagenblatt, Agnes Poustka, Annemarie Suhai, Sandor Wiemann, Stefan |
author_facet | del Val, Coral Kuryshev, Vladimir Yurjevich Glatting, Karl-Heinz Ernst, Peter Hotz-Wagenblatt, Agnes Poustka, Annemarie Suhai, Sandor Wiemann, Stefan |
author_sort | del Val, Coral |
collection | PubMed |
description | BACKGROUND: The German cDNA Consortium has been cloning full-length cDNAs and exploiting them in protein localization experiments and cellular assays. However, the efficient use of large cDNA resources requires strategies capable of rapidly separating truly useful cDNAs from biological and experimental noise. To this end we have developed a new high-throughput analysis tool, CAFTAN, which simplifies these efforts and thus fills the gap between large-scale cDNA collections and their systematic annotation and application in functional genomics. RESULTS: CAFTAN is built around the mapping of cDNAs to the genome assembly and the subsequent analysis of their genomic context. It uses sequence features such as the presence and type of polyA signals, inner and flanking repeats, the GC content, and splice site types. Each feature is evaluated in an individual test, and the results classify cDNAs according to their sequence quality and their likelihood of having been generated from fully processed mRNAs. Additionally, CAFTAN compares the coordinates of mapped cDNAs with the genomic coordinates of reference sets from publicly available resources (e.g., VEGA, ENSEMBL). This provides detailed information about overlapping exons and the structural classification of cDNAs with respect to the reference set of splice variants. The evaluation of CAFTAN showed that it is able to correctly classify more than 85% of 5950 selected "known protein-coding" VEGA cDNAs as high-quality multi- or single-exon cDNAs. It identified as good 80.6% of the single-exon cDNAs and 85% of the multiple-exon cDNAs. The program is written in Perl in a modular way, which allows the strategy to be adapted to other tasks such as EST annotation, or extended with new classification rules and new organism databases as they become available. We think that it is a very useful program for the annotation and research of unfinished genomes.
CONCLUSION: CAFTAN is a high-throughput sequence analysis tool that performs a fast and reliable quality prediction of cDNAs. Several thousand cDNAs can be analyzed in a short time, giving the curator or scientist a quick first overview of the quality and the existing annotation of a set of cDNAs. It supports the rejection of low-quality cDNAs and helps in the selection of likely novel splice variants and/or completely novel transcripts for new experiments. |
format | Text |
id | pubmed-1636072 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1636072 2006-11-15 CAFTAN: a tool for fast mapping, and quality assessment of cDNAs del Val, Coral Kuryshev, Vladimir Yurjevich Glatting, Karl-Heinz Ernst, Peter Hotz-Wagenblatt, Agnes Poustka, Annemarie Suhai, Sandor Wiemann, Stefan BMC Bioinformatics Software BioMed Central 2006-10-25 /pmc/articles/PMC1636072/ /pubmed/17064411 http://dx.doi.org/10.1186/1471-2105-7-473 Text en Copyright © 2006 del Val et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software del Val, Coral Kuryshev, Vladimir Yurjevich Glatting, Karl-Heinz Ernst, Peter Hotz-Wagenblatt, Agnes Poustka, Annemarie Suhai, Sandor Wiemann, Stefan CAFTAN: a tool for fast mapping, and quality assessment of cDNAs |
title | CAFTAN: a tool for fast mapping, and quality assessment of cDNAs |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1636072/ https://www.ncbi.nlm.nih.gov/pubmed/17064411 http://dx.doi.org/10.1186/1471-2105-7-473 |
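
The abstract in this record describes two kinds of checks: per-sequence features such as GC content, and a comparison of mapped cDNA coordinates against reference exon coordinates. The Python sketch below illustrates both ideas in their simplest form. It is not CAFTAN's code (the record states that CAFTAN is written in Perl); the function names, the 1-based inclusive coordinate convention, and the labels `identical`, `overlapping`, and `novel` are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Assumed convention: an exon is (start, end), 1-based and inclusive.
Exon = Tuple[int, int]


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def overlapping_exons(cdna: List[Exon], reference: List[Exon]) -> List[Tuple[Exon, Exon]]:
    """Return every (cDNA exon, reference exon) pair that shares at least one base."""
    pairs = []
    for c_start, c_end in cdna:
        for r_start, r_end in reference:
            # Two closed intervals overlap when each starts before the other ends.
            if c_start <= r_end and r_start <= c_end:
                pairs.append(((c_start, c_end), (r_start, r_end)))
    return pairs


def classify_structure(cdna: List[Exon], reference: List[Exon]) -> str:
    """Assign a hypothetical structural label relative to one reference transcript."""
    if sorted(cdna) == sorted(reference):
        return "identical"
    if overlapping_exons(cdna, reference):
        return "overlapping"
    return "novel"
```

For instance, `classify_structure([(100, 200)], [(150, 250)])` reports an overlap because the two exons share bases 150 to 200. A real pipeline would compare against every reference splice variant at a locus and would also need strand and chromosome information, which this sketch omits.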