Integrating heterogeneous sequence information for transcriptome-wide microarray design; a Zebrafish example

Bibliographic Details
Main authors:	Rauwerda, Han; de Jong, Mark; de Leeuw, Wim C; Spaink, Herman P; Breit, Timo M
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Technical Note
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2913925/ https://www.ncbi.nlm.nih.gov/pubmed/20626891 http://dx.doi.org/10.1186/1756-0500-3-192

collection	PubMed
description	BACKGROUND: A complete gene-expression microarray should preferably detect all genomic sequences that can be expressed as RNA in an organism, i.e. the transcriptome. However, our knowledge of the transcriptome of any organism is still incomplete, and transcriptome information is continuously being updated. Here, we present a strategy to integrate heterogeneous sequence information that can be used as input for an up-to-date microarray design. FINDINGS: Our algorithm consists of four steps. In the first step, transcripts from different resources are grouped into Transcription Clusters (TCs) by looking at the similarity of all transcripts. TCs are groups of transcripts with a similar length. If a transcript is much shorter than a TC to which it is highly similar, it is annotated as a subsequence of that TC and is used for probe design only if the probe designed for the TC does not query the subsequence. Secondly, all TCs are mapped to a genome assembly and gene information is added to the design. Thirdly, TC members are ranked according to their trustworthiness and the most reliable sequence is used for the probe design. The last step is the actual array design. We have used this strategy to build an up-to-date zebrafish microarray. CONCLUSIONS: With our strategy and the software developed, it is possible to use a set of heterogeneous transcript resources for microarray design, reduce the number of candidate target sequences on which the design is based, and reduce redundancy. By changing the parameters in the procedure, it is possible to control the similarity within the TCs and thus the number of candidate sequences for the design. The annotation of the microarray is carried out simultaneously with the design.
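The grouping and ranking steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software: the similarity function, the thresholds `min_sim` and `min_len_ratio`, the greedy longest-first clustering, and the source ranking in `SOURCE_RANK` are all assumptions made for illustration, since the abstract does not specify them. Genome mapping (step two) and the actual probe design (step four) are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Transcript:
    name: str
    source: str   # hypothetical resource label, e.g. "curated" or "est"
    length: int   # sequence length in nucleotides


@dataclass
class Cluster:
    members: list = field(default_factory=list)        # transcripts of similar length
    subsequences: list = field(default_factory=list)   # much shorter, highly similar

    @property
    def length(self):
        return max(t.length for t in self.members)


# Assumed trust ranking (lower is more trusted); not taken from the paper.
SOURCE_RANK = {"curated": 0, "predicted": 1, "est": 2}


def build_clusters(transcripts, similarity, min_sim=0.9, min_len_ratio=0.8):
    """Greedy grouping: longest transcripts first, compared to each cluster's first member."""
    clusters = []
    for t in sorted(transcripts, key=lambda t: -t.length):
        for c in clusters:
            if similarity(t, c.members[0]) < min_sim:
                continue
            if t.length / c.length >= min_len_ratio:
                c.members.append(t)        # similar length: regular TC member
            else:
                c.subsequences.append(t)   # much shorter: annotate as subsequence
            break
        else:
            clusters.append(Cluster(members=[t]))  # no similar cluster found
    return clusters


def pick_design_sequence(cluster):
    """Choose the most trusted member; ties are broken in favour of the longer one."""
    return min(cluster.members, key=lambda t: (SOURCE_RANK[t.source], -t.length))


# Toy data: similarity is looked up in a hand-written table instead of an alignment.
transcripts = [
    Transcript("refA", "curated", 2000),
    Transcript("ensA", "predicted", 1900),
    Transcript("estA", "est", 400),
    Transcript("refB", "curated", 1500),
]
similar_pairs = {
    frozenset(p) for p in [("refA", "ensA"), ("refA", "estA"), ("ensA", "estA")]
}


def toy_similarity(a, b):
    return 1.0 if frozenset((a.name, b.name)) in similar_pairs else 0.0


clusters = build_clusters(transcripts, toy_similarity)
for c in clusters:
    print(pick_design_sequence(c).name,
          [t.name for t in c.members],
          [t.name for t in c.subsequences])
```

In this toy run, `estA` ends up as a subsequence of the cluster containing `refA` and `ensA`, while `refB` forms its own cluster. Raising or lowering `min_sim` changes how many clusters, and therefore how many candidate sequences, result, which mirrors the parameter control mentioned in the conclusions.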
id	pubmed-2913925
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-2913925 2010-08-03 BMC Res Notes Technical Note
BioMed Central 2010-07-13 /pmc/articles/PMC2913925/ /pubmed/20626891 http://dx.doi.org/10.1186/1756-0500-3-192 Text en Copyright ©2010 Breit et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
