
Masking repeats while clustering ESTs


Bibliographic Details
Main authors: Schneeberger, Korbinian; Malde, Ketil; Coward, Eivind; Jonassen, Inge
Format: Text
Language: English
Published: Oxford University Press 2005
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1079970/
https://www.ncbi.nlm.nih.gov/pubmed/15831790
http://dx.doi.org/10.1093/nar/gki511
Description
Summary: A problem in EST clustering is the presence of repeat sequences. To avoid false matches, repeats have to be masked. This can be a time-consuming process, and it depends on available repeat libraries. We present a fast and effective method that aims to eliminate the problems repeats cause in the process of clustering. Unlike traditional methods, ours infers repeats directly from the EST data and does not rely on any external library of known repeats. This makes the method especially suitable for analysing ESTs from organisms without good repeat libraries. We demonstrate that the result is very similar to performing standard repeat masking before clustering.
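
The summary does not say how repeats are inferred from the EST data, so the following is only an illustrative sketch of the general idea and not the authors' algorithm. It assumes that a word (k-mer) found in unusually many ESTs is likely to be part of a repeat, and masks such words with `N` before any clustering step. The function names `kmer_counts` and `mask_frequent`, the word length `k`, and the threshold `max_seqs` are all hypothetical choices made for this example.

```python
from collections import Counter


def kmer_counts(seqs, k):
    """Count, for every k-mer, the number of sequences that contain it."""
    counts = Counter()
    for s in seqs:
        # A set ensures each k-mer is counted at most once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def mask_frequent(seqs, k, max_seqs):
    """Replace every k-mer that occurs in more than max_seqs sequences with N.

    Assumption for this sketch: over-represented words indicate repeats.
    No external repeat library is consulted.
    """
    counts = kmer_counts(seqs, k)
    masked = []
    for s in seqs:
        chars = list(s)
        for i in range(len(s) - k + 1):
            # Look up the k-mer in the original sequence, not the masked copy.
            if counts[s[i:i + k]] > max_seqs:
                chars[i:i + k] = "N" * k
        masked.append("".join(chars))
    return masked


if __name__ == "__main__":
    ests = ["AAAACGT", "TTAAAAG", "GAAAACC", "CCGGTTA"]
    for original, result in zip(ests, mask_frequent(ests, k=4, max_seqs=2)):
        print(original, result)
```

A fixed threshold is the simplest possible choice here; in practice it would need to be tuned to the size and redundancy of the EST set, since highly expressed genes also produce words that appear in many ESTs.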