Genome Signatures, Self-Organizing Maps and Higher Order Phylogenies: A Parametric Analysis

Genome signatures are data vectors derived from the compositional statistics of DNA. The self-organizing map (SOM) is a neural network method for the conceptualisation of relationships within complex data, such as genome signatures. The various parameters of the SOM training phase are investigated f...

Bibliographic Details
Main author: Gatherer, Derek
Format: Text
Language: English
Published: Libertas Academica 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2684143/
https://www.ncbi.nlm.nih.gov/pubmed/19468314
_version_ 1782167176005812224
author Gatherer, Derek
author_facet Gatherer, Derek
author_sort Gatherer, Derek
collection PubMed
description Genome signatures are data vectors derived from the compositional statistics of DNA. The self-organizing map (SOM) is a neural network method for the conceptualisation of relationships within complex data, such as genome signatures. The various parameters of the SOM training phase are investigated for their effect on the accuracy of the resulting output map. It is concluded that larger SOMs, as well as taking longer to train, are less sensitive in phylogenetic classification of unknown DNA sequences. However, where a classification can be made, a larger SOM is more accurate. Increasing the number of iterations in the training phase of the SOM only slightly increases accuracy, without improving sensitivity. The optimal length of the DNA sequence k-mer from which the genome signature should be derived is 4 or 5, but shorter values are almost as effective. In general, these results indicate that small, rapidly trained SOMs are generally as good as larger, longer trained ones for the analysis of genome signatures. These results may also be more generally applicable to the use of SOMs for other complex data sets, such as microarray data.
format Text
id pubmed-2684143
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-2684143 2009-05-22 Genome Signatures, Self-Organizing Maps and Higher Order Phylogenies: A Parametric Analysis Gatherer, Derek Evol Bioinform Online Original Research Libertas Academica 2007-09-17 /pmc/articles/PMC2684143/ /pubmed/19468314 Text en Copyright © 2007 The authors. This article is published under the Creative Commons Attribution By licence. For further information go to: http://creativecommons.org/licenses/by/3.0
spellingShingle Original Research
Gatherer, Derek
Genome Signatures, Self-Organizing Maps and Higher Order Phylogenies: A Parametric Analysis
title Genome Signatures, Self-Organizing Maps and Higher Order Phylogenies: A Parametric Analysis
title_sort genome signatures, self-organizing maps and higher order phylogenies: a parametric analysis
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2684143/
https://www.ncbi.nlm.nih.gov/pubmed/19468314
work_keys_str_mv AT gathererderek genomesignaturesselforganizingmapsandhigherorderphylogeniesaparametricanalysis
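
The description above characterises a genome signature as a data vector derived from the compositional statistics of DNA, with a k-mer length of 4 or 5 reported as optimal. As a rough illustration only, the sketch below builds one such vector from normalised k-mer frequencies. This is an assumption about the general form of a signature, not the article's exact definition, and the function name `kmer_signature` is hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_signature(sequence, k=4):
    """Return normalised k-mer frequencies in fixed lexicographic order.

    Assumption: the signature is a plain frequency vector over all
    4**k k-mers of the alphabet ACGT. Windows containing any other
    character (for example N) are ignored.
    """
    seq = sequence.upper()
    # Fixed ordering so that vectors from different sequences are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalise over valid k-mers only.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]
```

A vector of this kind (256 values for k = 4) is the sort of input a self-organizing map could be trained on; the SOM training itself is not shown here.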