Bios2mds: an R package for comparing orthologous protein families by metric multidimensional scaling
BACKGROUND: The distance matrix computed from multiple alignments of homologous sequences is widely used by distance-based phylogenetic methods to provide information on the evolution of protein families. This matrix can also be visualized in a low dimensional space by metric multidimensional scalin...
Main authors: | Pelé, Julien; Bécu, Jean-Michel; Abdi, Hervé; Chabbert, Marie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3403911/ https://www.ncbi.nlm.nih.gov/pubmed/22702410 http://dx.doi.org/10.1186/1471-2105-13-133 |
_version_ | 1782238943598608384 |
---|---|
author | Pelé, Julien Bécu, Jean-Michel Abdi, Hervé Chabbert, Marie |
author_sort | Pelé, Julien |
collection | PubMed |
description | BACKGROUND: The distance matrix computed from multiple alignments of homologous sequences is widely used by distance-based phylogenetic methods to provide information on the evolution of protein families. This matrix can also be visualized in a low dimensional space by metric multidimensional scaling (MDS). Applied to protein families, MDS provides information complementary to the information derived from tree-based methods. Moreover, MDS gives a unique opportunity to compare orthologous sequence sets because it can add supplementary elements to a reference space. RESULTS: The R package bios2mds (from BIOlogical Sequences to MultiDimensional Scaling) has been designed to analyze multiple sequence alignments by MDS. Bios2mds starts with a sequence alignment, builds a matrix of distances between the aligned sequences, and represents this matrix by MDS to visualize a sequence space. This package also offers the possibility of performing K-means clustering in the MDS derived sequence space. Most importantly, bios2mds includes a function that projects supplementary elements (a.k.a. “out of sample” elements) onto the space defined by reference or “active” elements. Orthologous sequence sets can thus be compared in a straightforward way. The data analysis and visualization tools have been specifically designed for an easy monitoring of the evolutionary drift of protein sub-families. CONCLUSIONS: The bios2mds package provides the tools for a complete integrated pipeline aimed at the MDS analysis of multiple sets of orthologous sequences in the R statistical environment. In addition, as the analysis can be carried out from user provided matrices, the projection function can be widely used on any kind of data. |
format | Online Article Text |
id | pubmed-3403911 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3403911 2012-07-25 Bios2mds: an R package for comparing orthologous protein families by metric multidimensional scaling Pelé, Julien Bécu, Jean-Michel Abdi, Hervé Chabbert, Marie BMC Bioinformatics Software BioMed Central 2012-06-15 /pmc/articles/PMC3403911/ /pubmed/22702410 http://dx.doi.org/10.1186/1471-2105-13-133 Text en Copyright ©2012 Pelé et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Bios2mds: an R package for comparing orthologous protein families by metric multidimensional scaling |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3403911/ https://www.ncbi.nlm.nih.gov/pubmed/22702410 http://dx.doi.org/10.1186/1471-2105-13-133 |
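
The workflow summarized in the description (aligned sequences, then a distance matrix, then metric MDS, then projection of supplementary elements onto the active space) can be illustrated with a small sketch. This is not the bios2mds API: it is a NumPy illustration under stated assumptions. The function names, the toy sequences, and the choice of a simple p-distance (fraction of differing positions) are all hypothetical, and the code uses classical (Torgerson) metric MDS with the standard out-of-sample centering formula, which may differ in detail from what the package does.

```python
import numpy as np


def p_distance(a, b):
    """Fraction of differing positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(rows, cols):
    """Distances from every sequence in `rows` to every sequence in `cols`."""
    return np.array([[p_distance(r, c) for c in cols] for r in rows])


def metric_mds(D, k=2):
    """Classical metric MDS of an n x n distance matrix.

    Returns the n x k coordinates plus the eigenvalues and eigenvectors
    needed later to project supplementary elements.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered cross-product matrix
    eigval, eigvec = np.linalg.eigh(B)       # B is symmetric
    order = np.argsort(eigval)[::-1][:k]     # indices of the k largest eigenvalues
    lam, V = eigval[order], eigvec[:, order]
    if np.any(lam <= 1e-12):
        raise ValueError("fewer than k positive eigenvalues")
    return V * np.sqrt(lam), lam, V


def project_supplementary(D_sup, D_act, lam, V):
    """Project supplementary elements onto the space of the active elements.

    D_sup is m x n (supplementary to active); D_act is n x n (active to active).
    """
    D2 = D_act ** 2
    S2 = D_sup ** 2
    # Center each supplementary row with its own mean and the active column means.
    B_sup = -0.5 * (S2 - S2.mean(axis=1, keepdims=True) - D2.mean(axis=0) + D2.mean())
    return B_sup @ V / np.sqrt(lam)


# Hypothetical toy alignment: four active and two supplementary sequences.
active = ["MKLV", "AKLV", "MRLV", "ARLV"]
supplementary = ["ARLV", "ARLI"]

D_act = distance_matrix(active, active)
coords, lam, V = metric_mds(D_act, k=2)

D_sup = distance_matrix(supplementary, active)
sup_coords = project_supplementary(D_sup, D_act, lam, V)

print(coords)
print(sup_coords)
```

A useful sanity check on this kind of projection is that a supplementary element identical to an active one lands on that active element's coordinates, and that projecting the active distance matrix itself reproduces the MDS coordinates.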