MACSIMS : multiple alignment of complete sequences information management system

BACKGROUND: In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validatio...

Bibliographic Details
Main Authors: Thompson, Julie D; Muller, Arnaud; Waterhouse, Andrew; Procter, Jim; Barton, Geoffrey J; Plewniak, Frédéric; Poch, Olivier
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1539025/
https://www.ncbi.nlm.nih.gov/pubmed/16792820
http://dx.doi.org/10.1186/1471-2105-7-318
_version_ 1782129161480962048
author Thompson, Julie D
Muller, Arnaud
Waterhouse, Andrew
Procter, Jim
Barton, Geoffrey J
Plewniak, Frédéric
Poch, Olivier
author_facet Thompson, Julie D
Muller, Arnaud
Waterhouse, Andrew
Procter, Jim
Barton, Geoffrey J
Plewniak, Frédéric
Poch, Olivier
author_sort Thompson, Julie D
collection PubMed
description BACKGROUND: In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. RESULTS: MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data is evaluated and propagated from known to unknown sequences with these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIMS results, while a graphical display using the JalView program allows manual analysis. CONCLUSION: MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at .
format Text
id pubmed-1539025
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1539025 2006-08-11 MACSIMS : multiple alignment of complete sequences information management system Thompson, Julie D Muller, Arnaud Waterhouse, Andrew Procter, Jim Barton, Geoffrey J Plewniak, Frédéric Poch, Olivier BMC Bioinformatics Software BioMed Central 2006-06-23 /pmc/articles/PMC1539025/ /pubmed/16792820 http://dx.doi.org/10.1186/1471-2105-7-318 Text en Copyright © 2006 Thompson et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software
Thompson, Julie D
Muller, Arnaud
Waterhouse, Andrew
Procter, Jim
Barton, Geoffrey J
Plewniak, Frédéric
Poch, Olivier
MACSIMS : multiple alignment of complete sequences information management system
title MACSIMS : multiple alignment of complete sequences information management system
title_sort macsims : multiple alignment of complete sequences information management system
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1539025/
https://www.ncbi.nlm.nih.gov/pubmed/16792820
http://dx.doi.org/10.1186/1471-2105-7-318