MacSyFinder: A Program to Mine Genomes for Molecular Systems with an Application to CRISPR-Cas Systems

Bibliographic Details
Main authors: Abby, Sophie S.; Néron, Bertrand; Ménager, Hervé; Touchon, Marie; Rocha, Eduardo P. C.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4201578/
https://www.ncbi.nlm.nih.gov/pubmed/25330359
http://dx.doi.org/10.1371/journal.pone.0110726
author Abby, Sophie S.
Néron, Bertrand
Ménager, Hervé
Touchon, Marie
Rocha, Eduardo P. C.
collection PubMed
description MOTIVATION: Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. RESULTS: Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathways), including their components, evolutionary associations with other systems, and genetic architecture. Modelled features also include functional analogs and the use of the same component by several systems. Models are used to search for molecular systems in complete genomes or in unstructured data such as metagenomes. The components of the systems are searched for by sequence similarity using hidden Markov model (HMM) protein profiles. Hits are assigned to a given system according to their compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder, we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate “Cas-finder” using publicly available protein profiles. AVAILABILITY AND IMPLEMENTATION: MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The “Cas-finder” (models and HMM profiles) is distributed as a compressed tarball archive in the Supporting Information.
format Online
Article
Text
id pubmed-4201578
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4201578 2014-10-21 PLoS One Research Article Public Library of Science 2014-10-17 /pmc/articles/PMC4201578/ /pubmed/25330359 http://dx.doi.org/10.1371/journal.pone.0110726 Text en © 2014 Abby et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title MacSyFinder: A Program to Mine Genomes for Molecular Systems with an Application to CRISPR-Cas Systems
topic Research Article
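
The abstract in the description field states that hits are assigned to a system according to their compliance with the content and organization of the system model. The following is a minimal Python sketch of that general idea, not MacSyFinder's actual code or API. It assumes the profile hits have already been computed (the real program obtains them with Hmmer), and every name in it (Hit, SystemModel, min_mandatory, max_gap, geneA, geneB, geneC) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gene: str      # hypothetical component name matched by a protein profile
    position: int  # rank of the protein along the replicon


@dataclass(frozen=True)
class SystemModel:
    mandatory: frozenset  # components counted towards the quorum
    accessory: frozenset  # components allowed in a system but not required
    min_mandatory: int    # assumed quorum: distinct mandatory components needed
    max_gap: int          # assumed limit: genes allowed between consecutive hits


def cluster_hits(hits, max_gap):
    """Group hits that lie close together (the 'organization' criterion)."""
    clusters = []
    for hit in sorted(hits, key=lambda h: h.position):
        # A hit joins the current cluster if at most max_gap genes separate it
        # from the previous hit; otherwise it starts a new cluster.
        if clusters and hit.position - clusters[-1][-1].position <= max_gap + 1:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


def find_systems(hits, model):
    """Keep clusters that contain enough mandatory components (the 'content' criterion)."""
    allowed = model.mandatory | model.accessory
    relevant = [h for h in hits if h.gene in allowed]
    systems = []
    for cluster in cluster_hits(relevant, model.max_gap):
        found = {h.gene for h in cluster} & model.mandatory
        if len(found) >= model.min_mandatory:
            systems.append(cluster)
    return systems


# Toy data: three co-localized hits, plus one isolated hit far away.
model = SystemModel(
    mandatory=frozenset({"geneA", "geneB"}),
    accessory=frozenset({"geneC"}),
    min_mandatory=2,
    max_gap=2,
)
hits = [Hit("geneA", 10), Hit("geneB", 11), Hit("geneC", 13), Hit("geneA", 50)]
systems = find_systems(hits, model)
print([[h.gene for h in s] for s in systems])
```

With the toy data, the three neighbouring hits form one cluster that meets the quorum, while the isolated geneA hit at position 50 is discarded because a single mandatory component is not enough. The features the abstract also mentions, such as functional analogs and components shared between systems, are left out of this sketch.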