
Databases of homologous gene families for comparative genomics

Bibliographic Details
Main Authors: Penel, Simon; Arigon, Anne-Muriel; Dufayard, Jean-François; Sertier, Anne-Sophie; Daubin, Vincent; Duret, Laurent; Gouy, Manolo; Perrière, Guy
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2697650/
https://www.ncbi.nlm.nih.gov/pubmed/19534752
http://dx.doi.org/10.1186/1471-2105-10-S6-S3
_version_ 1782168348510912512
author Penel, Simon
Arigon, Anne-Muriel
Dufayard, Jean-François
Sertier, Anne-Sophie
Daubin, Vincent
Duret, Laurent
Gouy, Manolo
Perrière, Guy
author_facet Penel, Simon
Arigon, Anne-Muriel
Dufayard, Jean-François
Sertier, Anne-Sophie
Daubin, Vincent
Duret, Laurent
Gouy, Manolo
Perrière, Guy
author_sort Penel, Simon
collection PubMed
description BACKGROUND: Comparative genomics is a central step in many sequence analysis studies, from gene annotation and the identification of new functional regions in genomes, to the study of evolutionary processes at the molecular level (speciation, single gene or whole genome duplications, etc.) and phylogenetics. In that context, databases providing users with high-quality homologous families and sequence alignments, as well as phylogenetic trees based on state-of-the-art algorithms, are becoming indispensable. METHODS: We developed an automated procedure that performs massive all-against-all similarity searches, gene clustering, multiple alignment computation, and phylogenetic tree construction and reconciliation. Applying this procedure to a very large set of sequences is possible through parallel computing on a large computer cluster. RESULTS: Three databases were developed using this procedure: HOVERGEN, HOGENOM and HOMOLENS. These databases share the same architecture but differ in their content. HOVERGEN contains sequences from vertebrates, HOGENOM is mainly devoted to completely sequenced microbial organisms, and HOMOLENS is devoted to metazoan genomes from Ensembl. Access to the databases is provided through Web query forms, a general retrieval system and a client-server graphical interface. The latter can be used to perform tree-pattern based searches, which make it possible, among other uses, to retrieve sets of orthologous genes. The three databases, as well as the software required to build and query them, can be used or downloaded from the PBIL (Pôle Bioinformatique Lyonnais) site at .
format Text
id pubmed-2697650
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2697650 2009-06-16 Databases of homologous gene families for comparative genomics Penel, Simon; Arigon, Anne-Muriel; Dufayard, Jean-François; Sertier, Anne-Sophie; Daubin, Vincent; Duret, Laurent; Gouy, Manolo; Perrière, Guy BMC Bioinformatics Proceedings
BioMed Central 2009-06-16 /pmc/articles/PMC2697650/ /pubmed/19534752 http://dx.doi.org/10.1186/1471-2105-10-S6-S3 Text en Copyright © 2009 Penel et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Penel, Simon
Arigon, Anne-Muriel
Dufayard, Jean-François
Sertier, Anne-Sophie
Daubin, Vincent
Duret, Laurent
Gouy, Manolo
Perrière, Guy
Databases of homologous gene families for comparative genomics
title Databases of homologous gene families for comparative genomics
title_full Databases of homologous gene families for comparative genomics
title_fullStr Databases of homologous gene families for comparative genomics
title_full_unstemmed Databases of homologous gene families for comparative genomics
title_short Databases of homologous gene families for comparative genomics
title_sort databases of homologous gene families for comparative genomics
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2697650/
https://www.ncbi.nlm.nih.gov/pubmed/19534752
http://dx.doi.org/10.1186/1471-2105-10-S6-S3
work_keys_str_mv AT penelsimon databasesofhomologousgenefamiliesforcomparativegenomics
AT arigonannemuriel databasesofhomologousgenefamiliesforcomparativegenomics
AT dufayardjeanfrancois databasesofhomologousgenefamiliesforcomparativegenomics
AT sertierannesophie databasesofhomologousgenefamiliesforcomparativegenomics
AT daubinvincent databasesofhomologousgenefamiliesforcomparativegenomics
AT duretlaurent databasesofhomologousgenefamiliesforcomparativegenomics
AT gouymanolo databasesofhomologousgenefamiliesforcomparativegenomics
AT perriereguy databasesofhomologousgenefamiliesforcomparativegenomics