
Reference-independent comparative metagenomics using cross-assembly: crAss


Bibliographic Details
Main Authors: Dutilh, Bas E., Schmieder, Robert, Nulton, Jim, Felts, Ben, Salamon, Peter, Edwards, Robert A., Mokili, John L.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3519457/
https://www.ncbi.nlm.nih.gov/pubmed/23074261
http://dx.doi.org/10.1093/bioinformatics/bts613
Description
Summary:
Motivation: Metagenomes are often characterized by high levels of unknown sequences. Reads derived from known microorganisms can easily be identified and analyzed using fast homology search algorithms and a suitable reference database, but the unknown sequences are often ignored in further analyses, biasing conclusions. Nevertheless, it is possible to use more data in a comparative metagenomic analysis by creating a cross-assembly of all reads, i.e. a single assembly of reads from different samples. Comparative metagenomics studies the interrelationships between metagenomes from different samples. Using an assembly algorithm is a fast and intuitive way to link (partially) homologous reads without requiring a database of reference sequences.
Results: Here, we introduce crAss, a novel bioinformatic tool that enables fast, simple analysis of cross-assembly files, yielding distances between all metagenomic sample pairs and an insightful image displaying the similarities.
Availability and implementation: crAss is available as a web server at http://edwards.sdsu.edu/crass/, and the Perl source code can be downloaded to run as a stand-alone command line tool.
Contact: dutilh@cmbi.ru.nl
Supplementary information: Supplementary data are available at Bioinformatics online.