Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads

BACKGROUND: With next-generation sequencing (NGS) technologies, the life sciences face a deluge of raw data. Classical analysis processes for such data often begin with an assembly step, needing large amounts of computing resources, and potentially removing or modifying parts of the biological infor...

Bibliographic Details
Main authors: Le Bras, Yvan, Collin, Olivier, Monjeaud, Cyril, Lacroix, Vincent, Rivals, Éric, Lemaitre, Claire, Miele, Vincent, Sacomoto, Gustavo, Marchet, Camille, Cazaux, Bastien, Zine El Aabidine, Amal, Salmela, Leena, Alves-Carvalho, Susete, Andrieux, Alexan, Uricaru, Raluca, Peterlongo, Pierre
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4750246/
https://www.ncbi.nlm.nih.gov/pubmed/26870323
http://dx.doi.org/10.1186/s13742-015-0105-2
_version_ 1782415403572527104
author Le Bras, Yvan
Collin, Olivier
Monjeaud, Cyril
Lacroix, Vincent
Rivals, Éric
Lemaitre, Claire
Miele, Vincent
Sacomoto, Gustavo
Marchet, Camille
Cazaux, Bastien
Zine El Aabidine, Amal
Salmela, Leena
Alves-Carvalho, Susete
Andrieux, Alexan
Uricaru, Raluca
Peterlongo, Pierre
author_facet Le Bras, Yvan
Collin, Olivier
Monjeaud, Cyril
Lacroix, Vincent
Rivals, Éric
Lemaitre, Claire
Miele, Vincent
Sacomoto, Gustavo
Marchet, Camille
Cazaux, Bastien
Zine El Aabidine, Amal
Salmela, Leena
Alves-Carvalho, Susete
Andrieux, Alexan
Uricaru, Raluca
Peterlongo, Pierre
author_sort Le Bras, Yvan
collection PubMed
description BACKGROUND: With next-generation sequencing (NGS) technologies, the life sciences face a deluge of raw data. Classical analysis processes for such data often begin with an assembly step, needing large amounts of computing resources, and potentially removing or modifying parts of the biological information contained in the data. Our approach proposes to focus directly on biological questions, by considering raw unassembled NGS data, through a suite of six command-line tools. FINDINGS: Dedicated to ‘whole-genome assembly-free’ treatments, the Colib’read tools suite uses optimized algorithms for various analyses of NGS datasets, such as variant calling or read set comparisons. Based on the use of a de Bruijn graph and bloom filter, such analyses can be performed in a few hours, using small amounts of memory. Applications using real data demonstrate the good accuracy of these tools compared to classical approaches. To facilitate data analysis and tools dissemination, we developed Galaxy tools and tool shed repositories. CONCLUSIONS: With the Colib’read Galaxy tools suite, we enable a broad range of life scientists to analyze raw NGS data. More importantly, our approach allows the maximum biological information to be retained in the data, and uses a very low memory footprint. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13742-015-0105-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4750246
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-47502462016-02-12 Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads Le Bras, Yvan Collin, Olivier Monjeaud, Cyril Lacroix, Vincent Rivals, Éric Lemaitre, Claire Miele, Vincent Sacomoto, Gustavo Marchet, Camille Cazaux, Bastien Zine El Aabidine, Amal Salmela, Leena Alves-Carvalho, Susete Andrieux, Alexan Uricaru, Raluca Peterlongo, Pierre Gigascience Technical Note BACKGROUND: With next-generation sequencing (NGS) technologies, the life sciences face a deluge of raw data. Classical analysis processes for such data often begin with an assembly step, needing large amounts of computing resources, and potentially removing or modifying parts of the biological information contained in the data. Our approach proposes to focus directly on biological questions, by considering raw unassembled NGS data, through a suite of six command-line tools. FINDINGS: Dedicated to ‘whole-genome assembly-free’ treatments, the Colib’read tools suite uses optimized algorithms for various analyses of NGS datasets, such as variant calling or read set comparisons. Based on the use of a de Bruijn graph and bloom filter, such analyses can be performed in a few hours, using small amounts of memory. Applications using real data demonstrate the good accuracy of these tools compared to classical approaches. To facilitate data analysis and tools dissemination, we developed Galaxy tools and tool shed repositories. CONCLUSIONS: With the Colib’read Galaxy tools suite, we enable a broad range of life scientists to analyze raw NGS data. More importantly, our approach allows the maximum biological information to be retained in the data, and uses a very low memory footprint. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13742-015-0105-2) contains supplementary material, which is available to authorized users. 
BioMed Central 2016-02-11 /pmc/articles/PMC4750246/ /pubmed/26870323 http://dx.doi.org/10.1186/s13742-015-0105-2 Text en © Le Bras et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Technical Note
Le Bras, Yvan
Collin, Olivier
Monjeaud, Cyril
Lacroix, Vincent
Rivals, Éric
Lemaitre, Claire
Miele, Vincent
Sacomoto, Gustavo
Marchet, Camille
Cazaux, Bastien
Zine El Aabidine, Amal
Salmela, Leena
Alves-Carvalho, Susete
Andrieux, Alexan
Uricaru, Raluca
Peterlongo, Pierre
Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads
title Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads
title_full Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads
title_fullStr Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads
title_full_unstemmed Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads
title_short Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads
title_sort colib’read on galaxy: a tools suite dedicated to biological information extraction from raw ngs reads
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4750246/
https://www.ncbi.nlm.nih.gov/pubmed/26870323
http://dx.doi.org/10.1186/s13742-015-0105-2