QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing
Life science has entered the so-called 'big data era' where biologists, clinicians and bioinformaticians are overwhelmed with high-throughput sequencing data. While they offer new insights to decipher the genome structure they also raise major challenges to use them for daily clinical prac...
Main Authors: | Jarlier, Frédéric; Joly, Nicolas; Fedy, Nicolas; Magalhaes, Thomas; Sirotti, Leonor; Paganiban, Paul; Martin, Firmin; McManus, Michael; Hupé, Philippe |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | F1000 Research Limited 2020 |
Subjects: | Software Tool Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7429925/ https://www.ncbi.nlm.nih.gov/pubmed/32913637 http://dx.doi.org/10.12688/f1000research.22954.3 |
_version_ | 1783571345256218624 |
---|---|
author | Jarlier, Frédéric Joly, Nicolas Fedy, Nicolas Magalhaes, Thomas Sirotti, Leonor Paganiban, Paul Martin, Firmin McManus, Michael Hupé, Philippe |
author_facet | Jarlier, Frédéric Joly, Nicolas Fedy, Nicolas Magalhaes, Thomas Sirotti, Leonor Paganiban, Paul Martin, Firmin McManus, Michael Hupé, Philippe |
author_sort | Jarlier, Frédéric |
collection | PubMed |
description | Life science has entered the so-called 'big data era' where biologists, clinicians and bioinformaticians are overwhelmed with high-throughput sequencing data. While they offer new insights to decipher the genome structure they also raise major challenges to use them for daily clinical practice care and diagnosis purposes as they are bigger and bigger. Therefore, we implemented a software to reduce the time to delivery for the alignment and the sorting of high-throughput sequencing data. Our solution is implemented using Message Passing Interface and is intended for high-performance computing architecture. The software scales linearly with respect to the size of the data and ensures a total reproducibility with the traditional tools. For example, a 300X whole genome can be aligned and sorted within less than 9 hours with 128 cores. The software offers significant speed-up using multi-cores and multi-nodes parallelization. |
format | Online Article Text |
id | pubmed-7429925 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | F1000 Research Limited |
record_format | MEDLINE/PubMed |
spelling | pubmed-74299252020-09-09 QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing Jarlier, Frédéric Joly, Nicolas Fedy, Nicolas Magalhaes, Thomas Sirotti, Leonor Paganiban, Paul Martin, Firmin McManus, Michael Hupé, Philippe F1000Res Software Tool Article Life science has entered the so-called 'big data era' where biologists, clinicians and bioinformaticians are overwhelmed with high-throughput sequencing data. While they offer new insights to decipher the genome structure they also raise major challenges to use them for daily clinical practice care and diagnosis purposes as they are bigger and bigger. Therefore, we implemented a software to reduce the time to delivery for the alignment and the sorting of high-throughput sequencing data. Our solution is implemented using Message Passing Interface and is intended for high-performance computing architecture. The software scales linearly with respect to the size of the data and ensures a total reproducibility with the traditional tools. For example, a 300X whole genome can be aligned and sorted within less than 9 hours with 128 cores. The software offers significant speed-up using multi-cores and multi-nodes parallelization. F1000 Research Limited 2020-10-08 /pmc/articles/PMC7429925/ /pubmed/32913637 http://dx.doi.org/10.12688/f1000research.22954.3 Text en Copyright: © 2020 Jarlier F et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software Tool Article Jarlier, Frédéric Joly, Nicolas Fedy, Nicolas Magalhaes, Thomas Sirotti, Leonor Paganiban, Paul Martin, Firmin McManus, Michael Hupé, Philippe QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing |
title | QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing |
title_full | QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing |
title_fullStr | QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing |
title_full_unstemmed | QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing |
title_short | QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing |
title_sort | quartic: quick parallel algorithms for high-throughput sequencing data processing |
topic | Software Tool Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7429925/ https://www.ncbi.nlm.nih.gov/pubmed/32913637 http://dx.doi.org/10.12688/f1000research.22954.3 |