HaTSPiL: A modular pipeline for high-throughput sequencing data analysis

BACKGROUND: Next-generation sequencing methods are widely adopted for a broad range of scientific purposes, from pure research to health-related studies. Decreasing costs per analysis have led to large volumes of generated data and, in turn, to improvements in the software used to analyse them....

Bibliographic Details
Main authors: Morandi, Edoardo, Cereda, Matteo, Incarnato, Danny, Parlato, Caterina, Basile, Giulia, Anselmi, Francesca, Lauria, Andrea, Simon, Lisa Marie, Laurence Polignano, Isabelle, Arruga, Francesca, Deaglio, Silvia, Tirtei, Elisa, Fagioli, Franca, Oliviero, Salvatore
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6793853/
https://www.ncbi.nlm.nih.gov/pubmed/31613890
http://dx.doi.org/10.1371/journal.pone.0222512
author Morandi, Edoardo
Cereda, Matteo
Incarnato, Danny
Parlato, Caterina
Basile, Giulia
Anselmi, Francesca
Lauria, Andrea
Simon, Lisa Marie
Laurence Polignano, Isabelle
Arruga, Francesca
Deaglio, Silvia
Tirtei, Elisa
Fagioli, Franca
Oliviero, Salvatore
author_sort Morandi, Edoardo
collection PubMed
description BACKGROUND: Next-generation sequencing (NGS) methods are widely adopted for a broad range of scientific purposes, from pure research to health-related studies. Decreasing costs per analysis have led to large volumes of generated data and, in turn, to improvements in the software used to analyse them. As a consequence, many approaches have been developed to chain different software tools into reliable and reproducible workflows. However, the wide range of NGS applications makes it challenging to manage many different workflows without losing reliability. METHODS: We present HaTSPiL, a high-throughput sequencing pipeline: a Python-powered CLI tool designed to handle different data analysis approaches with a high level of reliability. The software relies on barcoding filenames with a human-readable naming convention that encodes all the sample information the software needs to automatically choose among workflows and parameters. HaTSPiL is highly modular and customisable, allowing users to extend its features for any specific need. CONCLUSIONS: HaTSPiL is licensed as Free Software under the MIT license and is available at https://github.com/dodomorandi/hatspil.
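The filename-barcoding idea in the METHODS paragraph can be illustrated with a small sketch. This is not HaTSPiL's actual naming convention or API: the layout PROJECT-SAMPLE-MOLECULE-ANALYTE, the names `Barcode`, `parse_barcode`, `dispatch` and `WORKFLOWS`, and the two toy workflows are all assumptions made for illustration only.

```python
import re
from typing import Callable, Dict, NamedTuple, Tuple


class Barcode(NamedTuple):
    """Sample information encoded in a filename (hypothetical scheme)."""
    project: str
    sample: str
    molecule: str  # "DNA" or "RNA" in this toy scheme
    analyte: str   # e.g. "WES" or "RNASEQ"


# Hypothetical layout: PROJECT-SAMPLE-MOLECULE-ANALYTE.fastq[.gz]
BARCODE_RE = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)-(?P<sample>[A-Za-z0-9]+)-"
    r"(?P<molecule>DNA|RNA)-(?P<analyte>[A-Z]+)\.fastq(?:\.gz)?$"
)


def parse_barcode(filename: str) -> Barcode:
    """Decode a barcoded filename, rejecting names that break the convention."""
    match = BARCODE_RE.match(filename)
    if match is None:
        raise ValueError(f"filename does not follow the convention: {filename}")
    return Barcode(**match.groupdict())


def run_exome(barcode: Barcode) -> str:
    # Placeholder for a real workflow; returns a label instead of running tools.
    return f"exome workflow for {barcode.sample}"


def run_rnaseq(barcode: Barcode) -> str:
    # Placeholder for a real workflow; returns a label instead of running tools.
    return f"rnaseq workflow for {barcode.sample}"


# The (molecule, analyte) pair read from the filename selects the workflow.
WORKFLOWS: Dict[Tuple[str, str], Callable[[Barcode], str]] = {
    ("DNA", "WES"): run_exome,
    ("RNA", "RNASEQ"): run_rnaseq,
}


def dispatch(filename: str) -> str:
    """Pick and run the workflow implied by the filename barcode."""
    barcode = parse_barcode(filename)
    key = (barcode.molecule, barcode.analyte)
    if key not in WORKFLOWS:
        raise ValueError(f"no workflow registered for {key}")
    return WORKFLOWS[key](barcode)
```

In this sketch, adding support for a new kind of analysis means registering one more entry in `WORKFLOWS`, which loosely mirrors the modularity the abstract describes; the real tool may organise this quite differently.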
format Online
Article
Text
id pubmed-6793853
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6793853 2019-10-25 PLoS One Research Article Public Library of Science 2019-10-15 /pmc/articles/PMC6793853/ /pubmed/31613890 http://dx.doi.org/10.1371/journal.pone.0222512 Text en © 2019 Morandi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title HaTSPiL: A modular pipeline for high-throughput sequencing data analysis
topic Research Article