SPLASH2 provides ultra-efficient, scalable, and unsupervised discovery on raw sequencing reads
SPLASH is an unsupervised, reference-free, and unifying algorithm that discovers regulated sequence variation through statistical analysis of k-mer composition, subsuming many application-specific algorithms. Here, we introduce SPLASH2, a fast, scalable implementation of SPLASH based on an efficient...
Main authors: | Kokot, Marek; Dehghannasiri, Roozbeh; Baharav, Tavor; Salzman, Julia; Deorowicz, Sebastian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory, 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055302/ https://www.ncbi.nlm.nih.gov/pubmed/36993432 http://dx.doi.org/10.1101/2023.03.17.533189 |
_version_ | 1785015852698959872 |
---|---|
author | Kokot, Marek Dehghannasiri, Roozbeh Baharav, Tavor Salzman, Julia Deorowicz, Sebastian |
author_sort | Kokot, Marek |
collection | PubMed |
description | SPLASH is an unsupervised, reference-free, and unifying algorithm that discovers regulated sequence variation through statistical analysis of k-mer composition, subsuming many application-specific algorithms. Here, we introduce SPLASH2, a fast, scalable implementation of SPLASH based on an efficient k-mer counting approach. The pipeline has minimal installation requirements, and can be executed with a single command. SPLASH2 enables efficient analysis of massive datasets from a wide range of sequencing technologies and biological contexts at unmatched scale and speed, showcased by revealing new biology in rapid analysis of single-cell RNA-sequencing data from human muscle cells, and bulk RNA-seq from the entire Cancer Cell Line Encyclopedia (CCLE) and a study of Amyotrophic Lateral Sclerosis. |
format | Online Article Text |
id | pubmed-10055302 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-10055302 2023-03-30 SPLASH2 provides ultra-efficient, scalable, and unsupervised discovery on raw sequencing reads Kokot, Marek Dehghannasiri, Roozbeh Baharav, Tavor Salzman, Julia Deorowicz, Sebastian bioRxiv Article SPLASH is an unsupervised, reference-free, and unifying algorithm that discovers regulated sequence variation through statistical analysis of k-mer composition, subsuming many application-specific algorithms. Here, we introduce SPLASH2, a fast, scalable implementation of SPLASH based on an efficient k-mer counting approach. The pipeline has minimal installation requirements, and can be executed with a single command. SPLASH2 enables efficient analysis of massive datasets from a wide range of sequencing technologies and biological contexts at unmatched scale and speed, showcased by revealing new biology in rapid analysis of single-cell RNA-sequencing data from human muscle cells, and bulk RNA-seq from the entire Cancer Cell Line Encyclopedia (CCLE) and a study of Amyotrophic Lateral Sclerosis. Cold Spring Harbor Laboratory 2023-07-17 /pmc/articles/PMC10055302/ /pubmed/36993432 http://dx.doi.org/10.1101/2023.03.17.533189 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
spellingShingle | Article Kokot, Marek Dehghannasiri, Roozbeh Baharav, Tavor Salzman, Julia Deorowicz, Sebastian SPLASH2 provides ultra-efficient, scalable, and unsupervised discovery on raw sequencing reads |
title | SPLASH2 provides ultra-efficient, scalable, and unsupervised discovery on raw sequencing reads |
title_sort | splash2 provides ultra-efficient, scalable, and unsupervised discovery on raw sequencing reads |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055302/ https://www.ncbi.nlm.nih.gov/pubmed/36993432 http://dx.doi.org/10.1101/2023.03.17.533189 |