
long-read-tools.org: an interactive catalogue of analysis methods for long-read sequencing data

BACKGROUND: The data produced by long-read third-generation sequencers have unique characteristics compared to short-read sequencing data, often requiring tailored analysis tools for tasks ranging from quality control to downstream processing. The rapid growth in software that addresses these challe...


Bibliographic Details
Main authors: Amarasinghe, Shanika L, Ritchie, Matthew E, Gouil, Quentin
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931822/
https://www.ncbi.nlm.nih.gov/pubmed/33590862
http://dx.doi.org/10.1093/gigascience/giab003
author Amarasinghe, Shanika L
Ritchie, Matthew E
Gouil, Quentin
collection PubMed
description BACKGROUND: The data produced by long-read third-generation sequencers have unique characteristics compared to short-read sequencing data, often requiring tailored analysis tools for tasks ranging from quality control to downstream processing. The rapid growth in software that addresses these challenges for different genomics applications is difficult to keep track of, which makes it hard for users to choose the most appropriate tool for their analysis goal and for developers to identify areas of need and existing solutions to benchmark against. FINDINGS: We describe the implementation of long-read-tools.org, an open-source database that organizes the rapidly expanding collection of long-read data analysis tools and allows its exploration through interactive browsing and filtering. The current database release contains 478 tools across 32 categories. Most tools are developed in Python, and the most frequent analysis tasks include base calling, de novo assembly, error correction, quality checking/filtering, and isoform detection, while long-read single-cell data analysis and transcriptomics are areas with the fewest tools available. CONCLUSION: Continued growth in the application of long-read sequencing in genomics research positions the long-read-tools.org database as an essential resource that allows researchers to keep abreast of both established and emerging software to help guide the selection of the most relevant tool for their analysis needs.
format Online
Article
Text
id pubmed-7931822
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7931822 2021-03-09 long-read-tools.org: an interactive catalogue of analysis methods for long-read sequencing data Amarasinghe, Shanika L Ritchie, Matthew E Gouil, Quentin Gigascience Technical Note Oxford University Press 2021-02-16 /pmc/articles/PMC7931822/ /pubmed/33590862 http://dx.doi.org/10.1093/gigascience/giab003 Text en © The Author(s) 2021. Published by Oxford University Press GigaScience.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title long-read-tools.org: an interactive catalogue of analysis methods for long-read sequencing data
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931822/
https://www.ncbi.nlm.nih.gov/pubmed/33590862
http://dx.doi.org/10.1093/gigascience/giab003