LoReTTA, a user-friendly tool for assembling viral genomes from PacBio sequence data
Long-read, single-molecule DNA sequencing technologies have triggered a revolution in genomics by enabling the determination of large, reference-quality genomes in ways that overcome some of the limitations of short-read sequencing. However, the greater length and higher error rate of the reads gene...
Main authors: | Al Qaffas, Ahmed; Nichols, Jenna; Davison, Andrew J; Ourahmane, Amine; Hertel, Laura; McVoy, Michael A; Camiolo, Salvatore |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Resources |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8111061/ https://www.ncbi.nlm.nih.gov/pubmed/33996146 http://dx.doi.org/10.1093/ve/veab042 |
_version_ | 1783690423483498496 |
---|---|
author | Al Qaffas, Ahmed Nichols, Jenna Davison, Andrew J Ourahmane, Amine Hertel, Laura McVoy, Michael A Camiolo, Salvatore |
author_sort | Al Qaffas, Ahmed |
collection | PubMed |
description | Long-read, single-molecule DNA sequencing technologies have triggered a revolution in genomics by enabling the determination of large, reference-quality genomes in ways that overcome some of the limitations of short-read sequencing. However, the greater length and higher error rate of the reads generated on long-read platforms make the tools used for assembling short reads unsuitable for use in data assembly and motivate the development of new approaches. We present LoReTTA (Long Read Template-Targeted Assembler), a tool designed for performing de novo assembly of long reads generated from viral genomes on the PacBio platform. LoReTTA exploits a reference genome to guide the assembly process, an approach that has been successful with short reads. The tool was designed to deal with reads originating from viral genomes, which feature high genetic variability, possible multiple isoforms, and the dominant presence of additional organisms in clinical or environmental samples. LoReTTA was tested on a range of simulated and experimental datasets and outperformed established long-read assemblers in terms of assembly contiguity and accuracy. The software runs under the Linux operating system, is designed for easy adaptation to alternative systems, and features an automatic installation pipeline that takes care of the required dependencies. A command-line version and a user-friendly graphical interface version are available under a GPLv3 license at https://bioinformatics.cvr.ac.uk/software/ with the manual and a test dataset. |
format | Online Article Text |
id | pubmed-8111061 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-81110612021-05-13 LoReTTA, a user-friendly tool for assembling viral genomes from PacBio sequence data Al Qaffas, Ahmed Nichols, Jenna Davison, Andrew J Ourahmane, Amine Hertel, Laura McVoy, Michael A Camiolo, Salvatore Virus Evol Resources Long-read, single-molecule DNA sequencing technologies have triggered a revolution in genomics by enabling the determination of large, reference-quality genomes in ways that overcome some of the limitations of short-read sequencing. However, the greater length and higher error rate of the reads generated on long-read platforms make the tools used for assembling short reads unsuitable for use in data assembly and motivate the development of new approaches. We present LoReTTA (Long Read Template-Targeted Assembler), a tool designed for performing de novo assembly of long reads generated from viral genomes on the PacBio platform. LoReTTA exploits a reference genome to guide the assembly process, an approach that has been successful with short reads. The tool was designed to deal with reads originating from viral genomes, which feature high genetic variability, possible multiple isoforms, and the dominant presence of additional organisms in clinical or environmental samples. LoReTTA was tested on a range of simulated and experimental datasets and outperformed established long-read assemblers in terms of assembly contiguity and accuracy. The software runs under the Linux operating system, is designed for easy adaptation to alternative systems, and features an automatic installation pipeline that takes care of the required dependencies. A command-line version and a user-friendly graphical interface version are available under a GPLv3 license at https://bioinformatics.cvr.ac.uk/software/ with the manual and a test dataset. Oxford University Press 2021-04-23 /pmc/articles/PMC8111061/ /pubmed/33996146 http://dx.doi.org/10.1093/ve/veab042 Text en © The Author(s) 2021. Published by Oxford University Press. 
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | LoReTTA, a user-friendly tool for assembling viral genomes from PacBio sequence data |
title_sort | loretta, a user-friendly tool for assembling viral genomes from pacbio sequence data |
topic | Resources |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8111061/ https://www.ncbi.nlm.nih.gov/pubmed/33996146 http://dx.doi.org/10.1093/ve/veab042 |