Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome
INTRODUCTION: In the fight to limit the global spread of antibiotic resistance, computational challenges associated with sequencing technology can impact the accuracy of downstream analysis, including drug resistance identification, transmission, and genome resolution. About 10% of Mycobacterium tub...
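Main authors: | Federico Di Marco, Andrea Spitaleri, Simone Battaglia, Virginia Batignani, Andrea Maurizio Cabibbe, Daniela Maria Cirillo |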
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Microbiology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9932330/ https://www.ncbi.nlm.nih.gov/pubmed/36819039 http://dx.doi.org/10.3389/fmicb.2023.1104456 |
_version_ | 1784889432880447488 |
---|---|
author | Di Marco, Federico Spitaleri, Andrea Battaglia, Simone Batignani, Virginia Cabibbe, Andrea Maurizio Cirillo, Daniela Maria |
author_facet | Di Marco, Federico Spitaleri, Andrea Battaglia, Simone Batignani, Virginia Cabibbe, Andrea Maurizio Cirillo, Daniela Maria |
author_sort | Di Marco, Federico |
collection | PubMed |
description | INTRODUCTION: In the fight to limit the global spread of antibiotic resistance, computational challenges associated with sequencing technology can affect the accuracy of downstream analyses, including drug resistance identification, transmission analysis, and genome resolution. About 10% of the Mycobacterium tuberculosis (MTB) genome consists of the PE/PPE family, a GC-rich, repetitive genomic region. Although short-read sequencing is widely used, its limits in the PE/PPE regions are well recognized and stem from the ambiguous mapping of reads onto the reference genome. The aim of this study was to compare the performance of short-read (SRS), long-read (LRS) and hybrid-read (HYBR) based analyses across common investigative tasks: genome coverage estimation, variant calling and cluster analysis, drug resistance detection, and de novo assembly. METHODS: Thirteen model MTB clinical isolates were sequenced with both SRS and LRS. HYBR were produced by correcting the long reads with the short reads. The FASTQ files from the three approaches were then processed with a customized version of MTBseq for genome coverage estimation and variant calling, and with two different assemblers for de novo assembly evaluation. RESULTS: Genome coverage estimation showed lower 8X breadth of coverage for SRS than for LRS and HYBR. Within the PE/PPE genes, SRS performed poorly for the PE_PGRS family but obtained acceptable coverage in PE and PPE genes, whereas LRS and HYBR reached optimal coverage across the PE/PPE genes. For variant calling, HYBR showed the highest resolution, detecting the highest percentage of uniquely identified mutations compared with LRS and SRS. All three approaches agreed on the identification of two major clusters, with HYBR identifying a higher number of SNPs between the two clusters. In the comparison of assembly quality, HYBR and LRS obtained better results than SRS. DISCUSSION: In conclusion, depending on the aim of the investigation, SRS and LRS present complementary advantages and limitations. For a full resolution of MTB genomes, where all the mentioned analyses and both technologies are needed, the HYBR approach represents a valid option and a well-rounded strategy. |
format | Online Article Text |
id | pubmed-9932330 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9932330 2023-02-17 Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome Di Marco, Federico Spitaleri, Andrea Battaglia, Simone Batignani, Virginia Cabibbe, Andrea Maurizio Cirillo, Daniela Maria Front Microbiol Microbiology INTRODUCTION: In the fight to limit the global spread of antibiotic resistance, computational challenges associated with sequencing technology can affect the accuracy of downstream analyses, including drug resistance identification, transmission analysis, and genome resolution. About 10% of the Mycobacterium tuberculosis (MTB) genome consists of the PE/PPE family, a GC-rich, repetitive genomic region. Although short-read sequencing is widely used, its limits in the PE/PPE regions are well recognized and stem from the ambiguous mapping of reads onto the reference genome. The aim of this study was to compare the performance of short-read (SRS), long-read (LRS) and hybrid-read (HYBR) based analyses across common investigative tasks: genome coverage estimation, variant calling and cluster analysis, drug resistance detection, and de novo assembly. METHODS: Thirteen model MTB clinical isolates were sequenced with both SRS and LRS. HYBR were produced by correcting the long reads with the short reads. The FASTQ files from the three approaches were then processed with a customized version of MTBseq for genome coverage estimation and variant calling, and with two different assemblers for de novo assembly evaluation. RESULTS: Genome coverage estimation showed lower 8X breadth of coverage for SRS than for LRS and HYBR. Within the PE/PPE genes, SRS performed poorly for the PE_PGRS family but obtained acceptable coverage in PE and PPE genes, whereas LRS and HYBR reached optimal coverage across the PE/PPE genes. For variant calling, HYBR showed the highest resolution, detecting the highest percentage of uniquely identified mutations compared with LRS and SRS. All three approaches agreed on the identification of two major clusters, with HYBR identifying a higher number of SNPs between the two clusters. In the comparison of assembly quality, HYBR and LRS obtained better results than SRS. DISCUSSION: In conclusion, depending on the aim of the investigation, SRS and LRS present complementary advantages and limitations. For a full resolution of MTB genomes, where all the mentioned analyses and both technologies are needed, the HYBR approach represents a valid option and a well-rounded strategy. Frontiers Media S.A. 2023-02-02 /pmc/articles/PMC9932330/ /pubmed/36819039 http://dx.doi.org/10.3389/fmicb.2023.1104456 Text en Copyright © 2023 Di Marco, Spitaleri, Battaglia, Batignani, Cabibbe and Cirillo. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Microbiology Di Marco, Federico Spitaleri, Andrea Battaglia, Simone Batignani, Virginia Cabibbe, Andrea Maurizio Cirillo, Daniela Maria Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome |
title | Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome |
title_full | Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome |
title_fullStr | Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome |
title_full_unstemmed | Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome |
title_short | Advantages of long- and short-reads sequencing for the hybrid investigation of the Mycobacterium tuberculosis genome |
title_sort | advantages of long- and short-reads sequencing for the hybrid investigation of the mycobacterium tuberculosis genome |
topic | Microbiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9932330/ https://www.ncbi.nlm.nih.gov/pubmed/36819039 http://dx.doi.org/10.3389/fmicb.2023.1104456 |
work_keys_str_mv | AT dimarcofederico advantagesoflongandshortreadssequencingforthehybridinvestigationofthemycobacteriumtuberculosisgenome AT spitaleriandrea advantagesoflongandshortreadssequencingforthehybridinvestigationofthemycobacteriumtuberculosisgenome AT battagliasimone advantagesoflongandshortreadssequencingforthehybridinvestigationofthemycobacteriumtuberculosisgenome AT batignanivirginia advantagesoflongandshortreadssequencingforthehybridinvestigationofthemycobacteriumtuberculosisgenome AT cabibbeandreamaurizio advantagesoflongandshortreadssequencingforthehybridinvestigationofthemycobacteriumtuberculosisgenome AT cirillodanielamaria advantagesoflongandshortreadssequencingforthehybridinvestigationofthemycobacteriumtuberculosisgenome |
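
The description above reports results as "8X breadth" of coverage. As the term is commonly used, this is the fraction of reference positions covered by at least 8 reads. The sketch below illustrates that calculation only; it is not the customized MTBseq procedure used in the study, which the record does not describe. The function name `breadth_of_coverage` and the toy depth values are hypothetical, and the per-position depths are assumed to be already available as a list of integers.

```python
from typing import Sequence


def breadth_of_coverage(depths: Sequence[int], min_depth: int = 8) -> float:
    """Return the fraction of positions whose read depth is >= min_depth.

    `depths` holds one read-depth value per reference position (assumed input).
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


# Toy example: hypothetical depths at ten reference positions.
toy_depths = [0, 3, 8, 12, 30, 7, 9, 8, 0, 15]
breadth = breadth_of_coverage(toy_depths)  # six of ten positions reach 8X
print(f"8X breadth of coverage: {breadth:.2f}")
```

Restricting `depths` to the positions of a gene family (for example, PE_PGRS genes) would give a per-family value of the kind the description compares across SRS, LRS and HYBR.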