MixviR: an R Package for Exploring Variation Associated with Genomic Sequence Data from Environmental SARS-CoV-2 and Other Mixed Microbial Samples
Main authors: | , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Society for Microbiology, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9680627/ https://www.ncbi.nlm.nih.gov/pubmed/36286480 http://dx.doi.org/10.1128/aem.00874-22 |
Summary: | The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)/coronavirus disease 2019 (COVID-19) pandemic has highlighted an important role for efficient surveillance of microbial pathogens. High-throughput sequencing technologies provide valuable surveillance tools, offering opportunities to conduct high-resolution monitoring from diverse sample types, including environmental sources. However, given their large size and potential to contain mixtures of lineages within samples, such genomic data sets can present challenges for analyzing the data and communicating results to diverse stakeholders. Here, we report MixviR, an R package for exploring, analyzing, and visualizing genomic data from potentially mixed samples of a target microbial group. MixviR characterizes variation at both the nucleotide and amino acid levels and offers an interactive RShiny dashboard for exploring data. We demonstrate MixviR’s utility with validation studies using mixtures of known lineages from both SARS-CoV-2 and Mycobacterium tuberculosis and with a case study analyzing lineages of SARS-CoV-2 in wastewater samples over time at a sampling location in Ohio, USA. IMPORTANCE: High-throughput sequencing technologies hold great potential for contributing to genomic-based surveillance of microbial diversity from environmental samples. However, the size of the data sets, along with the potential for environmental samples to contain multiple evolutionary lineages of interest, presents challenges for analyzing these data sets and effectively communicating inferences from them. The software described here provides a novel and valuable tool for exploring such data. Though originally designed and used for monitoring SARS-CoV-2 lineages in wastewater, it can also be applied to analyses of genomic diversity in other microbial groups. |