Viral coinfection analysis using a MinHash toolkit

BACKGROUND: Human papillomavirus (HPV) is a common sexually transmitted infection associated with cervical cancer that frequently occurs as a coinfection of types and subtypes. Highly similar sublineages that show over 100-fold differences in cancer risk are not distinguishable in coinfections with...

Bibliographic Details
Main authors: Dawson, Eric T., Wagner, Sarah, Roberson, David, Yeager, Meredith, Boland, Joseph, Garrison, Erik, Chanock, Stephen, Schiffman, Mark, Raine-Bennett, Tina, Lorey, Thomas, Castle, Phillip E., Mirabello, Lisa, Durbin, Richard
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6626348/
https://www.ncbi.nlm.nih.gov/pubmed/31299914
http://dx.doi.org/10.1186/s12859-019-2918-y
author Dawson, Eric T.
Wagner, Sarah
Roberson, David
Yeager, Meredith
Boland, Joseph
Garrison, Erik
Chanock, Stephen
Schiffman, Mark
Raine-Bennett, Tina
Lorey, Thomas
Castle, Phillip E.
Mirabello, Lisa
Durbin, Richard
collection PubMed
description BACKGROUND: Human papillomavirus (HPV) is a common sexually transmitted infection associated with cervical cancer that frequently occurs as a coinfection of types and subtypes. Highly similar sublineages that show over 100-fold differences in cancer risk are not distinguishable in coinfections with current typing methods. RESULTS: We describe an efficient set of computational tools, rkmh, for analyzing complex mixed infections of related viruses based on sequence data. rkmh makes extensive use of MinHash similarity measures, and includes utilities for removing host DNA and classifying reads by type, lineage, and sublineage. We show that rkmh is capable of assigning reads to their HPV type as well as HPV16 lineage and sublineages. CONCLUSIONS: Accurate read classification enables estimates of percent composition when there are multiple infecting lineages or sublineages. While we demonstrate rkmh for HPV with multiple sequencing technologies, it is also applicable to other mixtures of related sequences. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2918-y) contains supplementary material, which is available to authorized users.
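The abstract describes classifying reads with MinHash similarity and then estimating percent composition from the classified reads. The following is a minimal Python sketch of that general idea, not rkmh's implementation or interface: every function name (`sketch`, `jaccard_estimate`, `classify`, `composition`), the choice of hash, the parameter defaults, and the toy sequences are assumptions made for illustration. It uses a standard bottom-s MinHash sketch and ignores reverse complements and host-read filtering.

```python
import hashlib
from collections import Counter


def kmer_hashes(seq, k):
    """Hash every k-mer of seq to a 64-bit integer (hash choice is an assumption)."""
    seq = seq.upper()
    return {
        int.from_bytes(
            hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big"
        )
        for i in range(len(seq) - k + 1)
    }


def sketch(seq, k=5, size=1000):
    """Bottom-s MinHash sketch: keep only the `size` smallest k-mer hashes."""
    return set(sorted(kmer_hashes(seq, k))[:size])


def jaccard_estimate(a, b, size=1000):
    """Estimate Jaccard similarity from two bottom-s sketches."""
    union_bottom = sorted(a | b)[:size]
    if not union_bottom:
        return 0.0
    shared = sum(1 for h in union_bottom if h in a and h in b)
    return shared / len(union_bottom)


def classify(read, ref_sketches, k=5, size=1000):
    """Return the name of the most similar reference, or None if nothing matches."""
    read_sketch = sketch(read, k, size)
    scores = {
        name: jaccard_estimate(read_sketch, ref, size)
        for name, ref in ref_sketches.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def composition(reads, ref_sketches, k=5, size=1000):
    """Fraction of classified reads assigned to each reference."""
    counts = Counter(classify(r, ref_sketches, k, size) for r in reads)
    counts.pop(None, None)  # drop reads that matched no reference
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


# Toy data: invented sequences, not HPV genomes.
refs = {
    "A": sketch("ACGTACGTTAGCCGATAGCTAGGCTA"),
    "B": sketch("TTTTGGGGCCCCAAAATTTGGGCCCAAA"),
}
reads = ["ACGTTAGCCGATAG", "GCCGATAGCTAGGC", "TTGGGGCCCCAAAA"]
result = composition(reads, refs)
```

In this toy setup two reads are drawn from reference "A" and one from "B", so `result` reports the share of reads assigned to each. With real data, the k-mer length and sketch size would need tuning to separate highly similar sublineages.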
format Online
Article
Text
id pubmed-6626348
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6626348 2019-07-23 Viral coinfection analysis using a MinHash toolkit BMC Bioinformatics Software
BioMed Central 2019-07-12 /pmc/articles/PMC6626348/ /pubmed/31299914 http://dx.doi.org/10.1186/s12859-019-2918-y Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Viral coinfection analysis using a MinHash toolkit
topic Software