
1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data

BACKGROUND: Cell-free DNA (cfDNA) has emerged as an important clinical specimen to probe for pathogenic microbes, especially in organ transplant patients where the same data can be used to predict allograft rejection. Recent reports described viral, bacterial or the complete microbial diversity in p...


Bibliographic Details
Main Authors: Sinha, Rohita; Kleiboeker, Steve; Altrich, Michelle; Bixler, Ellis
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7776747/
http://dx.doi.org/10.1093/ofid/ofaa439.1395
author Sinha, Rohita
Kleiboeker, Steve
Altrich, Michelle
Bixler, Ellis
collection PubMed
description BACKGROUND: Cell-free DNA (cfDNA) has emerged as an important clinical specimen for detecting pathogenic microbes, especially in organ transplant patients, where the same data can be used to predict allograft rejection. Recent reports have described the viral, bacterial, or complete microbial diversity of plasma following cfDNA sequencing. The prevalence of certain viral families (Anelloviridae) is associated with immunosuppressant dosage and the risk of antibody-mediated rejection. Although informative, cfDNA reads are inherently short (~160 bp, or 2x75 bp) and dominated by host DNA (~97-99%), which complicates their taxonomic annotation and lowers specificity. Here we present a computational protocol that minimizes these challenges by merging the concept of “Reference-assisted Assembly” with K-mer profiles of NGS data, for highly sensitive and specific microbial detection.
METHODS: We developed a pipeline in which non-host NGS data (reads not mapped to the human genome) undergo a reference-assisted assembly step followed by taxonomic annotation with KrakenUniq (a K-mer based classifier). We trained KrakenUniq on a curated in-house database of ~12,000 viral genomes. We used three different K-mer values (16, 21, 31) to train KrakenUniq, and final predictions are made by applying a majority-wins rule. The default KrakenUniq database is currently used for bacterial and fungal metagenome analysis. We tested our method on 30 simulated samples and 124 clinical samples obtained from a biorepository.
RESULTS: Our protocol currently screens for a targeted list of pathogens (15 viral species, 16 bacterial genera, and 10 fungal genera). On a simulated set of viral sample mixes, our protocol had 100% accuracy. For the 124 clinical samples, predictions were evaluated for specificity and sensitivity using qPCR assays for the following viral species: EBV, BKV, JCV, HSV1/2, HHV7, and CMV. In total, 33 of 38 computational predictions (87%) were confirmed by qPCR. The prediction sensitivity ranged from 6 to 10^6 copies/mL.
CONCLUSION: Our efforts to perform ‘Reference-assisted assembly’ followed by K-mer based taxonomic annotation of cfDNA data led to the development of a novel and accurate pathogen detection protocol.
DISCLOSURES: Rohita Sinha, PhD, Viracor-Eurofins (Employee); Steve Kleiboeker, DVM, PhD, Viracor-Eurofins (Employee); Michelle Altrich, PhD, Viracor-Eurofins (Employee); Ellis Bixler, MS, Viracor-Eurofins (Employee)
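The majority-wins rule mentioned in the methods can be illustrated with a minimal Python sketch. The abstract does not say at which level the rule is applied (read, contig, or sample) or how ties are handled, so the following are assumptions made only for illustration: one taxon call per assembled contig for each of the three K-mer sizes, a strict majority (at least two of three) required, and no call when the runs disagree. The function name `majority_call` and the example contigs and labels are hypothetical and do not come from the authors' pipeline.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_call(calls: Sequence[Optional[str]]) -> Optional[str]:
    """Return the taxon reported by a strict majority of classifier runs.

    `calls` holds one taxon label per K-mer size (None = unclassified).
    Returns None when no label is supported by more than half of the runs.
    """
    if not calls:
        return None
    label, votes = Counter(calls).most_common(1)[0]
    if label is not None and votes * 2 > len(calls):
        return label
    return None


# Hypothetical per-contig calls from databases built with k = 16, 21, 31.
calls_by_k = {
    "contig_1": ["Human betaherpesvirus 5", "Human betaherpesvirus 5", None],
    "contig_2": ["BK polyomavirus", "JC polyomavirus", None],
}

# Keep a call only when at least two of the three runs agree.
consensus = {contig: majority_call(calls) for contig, calls in calls_by_k.items()}
```

Under these assumptions, `contig_1` keeps its label because two of the three runs agree, while `contig_2` is left uncalled because no label reaches a majority. A real pipeline might instead vote per read or per sample, or weight the runs by K-mer size.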
format Online
Article
Text
id pubmed-7776747
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7776747 2021-01-07 Open Forum Infect Dis Poster Abstracts Oxford University Press 2020-12-31 /pmc/articles/PMC7776747/ http://dx.doi.org/10.1093/ofid/ofaa439.1395 Text en
© The Author 2020. Published by Oxford University Press on behalf of Infectious Diseases Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title 1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data
topic Poster Abstracts
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7776747/
http://dx.doi.org/10.1093/ofid/ofaa439.1395