1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data
BACKGROUND: Cell-free DNA (cfDNA) has emerged as an important clinical specimen to probe for pathogenic microbes, especially in organ transplant patients where the same data can be used to predict allograft rejection. Recent reports described viral, bacterial or the complete microbial diversity in p...
Main authors: | Sinha, Rohita; Kleiboeker, Steve; Altrich, Michelle; Bixler, Ellis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2020 |
Subjects: | Poster Abstracts |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7776747/ http://dx.doi.org/10.1093/ofid/ofaa439.1395 |
_version_ | 1783630754381561856 |
---|---|
author | Sinha, Rohita; Kleiboeker, Steve; Altrich, Michelle; Bixler, Ellis |
author_facet | Sinha, Rohita; Kleiboeker, Steve; Altrich, Michelle; Bixler, Ellis |
author_sort | Sinha, Rohita |
collection | PubMed |
description | BACKGROUND: Cell-free DNA (cfDNA) has emerged as an important clinical specimen for detecting pathogenic microbes, especially in organ transplant patients, where the same data can be used to predict allograft rejection. Recent reports have described the viral, bacterial, or complete microbial diversity in plasma following cfDNA sequencing. The prevalence of certain viral families (Anelloviridae) is associated with immunosuppressant dosage and the risk of antibody-mediated rejection. Although informative, cfDNA reads are inherently short (~160 bp, or 2x75 bp) and dominated by host DNA (~97-99%), which complicates their taxonomic annotation and lowers specificity. Here we present a computational protocol that minimizes these challenges by combining "reference-assisted assembly" with K-mer profiles of NGS data for highly sensitive and specific microbial detection. METHODS: We developed a pipeline in which non-host NGS data (reads that do not map to the human genome) undergo reference-assisted assembly and then taxonomic annotation with KrakenUniq (a K-mer-based classifier). We trained KrakenUniq on a curated in-house database of ~12,000 viral genomes using three different K-mer values (16, 21, 31); final predictions are made by applying a majority-wins rule across the three. The default KrakenUniq database is currently used for bacterial and fungal metagenome analysis. We tested the method on 30 simulated samples and 124 clinical samples obtained from a biorepository. RESULTS: The protocol currently screens for a targeted list of pathogens (15 viral species, 16 bacterial genera, and 10 fungal genera). On a simulated set of viral sample mixes, the protocol had 100% accuracy. For the 124 clinical samples, predictions were evaluated for specificity and sensitivity using qPCR assays for the following viral species: EBV, BKV, JCV, HSV1/2, HHV7, and CMV. In total, 33 of 38 computational predictions (87%) were confirmed by qPCR. The detection sensitivity ranged from 6 to 10^6 copies/mL. CONCLUSION: Performing reference-assisted assembly followed by K-mer-based taxonomic annotation of cfDNA data led to the development of a novel and accurate pathogen detection protocol. DISCLOSURES: Rohita Sinha, PhD, Viracor-Eurofins (Employee); Steve Kleiboeker, DVM, PhD, Viracor-Eurofins (Employee); Michelle Altrich, PhD, Viracor-Eurofins (Employee); Ellis Bixler, MS, Viracor-Eurofins (Employee) |
format | Online Article Text |
id | pubmed-7776747 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7776747 2021-01-07 1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data Sinha, Rohita Kleiboeker, Steve Altrich, Michelle Bixler, Ellis Open Forum Infect Dis Poster Abstracts Oxford University Press 2020-12-31 /pmc/articles/PMC7776747/ http://dx.doi.org/10.1093/ofid/ofaa439.1395 Text en © The Author 2020. Published by Oxford University Press on behalf of Infectious Diseases Society of America. http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial reproduction and distribution of the work, in any medium, provided the original work is not altered or transformed in any way, and that the work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Poster Abstracts Sinha, Rohita Kleiboeker, Steve Altrich, Michelle Bixler, Ellis 1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data |
title | 1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data |
title_full | 1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data |
title_fullStr | 1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data |
title_full_unstemmed | 1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data |
title_short | 1210. K-mer Profiling Powered by Reference-assisted Assembly of NGS Data: A Highly Sensitive Protocol to Infer the Plasma Microbiome Using Cell-free DNA Sequence Data |
title_sort | 1210. k-mer profiling powered by reference-assisted assembly of ngs data: a highly sensitive protocol to infer the plasma microbiome using cell-free dna sequence data |
topic | Poster Abstracts |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7776747/ http://dx.doi.org/10.1093/ofid/ofaa439.1395 |
work_keys_str_mv | AT sinharohita 1210kmerprofilingpoweredbyreferenceassistedassemblyofngsdataahighlysensitiveprotocoltoinfertheplasmamicrobiomeusingcellfreednasequencedata AT kleiboekersteve 1210kmerprofilingpoweredbyreferenceassistedassemblyofngsdataahighlysensitiveprotocoltoinfertheplasmamicrobiomeusingcellfreednasequencedata AT altrichmichelle 1210kmerprofilingpoweredbyreferenceassistedassemblyofngsdataahighlysensitiveprotocoltoinfertheplasmamicrobiomeusingcellfreednasequencedata AT bixlerellis 1210kmerprofilingpoweredbyreferenceassistedassemblyofngsdataahighlysensitiveprotocoltoinfertheplasmamicrobiomeusingcellfreednasequencedata |
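
The majority-wins rule named in the METHODS section of the abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `majority_call`, the input shape (one taxon call per K-mer value, or `None` when a run makes no call), and the strict more-than-half threshold are all assumptions. The abstract states only that classifiers built with three K-mer values (16, 21, 31) are combined by majority.

```python
from collections import Counter
from typing import Mapping, Optional

# K-mer sizes reported in the abstract.
K_VALUES = (16, 21, 31)


def majority_call(calls: Mapping[int, Optional[str]]) -> Optional[str]:
    """Return the taxon reported by more than half of the per-k runs.

    `calls` maps a K-mer size to the taxon that run reported, or None
    if that run made no call (hypothetical input format).
    """
    votes = Counter(taxon for taxon in calls.values() if taxon is not None)
    if not votes:
        return None
    taxon, count = votes.most_common(1)[0]
    # Strict majority over all runs, including runs with no call (assumption).
    return taxon if count > len(calls) / 2 else None


# Two of three runs agree, so the shared call wins.
print(majority_call({16: "EBV", 21: "EBV", 31: "CMV"}))

# Confirmation rate reported in the abstract: 33 of 38 predictions.
print(f"{100 * 33 / 38:.0f}%")
```

Counting a missing call against the majority is the conservative choice here; a pipeline could instead take the majority among only the runs that produced a call, which would trade specificity for sensitivity.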