Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data

The steady elaboration of the Metagenomic and Metadesign of Subways and Urban Biomes (MetaSUB) international consortium project raises important new questions about the origin, variation, and antimicrobial resistance of the collected samples. CAMDA (Critical Assessment of Massive Data Analysis, http...

Bibliographic Details
Main authors: Zhelyazkova, Maya, Yordanova, Roumyana, Mihaylov, Iliyan, Kirov, Stefan, Tsonev, Stefan, Danko, David, Mason, Christopher, Vassilev, Dimitar
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7983949/
https://www.ncbi.nlm.nih.gov/pubmed/33763122
http://dx.doi.org/10.3389/fgene.2021.642991
_version_ 1783667971487432704
author Zhelyazkova, Maya
Yordanova, Roumyana
Mihaylov, Iliyan
Kirov, Stefan
Tsonev, Stefan
Danko, David
Mason, Christopher
Vassilev, Dimitar
author_sort Zhelyazkova, Maya
collection PubMed
description The steady development of the Metagenomic and Metadesign of Subways and Urban Biomes (MetaSUB) international consortium project raises important new questions about the origin, variation, and antimicrobial resistance of the collected samples. The CAMDA (Critical Assessment of Massive Data Analysis, http://camda.info/) forum organizes annual challenges in which different bioinformatics and statistical approaches are tested on samples collected around the world for bacterial classification and prediction of geographical origin. This work proposes a method that not only predicts the locations of unknown samples but also estimates the relative risk of antimicrobial resistance through spatial modeling. We add a new component to the standard analysis by applying a Bayesian spatial convolution model, which accounts for the spatial structure of the data as defined by the longitude and latitude of the samples, and we assess the relative risk of antimicrobial resistance taxa across regions, which is relevant to public health. The estimated relative risk can then be used as a new measure of antimicrobial resistance. We also compare the performance of several machine learning methods (Gradient Boosting Machine, Random Forest, and Neural Network) in predicting the geographical origin of the mystery samples. All three methods give consistent results, with Random Forest performing somewhat better. In future work, we can consider a broader class of spatial models and incorporate covariates related to the environment and climate profiles of the samples to achieve more reliable estimation of the relative risk of antimicrobial resistance.
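The abstract does not specify how relative risk is computed, so the following is only a minimal sketch of the unsmoothed starting point that spatial models of this kind typically refine: the ratio of observed to expected antimicrobial-resistance counts per region. The function name, region labels, and counts are hypothetical and are not taken from the article; a Bayesian convolution model would additionally smooth these ratios using spatially structured and unstructured random effects, which this sketch does not attempt.

```python
from typing import Dict, Tuple


def crude_relative_risk(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Return the observed/expected ratio of AMR counts for each region.

    `counts` maps a region label to (amr_count, total_count).
    The expected count for a region is the pooled AMR rate across all
    regions multiplied by that region's total count.
    This is an illustrative, unsmoothed estimate (hypothetical helper).
    """
    total_amr = sum(amr for amr, _ in counts.values())
    total_all = sum(total for _, total in counts.values())
    if total_all == 0:
        raise ValueError("total counts must not all be zero")

    pooled_rate = total_amr / total_all
    risks: Dict[str, float] = {}
    for region, (amr, total) in counts.items():
        expected = pooled_rate * total
        # A ratio above 1 means more AMR than the pooled rate predicts.
        risks[region] = amr / expected if expected > 0 else float("nan")
    return risks


# Hypothetical counts, for illustration only.
example = {
    "city_a": (30, 1000),
    "city_b": (10, 1000),
    "city_c": (20, 2000),
}
risks = crude_relative_risk(example)
for region, risk in sorted(risks.items()):
    print(f"{region}: {risk:.2f}")
```

Regions with few observations give unstable ratios under this crude estimate, which is the usual motivation for borrowing information from neighboring regions in a spatial model.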
format Online
Article
Text
id pubmed-7983949
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7983949 2021-03-23 Front Genet Genetics Frontiers Media S.A. 2021-03-04 /pmc/articles/PMC7983949/ /pubmed/33763122 http://dx.doi.org/10.3389/fgene.2021.642991 Text en Copyright © 2021 Zhelyazkova, Yordanova, Mihaylov, Kirov, Tsonev, Danko, Mason and Vassilev. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7983949/
https://www.ncbi.nlm.nih.gov/pubmed/33763122
http://dx.doi.org/10.3389/fgene.2021.642991