Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data
The steady elaboration of the Metagenomic and Metadesign of Subways and Urban Biomes (MetaSUB) international consortium project raises important new questions about the origin, variation, and antimicrobial resistance of the collected samples. CAMDA (Critical Assessment of Massive Data Analysis, http...
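Main authors: | Zhelyazkova, Maya; Yordanova, Roumyana; Mihaylov, Iliyan; Kirov, Stefan; Tsonev, Stefan; Danko, David; Mason, Christopher; Vassilev, Dimitar |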
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2021 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7983949/ https://www.ncbi.nlm.nih.gov/pubmed/33763122 http://dx.doi.org/10.3389/fgene.2021.642991 |
_version_ | 1783667971487432704 |
---|---|
author | Zhelyazkova, Maya Yordanova, Roumyana Mihaylov, Iliyan Kirov, Stefan Tsonev, Stefan Danko, David Mason, Christopher Vassilev, Dimitar |
author_facet | Zhelyazkova, Maya Yordanova, Roumyana Mihaylov, Iliyan Kirov, Stefan Tsonev, Stefan Danko, David Mason, Christopher Vassilev, Dimitar |
author_sort | Zhelyazkova, Maya |
collection | PubMed |
description | The steady elaboration of the Metagenomic and Metadesign of Subways and Urban Biomes (MetaSUB) international consortium project raises important new questions about the origin, variation, and antimicrobial resistance of the collected samples. CAMDA (Critical Assessment of Massive Data Analysis, http://camda.info/) forum organizes annual challenges where different bioinformatics and statistical approaches are tested on samples collected around the world for bacterial classification and prediction of geographical origin. This work proposes a method which not only predicts the locations of unknown samples, but also estimates the relative risk of antimicrobial resistance through spatial modeling. We introduce a new component in the standard analysis as we apply a Bayesian spatial convolution model which accounts for spatial structure of the data as defined by the longitude and latitude of the samples and assess the relative risk of antimicrobial resistance taxa across regions which is relevant to public health. We can then use the estimated relative risk as a new measure for antimicrobial resistance. We also compare the performance of several machine learning methods, such as Gradient Boosting Machine, Random Forest, and Neural Network to predict the geographical origin of the mystery samples. All three methods show consistent results with some superiority of Random Forest classifier. In our future work we can consider a broader class of spatial models and incorporate covariates related to the environment and climate profiles of the samples to achieve more reliable estimation of the relative risk related to antimicrobial resistance. |
format | Online Article Text |
id | pubmed-7983949 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-79839492021-03-23 Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data Zhelyazkova, Maya Yordanova, Roumyana Mihaylov, Iliyan Kirov, Stefan Tsonev, Stefan Danko, David Mason, Christopher Vassilev, Dimitar Front Genet Genetics The steady elaboration of the Metagenomic and Metadesign of Subways and Urban Biomes (MetaSUB) international consortium project raises important new questions about the origin, variation, and antimicrobial resistance of the collected samples. CAMDA (Critical Assessment of Massive Data Analysis, http://camda.info/) forum organizes annual challenges where different bioinformatics and statistical approaches are tested on samples collected around the world for bacterial classification and prediction of geographical origin. This work proposes a method which not only predicts the locations of unknown samples, but also estimates the relative risk of antimicrobial resistance through spatial modeling. We introduce a new component in the standard analysis as we apply a Bayesian spatial convolution model which accounts for spatial structure of the data as defined by the longitude and latitude of the samples and assess the relative risk of antimicrobial resistance taxa across regions which is relevant to public health. We can then use the estimated relative risk as a new measure for antimicrobial resistance. We also compare the performance of several machine learning methods, such as Gradient Boosting Machine, Random Forest, and Neural Network to predict the geographical origin of the mystery samples. All three methods show consistent results with some superiority of Random Forest classifier. In our future work we can consider a broader class of spatial models and incorporate covariates related to the environment and climate profiles of the samples to achieve more reliable estimation of the relative risk related to antimicrobial resistance. Frontiers Media S.A. 
2021-03-04 /pmc/articles/PMC7983949/ /pubmed/33763122 http://dx.doi.org/10.3389/fgene.2021.642991 Text en Copyright © 2021 Zhelyazkova, Yordanova, Mihaylov, Kirov, Tsonev, Danko, Mason and Vassilev. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Zhelyazkova, Maya Yordanova, Roumyana Mihaylov, Iliyan Kirov, Stefan Tsonev, Stefan Danko, David Mason, Christopher Vassilev, Dimitar Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data |
title | Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data |
title_full | Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data |
title_fullStr | Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data |
title_full_unstemmed | Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data |
title_short | Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data |
title_sort | origin sample prediction and spatial modeling of antimicrobial resistance in metagenomic sequencing data |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7983949/ https://www.ncbi.nlm.nih.gov/pubmed/33763122 http://dx.doi.org/10.3389/fgene.2021.642991 |
work_keys_str_mv | AT zhelyazkovamaya originsamplepredictionandspatialmodelingofantimicrobialresistanceinmetagenomicsequencingdata AT yordanovaroumyana originsamplepredictionandspatialmodelingofantimicrobialresistanceinmetagenomicsequencingdata AT mihayloviliyan originsamplepredictionandspatialmodelingofantimicrobialresistanceinmetagenomicsequencingdata AT kirovstefan originsamplepredictionandspatialmodelingofantimicrobialresistanceinmetagenomicsequencingdata AT tsonevstefan originsamplepredictionandspatialmodelingofantimicrobialresistanceinmetagenomicsequencingdata AT dankodavid originsamplepredictionandspatialmodelingofantimicrobialresistanceinmetagenomicsequencingdata AT masonchristopher originsamplepredictionandspatialmodelingofantimicrobialresistanceinmetagenomicsequencingdata AT vassilevdimitar originsamplepredictionandspatialmodelingofantimicrobialresistanceinmetagenomicsequencingdata |