Cargando…
Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection
BACKGROUND: Several studies demonstrated the feasibility of predicting bacterial antibiotic resistance phenotypes from whole-genome sequences, the prediction process usually amounting to detecting the presence of genes involved in antibiotic resistance mechanisms, or of specific mutations, previousl...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
BioMed Central
2018
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6192184/ https://www.ncbi.nlm.nih.gov/pubmed/30332990 http://dx.doi.org/10.1186/s12859-018-2403-z |
Sumario: | BACKGROUND: Several studies demonstrated the feasibility of predicting bacterial antibiotic resistance phenotypes from whole-genome sequences, the prediction process usually amounting to detecting the presence of genes involved in antibiotic resistance mechanisms, or of specific mutations, previously identified from a training panel of strains, within these genes. We address the problem from the supervised statistical learning perspective, not relying on prior information about such resistance factors. We rely on a k-mer based genotyping scheme and a logistic regression model, thereby combining several k-mers into a probabilistic model. To identify a small yet predictive set of k-mers, we rely on the stability selection approach (Meinshausen et al., J R Stat Soc Ser B 72:417–73, 2010), that consists in penalizing logistic regression models with a Lasso penalty, coupled with extensive resampling procedures. RESULTS: Using public datasets, we applied the resulting classifiers to two bacterial species and achieved predictive performance equivalent to state of the art. The models are extremely sparse, involving 1 to 8 k-mers per antibiotic, hence are remarkably easy and fast to evaluate on new genomes (from raw reads to assemblies). CONCLUSION: Our proof of concept therefore demonstrates that stability selection is a powerful approach to investigate bacterial genotype-phenotype relationships. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2403-z) contains supplementary material, which is available to authorized users. |
---|