
NetAllergen, a random forest model integrating MHC-II presentation propensity for improved allergenicity prediction



Bibliographic Details
Main authors: Li, Yuchen; Sackett, Peter Wad; Nielsen, Morten; Barra, Carolina
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10603389/
https://www.ncbi.nlm.nih.gov/pubmed/37901344
http://dx.doi.org/10.1093/bioadv/vbad151
author Li, Yuchen
Sackett, Peter Wad
Nielsen, Morten
Barra, Carolina
collection PubMed
description MOTIVATION: Allergy is a pathological immune reaction towards innocuous protein antigens. Although only a narrow fraction of plant or animal proteins induce allergy, atopic disorders affect millions of children and adults and cost billions in healthcare systems worldwide. In silico predictors can aid in the development of more innocuous food sources. Previous allergenicity predictors used sequence similarity, common structural domains, and amino acid physicochemical features. However, these predictors strongly rely on sequence similarity to known allergens and fail to predict protein allergenicity accurately when similarity diminishes. RESULTS: To overcome these limitations, we collected allergens from AllergenOnline, a curated database of IgE-inducing allergens, carefully removed allergen redundancy with a novel protein partitioning pipeline, and developed a new allergen prediction method, introducing MHC presentation propensity as a novel feature. NetAllergen outperformed a sequence similarity-based BLAST baseline approach, and previous allergenicity predictor AlgPred 2 when similarity to known allergens is limited. AVAILABILITY AND IMPLEMENTATION: The web service NetAllergen and the datasets are available at https://services.healthtech.dtu.dk/services/NetAllergen-1.0/.
format Online
Article
Text
id pubmed-10603389
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10603389 2023-10-28 NetAllergen, a random forest model integrating MHC-II presentation propensity for improved allergenicity prediction Li, Yuchen Sackett, Peter Wad Nielsen, Morten Barra, Carolina Bioinform Adv Original Article Oxford University Press 2023-10-16 /pmc/articles/PMC10603389/ /pubmed/37901344 http://dx.doi.org/10.1093/bioadv/vbad151 Text en © The Author(s) 2023. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title NetAllergen, a random forest model integrating MHC-II presentation propensity for improved allergenicity prediction
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10603389/
https://www.ncbi.nlm.nih.gov/pubmed/37901344
http://dx.doi.org/10.1093/bioadv/vbad151