AllerCatPro—prediction of protein allergenicity potential from the protein sequence

MOTIVATION: Due to the risk of inducing an immediate Type I (IgE-mediated) allergic response, proteins intended for use in consumer products must be investigated for their allergenic potential before introduction into the marketplace. The FAO/WHO guidelines for computational assessment of allergenic...

Bibliographic Details
Main Authors: Maurer-Stroh, Sebastian, Krutz, Nora L, Kern, Petra S, Gunalan, Vithiagaran, Nguyen, Minh N, Limviphuvadh, Vachiranee, Eisenhaber, Frank, Gerberick, G Frank
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6736023/
https://www.ncbi.nlm.nih.gov/pubmed/30657872
http://dx.doi.org/10.1093/bioinformatics/btz029
_version_ 1783450444786302976
author Maurer-Stroh, Sebastian
Krutz, Nora L
Kern, Petra S
Gunalan, Vithiagaran
Nguyen, Minh N
Limviphuvadh, Vachiranee
Eisenhaber, Frank
Gerberick, G Frank
author_sort Maurer-Stroh, Sebastian
collection PubMed
description MOTIVATION: Due to the risk of inducing an immediate Type I (IgE-mediated) allergic response, proteins intended for use in consumer products must be investigated for their allergenic potential before introduction into the marketplace. The FAO/WHO guidelines for computational assessment of the allergenic potential of proteins, which are based on short peptide hits and linear sequence window identity thresholds, misclassify many proteins as allergens. RESULTS: We developed AllerCatPro, which predicts the allergenic potential of proteins based on the similarity of their 3D protein structure as well as their amino acid sequence to a data set of known protein allergens comprising 4180 unique allergenic protein sequences derived from the union of the major databases Food Allergy Research and Resource Program, Comprehensive Protein Allergen Resource, WHO/International Union of Immunological Societies, UniProtKB and Allergome. We extended the hexamer hit rule by removing peptides with a high probability of random occurrence, measured by sequence entropy, and by requiring 3 or more hexamer hits consistent with natural linear epitope patterns in known allergens. This is complemented by gluten-like repeat pattern detection. We also switched from a linear sequence window similarity to a B-cell epitope-like 3D surface similarity window, which became possible through extensive 3D structure modeling covering the majority (74%) of allergens. If no structure similarity is found, the decision workflow reverts to the old linear sequence window rule. The overall accuracy of AllerCatPro is 84%, compared with 51 to 73% for other current methods. Both the FAO/WHO rules and AllerCatPro achieve the highest sensitivity, but AllerCatPro provides a 37-fold increase in specificity. AVAILABILITY AND IMPLEMENTATION: https://allercatpro.bii.a-star.edu.sg/ SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
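The extended hexamer hit rule described in the abstract can be illustrated with a minimal Python sketch. This is not the AllerCatPro implementation: the function names, the Shannon entropy cutoff of 1.5 bits, the choice to discard low-entropy (low-complexity) peptides, and the counting of overlapping hits are all assumptions made for illustration. The check for consistency with natural linear epitope patterns mentioned in the abstract is not modeled here.

```python
import math
from collections import Counter


def shannon_entropy(peptide: str) -> float:
    """Shannon entropy (bits) of the residue composition of a peptide."""
    n = len(peptide)
    counts = Counter(peptide)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def kmers(sequence: str, k: int = 6) -> set:
    """All distinct contiguous k-mers of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def hexamer_hits(query: str, allergens: list, min_entropy: float = 1.5, k: int = 6) -> list:
    """Return (position, peptide) pairs for query hexamers that match a known
    allergen exactly and pass the entropy filter.

    min_entropy is a hypothetical cutoff, not a value from the paper.
    """
    known = set()
    for allergen in allergens:
        known |= kmers(allergen, k)
    hits = []
    for i in range(len(query) - k + 1):
        peptide = query[i:i + k]
        if peptide in known and shannon_entropy(peptide) >= min_entropy:
            hits.append((i, peptide))
    return hits


def flagged_by_hexamers(query: str, allergens: list, min_hits: int = 3,
                        min_entropy: float = 1.5) -> bool:
    """Flag the query only if it has at least min_hits filtered hexamer hits."""
    return len(hexamer_hits(query, allergens, min_entropy)) >= min_hits


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_allergens = ["MKTAYIAKQRQISFVKSHFSRQ", "GGQQQQQQQQQPF"]
    print(flagged_by_hexamers("AAKTAYIAKQAA", toy_allergens))
    print(flagged_by_hexamers("PFQQQQQQQQQGG", toy_allergens))
```

In this sketch a single shared hexamer is never enough to flag a protein, and repetitive peptides such as a glutamine run are ignored by the entropy filter, which is the general idea behind reducing false positives relative to a plain six-residue exact-match rule.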
format Online
Article
Text
id pubmed-6736023
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-67360232019-09-16 AllerCatPro—prediction of protein allergenicity potential from the protein sequence Maurer-Stroh, Sebastian Krutz, Nora L Kern, Petra S Gunalan, Vithiagaran Nguyen, Minh N Limviphuvadh, Vachiranee Eisenhaber, Frank Gerberick, G Frank Bioinformatics Original Papers MOTIVATION: Due to the risk of inducing an immediate Type I (IgE-mediated) allergic response, proteins intended for use in consumer products must be investigated for their allergenic potential before introduction into the marketplace. The FAO/WHO guidelines for computational assessment of the allergenic potential of proteins, which are based on short peptide hits and linear sequence window identity thresholds, misclassify many proteins as allergens. RESULTS: We developed AllerCatPro, which predicts the allergenic potential of proteins based on the similarity of their 3D protein structure as well as their amino acid sequence to a data set of known protein allergens comprising 4180 unique allergenic protein sequences derived from the union of the major databases Food Allergy Research and Resource Program, Comprehensive Protein Allergen Resource, WHO/International Union of Immunological Societies, UniProtKB and Allergome. We extended the hexamer hit rule by removing peptides with a high probability of random occurrence, measured by sequence entropy, and by requiring 3 or more hexamer hits consistent with natural linear epitope patterns in known allergens. This is complemented by gluten-like repeat pattern detection. We also switched from a linear sequence window similarity to a B-cell epitope-like 3D surface similarity window, which became possible through extensive 3D structure modeling covering the majority (74%) of allergens. If no structure similarity is found, the decision workflow reverts to the old linear sequence window rule. The overall accuracy of AllerCatPro is 84%, compared with 51 to 73% for other current methods. Both the FAO/WHO rules and AllerCatPro achieve the highest sensitivity, but AllerCatPro provides a 37-fold increase in specificity. AVAILABILITY AND IMPLEMENTATION: https://allercatpro.bii.a-star.edu.sg/ SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2019-09-01 2019-01-18 /pmc/articles/PMC6736023/ /pubmed/30657872 http://dx.doi.org/10.1093/bioinformatics/btz029 Text en © The Author(s) 2019. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title AllerCatPro—prediction of protein allergenicity potential from the protein sequence
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6736023/
https://www.ncbi.nlm.nih.gov/pubmed/30657872
http://dx.doi.org/10.1093/bioinformatics/btz029