
MHC2MIL: a novel multiple instance learning based method for MHC-II peptide binding prediction by considering peptide flanking region and residue positions

BACKGROUND: Computational prediction of major histocompatibility complex class II (MHC-II) binding peptides can assist researchers in understanding the mechanism of immune systems and developing peptide based vaccines. Although many computational methods have been proposed, the performance of these...


Bibliographic Details
Main authors: Xu, Yichang; Luo, Cheng; Qian, Mingjie; Huang, Xiaodi; Zhu, Shanfeng
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290625/
https://www.ncbi.nlm.nih.gov/pubmed/25521198
http://dx.doi.org/10.1186/1471-2164-15-S9-S9
_version_ 1782352275461636096
author Xu, Yichang
Luo, Cheng
Qian, Mingjie
Huang, Xiaodi
Zhu, Shanfeng
author_facet Xu, Yichang
Luo, Cheng
Qian, Mingjie
Huang, Xiaodi
Zhu, Shanfeng
author_sort Xu, Yichang
collection PubMed
description BACKGROUND: Computational prediction of major histocompatibility complex class II (MHC-II) binding peptides can assist researchers in understanding the mechanisms of the immune system and in developing peptide-based vaccines. Although many computational methods have been proposed, their performance is far from satisfactory. The difficulty of MHC-II peptide binding prediction comes mainly from the large length variation of binding peptides. METHODS: We develop a novel multiple instance learning based method called MHC2MIL to predict MHC-II binding peptides. MHC2MIL treats each peptide as a bag and some substrings of the peptide as the instances in the bag. Unlike previous multiple instance learning based methods, which consider only instances of a fixed length of 9 amino acids, MHC2MIL handles instances of lengths 9 and 11 amino acids simultaneously. As such, MHC2MIL incorporates important information in the peptide flanking region. Furthermore, when measuring the distances between instances, MHC2MIL explicitly emphasizes the amino acids at some important positions. RESULTS: Experimental results on a benchmark dataset show that the performance of MHC2MIL is significantly improved by considering instances of both 9 and 11 amino acids, as well as by emphasizing amino acids at key positions in the instance. The results are consistent with those reported in the literature on MHC-II peptide binding. In addition to the five important positions (1, 4, 6, 7 and 9) for HLA (human leukocyte antigen, the name of MHC in humans) DR peptide binding, we also find that position 2 may play a role in the binding process. In 5-fold cross-validation on the benchmark dataset, MHC2MIL outperforms two state-of-the-art methods, MHC2SK and NN-align, with statistical significance on 12 HLA DP and DQ molecules. In addition, it achieves performance comparable to MHC2SK and NN-align on 14 HLA DR molecules. MHC2MIL is freely available at http://datamining-iip.fudan.edu.cn/service/MHC2MIL/index.html.
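The bag-and-instance idea in the METHODS paragraph can be illustrated with a minimal Python sketch. It is not the MHC2MIL implementation: the function names `make_bag` and `weighted_mismatch`, the mismatch-count distance, the weight value 2.0, and the assumption that an 11-mer is a 9-mer core with one flanking residue on each side are all hypothetical choices made for illustration. Only the instance lengths (9 and 11) and the key positions (1, 4, 6, 7 and 9) come from the abstract.

```python
from typing import Dict, List, Sequence

# Key core positions named in the abstract for HLA DR binding (1-based).
KEY_POSITIONS = (1, 4, 6, 7, 9)


def make_bag(peptide: str, lengths: Sequence[int] = (9, 11)) -> Dict[int, List[str]]:
    """Treat a peptide as a bag: collect every substring of each given length.

    Peptides shorter than a given length yield an empty instance list.
    """
    return {
        k: [peptide[i:i + k] for i in range(len(peptide) - k + 1)]
        for k in lengths
    }


def weighted_mismatch(a: str, b: str, key_weight: float = 2.0) -> float:
    """Illustrative distance: count mismatches, weighting key positions more.

    Assumption (not from the source): an 11-mer is a 9-mer core flanked by
    one residue on each side, so core position 1 sits at index 1.
    The weight 2.0 is an arbitrary placeholder, not a published parameter.
    """
    if len(a) != len(b):
        raise ValueError("instances must have the same length")
    offset = (len(a) - 9) // 2  # 0 for 9-mers, 1 for 11-mers
    total = 0.0
    for idx, (x, y) in enumerate(zip(a, b)):
        if x != y:
            core_pos = idx - offset + 1  # 1-based position within the core
            total += key_weight if core_pos in KEY_POSITIONS else 1.0
    return total


if __name__ == "__main__":
    bag = make_bag("ACDEFGHIKLMNPQR")  # arbitrary 15-residue example string
    print(len(bag[9]), len(bag[11]))
    print(weighted_mismatch("ACDEFGHIK", "CCDEFGHIK"))
```

A 15-residue peptide yields seven 9-mer and five 11-mer instances, which is how variable-length peptides are reduced to fixed-length units that can be compared position by position.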
format Online
Article
Text
id pubmed-4290625
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4290625 2015-01-15 MHC2MIL: a novel multiple instance learning based method for MHC-II peptide binding prediction by considering peptide flanking region and residue positions Xu, Yichang Luo, Cheng Qian, Mingjie Huang, Xiaodi Zhu, Shanfeng BMC Genomics Research BACKGROUND: Computational prediction of major histocompatibility complex class II (MHC-II) binding peptides can assist researchers in understanding the mechanisms of the immune system and in developing peptide-based vaccines. Although many computational methods have been proposed, their performance is far from satisfactory. The difficulty of MHC-II peptide binding prediction comes mainly from the large length variation of binding peptides. METHODS: We develop a novel multiple instance learning based method called MHC2MIL to predict MHC-II binding peptides. MHC2MIL treats each peptide as a bag and some substrings of the peptide as the instances in the bag. Unlike previous multiple instance learning based methods, which consider only instances of a fixed length of 9 amino acids, MHC2MIL handles instances of lengths 9 and 11 amino acids simultaneously. As such, MHC2MIL incorporates important information in the peptide flanking region. Furthermore, when measuring the distances between instances, MHC2MIL explicitly emphasizes the amino acids at some important positions. RESULTS: Experimental results on a benchmark dataset show that the performance of MHC2MIL is significantly improved by considering instances of both 9 and 11 amino acids, as well as by emphasizing amino acids at key positions in the instance. The results are consistent with those reported in the literature on MHC-II peptide binding. In addition to the five important positions (1, 4, 6, 7 and 9) for HLA (human leukocyte antigen, the name of MHC in humans) DR peptide binding, we also find that position 2 may play a role in the binding process. In 5-fold cross-validation on the benchmark dataset, MHC2MIL outperforms two state-of-the-art methods, MHC2SK and NN-align, with statistical significance on 12 HLA DP and DQ molecules. In addition, it achieves performance comparable to MHC2SK and NN-align on 14 HLA DR molecules. MHC2MIL is freely available at http://datamining-iip.fudan.edu.cn/service/MHC2MIL/index.html. BioMed Central 2014-12-08 /pmc/articles/PMC4290625/ /pubmed/25521198 http://dx.doi.org/10.1186/1471-2164-15-S9-S9 Text en Copyright © 2014 Xu et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Xu, Yichang
Luo, Cheng
Qian, Mingjie
Huang, Xiaodi
Zhu, Shanfeng
MHC2MIL: a novel multiple instance learning based method for MHC-II peptide binding prediction by considering peptide flanking region and residue positions
title MHC2MIL: a novel multiple instance learning based method for MHC-II peptide binding prediction by considering peptide flanking region and residue positions
title_full MHC2MIL: a novel multiple instance learning based method for MHC-II peptide binding prediction by considering peptide flanking region and residue positions
title_fullStr MHC2MIL: a novel multiple instance learning based method for MHC-II peptide binding prediction by considering peptide flanking region and residue positions
title_full_unstemmed MHC2MIL: a novel multiple instance learning based method for MHC-II peptide binding prediction by considering peptide flanking region and residue positions
title_short MHC2MIL: a novel multiple instance learning based method for MHC-II peptide binding prediction by considering peptide flanking region and residue positions
title_sort mhc2mil: a novel multiple instance learning based method for mhc-ii peptide binding prediction by considering peptide flanking region and residue positions
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290625/
https://www.ncbi.nlm.nih.gov/pubmed/25521198
http://dx.doi.org/10.1186/1471-2164-15-S9-S9
work_keys_str_mv AT xuyichang mhc2milanovelmultipleinstancelearningbasedmethodformhciipeptidebindingpredictionbyconsideringpeptideflankingregionandresiduepositions
AT luocheng mhc2milanovelmultipleinstancelearningbasedmethodformhciipeptidebindingpredictionbyconsideringpeptideflankingregionandresiduepositions
AT qianmingjie mhc2milanovelmultipleinstancelearningbasedmethodformhciipeptidebindingpredictionbyconsideringpeptideflankingregionandresiduepositions
AT huangxiaodi mhc2milanovelmultipleinstancelearningbasedmethodformhciipeptidebindingpredictionbyconsideringpeptideflankingregionandresiduepositions
AT zhushanfeng mhc2milanovelmultipleinstancelearningbasedmethodformhciipeptidebindingpredictionbyconsideringpeptideflankingregionandresiduepositions