
ECMPride: prediction of human extracellular matrix proteins based on the ideal dataset using hybrid features with domain evidence

Extracellular matrix (ECM) proteins play an essential role in various biological processes in multicellular organisms, and their abnormal regulation can lead to many diseases. For large-scale ECM protein identification, especially through proteomic-based techniques, a theoretical reference database...


Bibliographic Details
Main authors: Liu, Binghui; Leng, Ling; Sun, Xuer; Wang, Yunfang; Ma, Jie; Zhu, Yunping
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7195829/
https://www.ncbi.nlm.nih.gov/pubmed/32377454
http://dx.doi.org/10.7717/peerj.9066
author Liu, Binghui
Leng, Ling
Sun, Xuer
Wang, Yunfang
Ma, Jie
Zhu, Yunping
collection PubMed
description Extracellular matrix (ECM) proteins play an essential role in various biological processes in multicellular organisms, and their abnormal regulation can lead to many diseases. For large-scale ECM protein identification, especially through proteomic-based techniques, a theoretical reference database of ECM proteins is required. In this study, based on the experimentally verified ECM datasets and by the integration of protein domain features and a machine learning model, we developed ECMPride, a flexible and scalable tool for predicting ECM proteins. ECMPride achieved excellent performance in predicting ECM proteins, with appropriate balanced accuracy and sensitivity, and the performance of ECMPride was shown to be superior to the previously developed tool. A new theoretical dataset of human ECM components was also established by applying ECMPride to all human entries in the SwissProt database, containing a significant number of putative ECM proteins as well as the abundant biological annotations. This dataset might serve as a valuable reference resource for ECM protein identification.
format Online
Article
Text
id pubmed-7195829
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7195829 2020-05-06 ECMPride: prediction of human extracellular matrix proteins based on the ideal dataset using hybrid features with domain evidence Liu, Binghui; Leng, Ling; Sun, Xuer; Wang, Yunfang; Ma, Jie; Zhu, Yunping PeerJ Bioinformatics PeerJ Inc. 2020-04-29 /pmc/articles/PMC7195829/ /pubmed/32377454 http://dx.doi.org/10.7717/peerj.9066 Text en ©2020 Liu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
topic Bioinformatics