Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues
The Dishevelled/EGL-10/Pleckstrin (DEP) domain-containing (DEPDC) proteins have seven members. However, whether this superfamily can be distinguished from other proteins based only on amino acid sequences remains unknown. Here, we describe a computational method to segregate DEPDCs and non-DEPD...
Main authors: | Liao, Zhijun; Wang, Xinrui; Zeng, Yeting; Zou, Quan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5175133/ https://www.ncbi.nlm.nih.gov/pubmed/28000796 http://dx.doi.org/10.1038/srep39655 |
_version_ | 1782484600829771776 |
---|---|
author | Liao, Zhijun Wang, Xinrui Zeng, Yeting Zou, Quan |
author_facet | Liao, Zhijun Wang, Xinrui Zeng, Yeting Zou, Quan |
author_sort | Liao, Zhijun |
collection | PubMed |
description | The Dishevelled/EGL-10/Pleckstrin (DEP) domain-containing (DEPDC) proteins have seven members. However, whether this superfamily can be distinguished from other proteins based only on amino acid sequences remains unknown. Here, we describe a computational method to segregate DEPDCs and non-DEPDCs. First, we examined the Pfam numbers of the known DEPDCs and used the longest sequences for each Pfam to construct a phylogenetic tree. Subsequently, we extracted 188-dimensional (188D) and 20D features of DEPDCs and non-DEPDCs and classified them with a random forest classifier. We also mined the motifs of human DEPDCs to find the related domains. Finally, we designed experimental verification methods of human DEPDC expression at the mRNA level in hepatocellular carcinoma (HCC) and adjacent normal tissues. The phylogenetic analysis showed that the DEPDC superfamily can be divided into three clusters. Moreover, the 188D and 20D features can both be used to effectively distinguish the two protein types. Motif analysis revealed that the DEP and RhoGAP domains were common in human DEPDCs, and that DEPDCs were widely expressed in human HCC and the adjacent tissues. However, their regulation was not identical. In conclusion, we successfully constructed a binary classifier for DEPDCs and experimentally verified their expression in human HCC tissues. |
format | Online Article Text |
id | pubmed-5175133 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-5175133 2016-12-28 Liao, Zhijun Wang, Xinrui Zeng, Yeting Zou, Quan Sci Rep Article Nature Publishing Group 2016-12-21 /pmc/articles/PMC5175133/ /pubmed/28000796 http://dx.doi.org/10.1038/srep39655 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
title | Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5175133/ https://www.ncbi.nlm.nih.gov/pubmed/28000796 http://dx.doi.org/10.1038/srep39655 |
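The description above mentions 20D sequence features fed to a random forest classifier but does not define them. As a minimal sketch, assuming the 20 dimensions are the plain amino acid composition of each sequence (this may differ from the authors' actual feature set), the feature vector could be computed as follows. The function name `composition_20d` and the toy peptide are hypothetical and used only for illustration.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids as one-letter codes; this ordering is an assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_20d(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Characters outside the 20 standard residues (e.g. X or gaps) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No standard residues: return an all-zero vector instead of dividing by zero.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy peptide, not a real DEPDC sequence.
features = composition_20d("MKKLA")
print(len(features))
```

Vectors of this kind, one per protein and labeled DEPDC or non-DEPDC, are the sort of fixed-length input a random forest implementation would accept.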