
Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues

The Dishevelled/EGL-10/Pleckstrin (DEP) domain-containing (DEPDC) proteins have seven members. However, whether this superfamily can be distinguished from other proteins based only on the amino acid sequences remains unknown. Here, we describe a computational method to segregate DEPDCs and non-DEPD...


Bibliographic Details
Main Authors: Liao, Zhijun; Wang, Xinrui; Zeng, Yeting; Zou, Quan
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5175133/
https://www.ncbi.nlm.nih.gov/pubmed/28000796
http://dx.doi.org/10.1038/srep39655
_version_ 1782484600829771776
author Liao, Zhijun
Wang, Xinrui
Zeng, Yeting
Zou, Quan
author_facet Liao, Zhijun
Wang, Xinrui
Zeng, Yeting
Zou, Quan
author_sort Liao, Zhijun
collection PubMed
description The Dishevelled/EGL-10/Pleckstrin (DEP) domain-containing (DEPDC) proteins have seven members. However, whether this superfamily can be distinguished from other proteins based only on the amino acid sequences remains unknown. Here, we describe a computational method to segregate DEPDCs and non-DEPDCs. First, we examined the Pfam numbers of the known DEPDCs and used the longest sequences for each Pfam to construct a phylogenetic tree. Subsequently, we extracted 188-dimensional (188D) and 20D features of DEPDCs and non-DEPDCs and classified them with a random forest classifier. We also mined the motifs of human DEPDCs to find the related domains. Finally, we designed experimental verification methods of human DEPDC expression at the mRNA level in hepatocellular carcinoma (HCC) and adjacent normal tissues. The phylogenetic analysis showed that the DEPDC superfamily can be divided into three clusters. Moreover, the 188D and 20D features can both be used to effectively distinguish the two protein types. Motif analysis revealed that the DEP and RhoGAP domains were common in human DEPDCs, and that both human HCC and the adjacent tissues widely expressed DEPDCs. However, their regulation was not identical. In conclusion, we successfully constructed a binary classifier for DEPDCs and experimentally verified their expression in human HCC tissues.
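The record does not define the 20D features mentioned in the description. As an illustration only, the sketch below assumes a 20-dimensional vector means the relative frequency of each of the 20 standard amino acids in a sequence, which is one common way to obtain 20 features from sequence alone; the authors' actual definition may differ. The function name `composition_features` and the toy sequence are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence: str) -> List[float]:
    """Return a 20-dimensional amino acid composition vector.

    Assumption: "20D" is read here as per-residue relative frequency.
    Characters outside the 20 standard residues (e.g. X, B, Z) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully non-standard sequence: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    toy_sequence = "MKTAYIAKQR"  # hypothetical toy input, not a real DEPDC
    vector = composition_features(toy_sequence)
    print(len(vector), round(sum(vector), 6))
```

Vectors of this kind, one per protein and labelled DEPDC or non-DEPDC, could then be passed to any binary classifier, such as the random forest named in the description.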
format Online
Article
Text
id pubmed-5175133
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5175133 2016-12-28 Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues Liao, Zhijun Wang, Xinrui Zeng, Yeting Zou, Quan Sci Rep Article The Dishevelled/EGL-10/Pleckstrin (DEP) domain-containing (DEPDC) proteins have seven members. However, whether this superfamily can be distinguished from other proteins based only on the amino acid sequences remains unknown. Here, we describe a computational method to segregate DEPDCs and non-DEPDCs. First, we examined the Pfam numbers of the known DEPDCs and used the longest sequences for each Pfam to construct a phylogenetic tree. Subsequently, we extracted 188-dimensional (188D) and 20D features of DEPDCs and non-DEPDCs and classified them with a random forest classifier. We also mined the motifs of human DEPDCs to find the related domains. Finally, we designed experimental verification methods of human DEPDC expression at the mRNA level in hepatocellular carcinoma (HCC) and adjacent normal tissues. The phylogenetic analysis showed that the DEPDC superfamily can be divided into three clusters. Moreover, the 188D and 20D features can both be used to effectively distinguish the two protein types. Motif analysis revealed that the DEP and RhoGAP domains were common in human DEPDCs, and that both human HCC and the adjacent tissues widely expressed DEPDCs. However, their regulation was not identical. In conclusion, we successfully constructed a binary classifier for DEPDCs and experimentally verified their expression in human HCC tissues. Nature Publishing Group 2016-12-21 /pmc/articles/PMC5175133/ /pubmed/28000796 http://dx.doi.org/10.1038/srep39655 Text en Copyright © 2016, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle Article
Liao, Zhijun
Wang, Xinrui
Zeng, Yeting
Zou, Quan
Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues
title Identification of DEP domain-containing proteins by a machine learning method and experimental analysis of their expression in human HCC tissues
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5175133/
https://www.ncbi.nlm.nih.gov/pubmed/28000796
http://dx.doi.org/10.1038/srep39655