Cargando…

ACNNT3: Attention-CNN Framework for Prediction of Sequence-Based Bacterial Type III Secreted Effectors

The type III secretion system (T3SS) is a special protein delivery system in Gram-negative bacteria which delivers T3SS-secreted effectors (T3SEs) to host cells causing pathological changes. Numerous experiments have verified that T3SEs play important roles in many biological activities and in host-...

Descripción completa

Detalles Bibliográficos
Autores principales: Li, Jie, Li, Zhong, Luo, Jiesi, Yao, Yuhua
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Hindawi 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7157791/
https://www.ncbi.nlm.nih.gov/pubmed/32328150
http://dx.doi.org/10.1155/2020/3974598
Descripción
Sumario:The type III secretion system (T3SS) is a special protein delivery system in Gram-negative bacteria which delivers T3SS-secreted effectors (T3SEs) to host cells causing pathological changes. Numerous experiments have verified that T3SEs play important roles in many biological activities and in host-pathogen interactions. Accurate identification of T3SEs is therefore essential to help understand the pathogenic mechanism of bacteria; however, many existing biological experimental methods are time-consuming and expensive. New deep-learning methods have recently been successfully applied to T3SE recognition, but improving the recognition accuracy of T3SEs is still a challenge. In this study, we developed a new deep-learning framework, ACNNT3, based on the attention mechanism. We converted 100 residues of the N-terminal of the protein sequence into a fusion feature vector of protein primary structure information (one-hot encoding) and position-specific scoring matrix (PSSM) which are used as the feature input of the network model. We then embedded the attention layer into CNN to learn the characteristic preferences of type III effector proteins, which can accurately classify any protein directly as either T3SEs or non-T3SEs. We found that the introduction of new protein features can improve the recognition accuracy of the model. Our method combines the advantages of CNN and the attention mechanism and is superior in many indicators when compared to other popular methods. Using the common independent dataset, our method is more accurate than the previous method, showing an improvement of 4.1-20.0%.