Collectively encoding protein properties enriches protein language models

Bibliographic Details
Main Authors:	An, Jingmin, Weng, Xiaogang
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2022
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9641823/ https://www.ncbi.nlm.nih.gov/pubmed/36348281 http://dx.doi.org/10.1186/s12859-022-05031-z

Description
Summary:	Natural language processing models pre-trained on a large natural language corpus can transfer learned knowledge to protein domains when fine-tuned on specific in-domain tasks. However, few studies have focused on enriching such protein language models by jointly learning protein properties from strongly correlated protein tasks. Here we designed a multi-task learning (MTL) architecture that aims to decipher implicit structural and evolutionary information from three sequence-level classification tasks: protein family, superfamily and fold. Considering the contextual relevance shared by human language and protein language, we employed BERT, pre-trained on a large natural language corpus, as our backbone to handle protein sequences. More importantly, the knowledge encoded in the MTL stage transfers well to the more fine-grained downstream tasks of TAPE. Experiments on structure- or evolution-related applications demonstrate that our approach outperforms many state-of-the-art Transformer-based protein models, especially in remote homology detection. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05031-z.
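
The multi-task setup described in the summary (one shared encoder feeding three sequence-level classification heads for family, superfamily and fold) can be illustrated with a minimal sketch. This is not the authors' implementation: the amino-acid frequency featurizer stands in for the BERT backbone, the class counts and example sequence are hypothetical, and summing the three cross-entropy losses with equal weight is an assumption, since the summary does not say how the task losses are combined.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(sequence):
    """Toy shared encoder: amino-acid frequency vector.

    Assumption for illustration only; the paper uses BERT as the backbone.
    """
    counts = [sequence.count(a) for a in AMINO_ACIDS]
    total = sum(counts) or 1
    return [c / total for c in counts]


def make_head(n_classes, dim, rng):
    """One linear classification head: an n_classes x dim weight matrix."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_classes)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def head_probs(head, features):
    """Class probabilities of one task head on the shared features."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in head]
    return softmax(logits)


def multitask_loss(heads, features, labels):
    """Equal-weight sum of per-task cross-entropy losses (an assumption)."""
    loss = 0.0
    for task, head in heads.items():
        probs = head_probs(head, features)
        loss += -math.log(probs[labels[task]])
    return loss


rng = random.Random(0)
dim = len(AMINO_ACIDS)

# Hypothetical class counts; the real tasks have far more classes.
heads = {
    "family": make_head(5, dim, rng),
    "superfamily": make_head(4, dim, rng),
    "fold": make_head(3, dim, rng),
}

# Hypothetical sequence and labels, used only to exercise the code.
features = encode("MKTAYIAKQRQISFVKSHFSRQ")
labels = {"family": 2, "superfamily": 1, "fold": 0}

loss = multitask_loss(heads, features, labels)
print(f"joint loss: {loss:.3f}")
```

Because all three heads read the same features, a gradient step on the joint loss would update the shared encoder with signal from every task, which is the mechanism by which multi-task learning is expected to enrich the learned representation.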
