ManyFold: an efficient and flexible library for training and validating protein folding models
SUMMARY: ManyFold is a flexible library for protein structure prediction with deep learning that (i) supports models that use both multiple sequence alignments (MSAs) and protein language model (pLM) embedding as inputs, (ii) allows inference of existing models (AlphaFold and OpenFold), (iii) is ful...
Main Authors: | Villegas-Morcillo, Amelia; Robinson, Louis; Flajolet, Arthur; Barrett, Thomas D |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825755/ https://www.ncbi.nlm.nih.gov/pubmed/36495196 http://dx.doi.org/10.1093/bioinformatics/btac773 |
_version_ | 1784866691139764224 |
---|---|
author | Villegas-Morcillo, Amelia Robinson, Louis Flajolet, Arthur Barrett, Thomas D |
author_facet | Villegas-Morcillo, Amelia Robinson, Louis Flajolet, Arthur Barrett, Thomas D |
author_sort | Villegas-Morcillo, Amelia |
collection | PubMed |
description | SUMMARY: ManyFold is a flexible library for protein structure prediction with deep learning that (i) supports models that use both multiple sequence alignments (MSAs) and protein language model (pLM) embedding as inputs, (ii) allows inference of existing models (AlphaFold and OpenFold), (iii) is fully trainable, allowing for both fine-tuning and the training of new models from scratch and (iv) is written in Jax to support efficient batched operation in distributed settings. A proof-of-concept pLM-based model, pLMFold, is trained from scratch to obtain reasonable results with reduced computational overheads in comparison to AlphaFold. AVAILABILITY AND IMPLEMENTATION: The source code for ManyFold, the validation dataset and a small sample of training data are available at https://github.com/instadeepai/manyfold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-9825755 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9825755 2023-01-10 ManyFold: an efficient and flexible library for training and validating protein folding models Villegas-Morcillo, Amelia Robinson, Louis Flajolet, Arthur Barrett, Thomas D Bioinformatics Applications Note SUMMARY: ManyFold is a flexible library for protein structure prediction with deep learning that (i) supports models that use both multiple sequence alignments (MSAs) and protein language model (pLM) embedding as inputs, (ii) allows inference of existing models (AlphaFold and OpenFold), (iii) is fully trainable, allowing for both fine-tuning and the training of new models from scratch and (iv) is written in Jax to support efficient batched operation in distributed settings. A proof-of-concept pLM-based model, pLMFold, is trained from scratch to obtain reasonable results with reduced computational overheads in comparison to AlphaFold. AVAILABILITY AND IMPLEMENTATION: The source code for ManyFold, the validation dataset and a small sample of training data are available at https://github.com/instadeepai/manyfold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2022-12-10 /pmc/articles/PMC9825755/ /pubmed/36495196 http://dx.doi.org/10.1093/bioinformatics/btac773 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Note Villegas-Morcillo, Amelia Robinson, Louis Flajolet, Arthur Barrett, Thomas D ManyFold: an efficient and flexible library for training and validating protein folding models |
title | ManyFold: an efficient and flexible library for training and validating protein folding models |
title_full | ManyFold: an efficient and flexible library for training and validating protein folding models |
title_fullStr | ManyFold: an efficient and flexible library for training and validating protein folding models |
title_full_unstemmed | ManyFold: an efficient and flexible library for training and validating protein folding models |
title_short | ManyFold: an efficient and flexible library for training and validating protein folding models |
title_sort | manyfold: an efficient and flexible library for training and validating protein folding models |
topic | Applications Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9825755/ https://www.ncbi.nlm.nih.gov/pubmed/36495196 http://dx.doi.org/10.1093/bioinformatics/btac773 |
work_keys_str_mv | AT villegasmorcilloamelia manyfoldanefficientandflexiblelibraryfortrainingandvalidatingproteinfoldingmodels AT robinsonlouis manyfoldanefficientandflexiblelibraryfortrainingandvalidatingproteinfoldingmodels AT flajoletarthur manyfoldanefficientandflexiblelibraryfortrainingandvalidatingproteinfoldingmodels AT barrettthomasd manyfoldanefficientandflexiblelibraryfortrainingandvalidatingproteinfoldingmodels |