Exploring “dark-matter” protein folds using deep learning
De novo protein design explores uncharted sequence and structure space to generate novel proteins not sampled by evolution. A main challenge in de novo design is crafting “designable” structural templates that guide sequence searches toward the target structures. We present Genesis, a convolutional variational autoencoder that learns patterns of protein structure. We coupled Genesis with trRosetta to design sequences for a set of protein folds and found that Genesis can reconstruct native-like distance and angle distributions for five native folds and, as a demonstration of generalizability, for three novel, so-called “dark-matter” folds. We used a high-throughput assay to characterize the stability of the designs through protease resistance, obtaining encouraging success rates for folded proteins. Genesis enables exploration of the protein fold space within minutes, unrestricted by protein topologies. Our approach addresses the backbone designability problem, showing that small neural networks can efficiently learn structural patterns in proteins. A record of this paper's transparent peer review process is included in the supplemental information.
2-s2.0-85206959473
39383860
École Polytechnique Fédérale de Lausanne
École Polytechnique Fédérale de Lausanne
École Polytechnique Fédérale de Lausanne
Imperial College London
École Polytechnique Fédérale de Lausanne
École Polytechnique Fédérale de Lausanne
École Polytechnique Fédérale de Lausanne
École Polytechnique Fédérale de Lausanne
École Polytechnique Fédérale de Lausanne
École Polytechnique Fédérale de Lausanne
2024-10-16
15
10
898
910.e5
REVIEWED
EPFL
Funder | Funding(s) | Grant Number | Grant URL |
Biltema Foundation | | | |
National Center of Competence in Research in Chemical Biology | | | |
Swiss National Supercomputing Centre | | | |