Computational design of complex protein folds and soluble analogs of membrane proteins
Proteins are the molecular machines of life, driving essential biological processes such as enzymatic catalysis, cellular signaling, and immune responses. The ability to design entirely new proteins with desired structures and functions has immense implications for biotechnology. We are now at the dawn of an artificial intelligence revolution, where deep-learning models are reshaping our world, from large language models to protein design tools. AlphaFold2 has already revolutionized protein structure prediction, but the next major challenge is to harness these deep-learning models for de novo protein design. In this thesis, we demonstrate that deep-learning-driven methodologies can generate complex protein folds with unprecedented accuracy and experimental success rates. We explore AlphaFold2 inversion: using structure prediction models to design existing protein topologies with novel sequences. While AlphaFold2 alone produced promising structures in silico, these designs were often insoluble when tested experimentally, requiring Rosetta-based optimization. To address this, we integrated deep-learning-based sequence optimization into our pipeline, enabling the design of more complex protein folds, which is essential for designing more complex functions. Additionally, by retraining sequence generators exclusively on soluble proteins, we successfully redesigned membrane proteins as soluble analogs. These analogs retained key functional motifs while significantly improving expression and stability. For example, we developed functional soluble analogs of Claudins and G-protein-coupled receptors (GPCRs) that preserved important aspects of their native functionalities. We show that Claudin analogs still bind their native interaction partners, while GPCR analogs retain the ability to bind antibodies and G-proteins. Furthermore, we demonstrate that GPCRs can be designed with sufficient precision to fix their conformation in either the active or the inactive state.
These functional soluble analogs open new avenues for antibody generation, vaccine development, and ligand-binding studies. Building on these advances, we address the next challenge: designing soluble GPCR analogs with more complex functional behaviors, such as signaling and ligand-triggered state switching. This involves multistate design, where proteins adopt multiple conformations in response to external stimuli. We set up an AlphaFold2-based multistate design pipeline and generated designs that show switching behavior in silico. Although these designs are not yet fully validated experimentally, they represent a crucial step toward designing proteins capable of mimicking natural signaling mechanisms. These advances pave the way for custom protein switches, biosensors, and molecular circuits. Our findings establish a new paradigm in de novo protein design, where deep-learning-driven design moves beyond static structures to active control over protein function. By combining machine learning, structural biology, and biophysics, we provide a computational framework that expands the possibilities of protein design. The future of protein design lies in creating dynamic, functional proteins that interact with their environment, unlocking new possibilities for medicine, biotechnology, and beyond.
Prof. Giovanni D'Angelo (jury president); Prof. Bruno Emanuel Ferreira De Sousa Correia (thesis director); Prof. Anne-Florence Bitbol, Prof. Eva-Maria Strauch, Prof. Possu Huang (examiners)
2025
Lausanne
2025-08-18
11287
157