Infoscience

preprint

Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings

Sengar, Aditya • Hariri, Ali • Probst, Daniel • et al.
June 24, 2025

Generating diverse, all-atom conformational ensembles of dynamic proteins such as G-protein-coupled receptors (GPCRs) is critical for understanding their function, yet most generative models simplify atomic detail or ignore conformational diversity altogether. We present latent diffusion for full protein generation (LD-FPG), a framework that constructs complete all-atom protein structures, including every side-chain heavy atom, directly from molecular dynamics (MD) trajectories. LD-FPG employs a Chebyshev graph neural network (ChebNet) to obtain low-dimensional latent embeddings of protein conformations, which are processed using three pooling strategies: blind, sequential and residue-based. A diffusion model trained on these latent representations generates new samples that a decoder, optionally regularized by dihedral-angle losses, maps back to Cartesian coordinates. Using D2R-MD, a 2 µs MD trajectory (12 000 frames) of the human dopamine D2 receptor in a membrane environment, the sequential and residue-based pooling strategies reproduce the reference ensemble with high structural fidelity (all-atom lDDT ∼ 0.7; Cα-lDDT ∼ 0.8) and recover backbone and side-chain dihedral-angle distributions with a Jensen-Shannon divergence < 0.03 compared to the MD data. LD-FPG thereby offers a practical route to system-specific, all-atom ensemble generation for large proteins, providing a promising tool for structure-based therapeutic design on complex, dynamic targets. The D2R-MD dataset and our implementation are freely available to facilitate further research.
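
The abstract compares generated and MD dihedral-angle distributions using a Jensen-Shannon divergence. A minimal sketch of how such a comparison could be computed is given below. It is not the authors' evaluation code: the function names, the 36-bin histogram over [-180°, 180°), and the base-2 logarithm (which bounds the divergence to [0, 1]) are all assumptions made for illustration.

```python
import math


def angle_histogram(angles_deg, n_bins=36):
    """Normalized histogram of angles (degrees) over [-180, 180).

    The bin count is an illustrative assumption, not taken from the paper.
    """
    counts = [0] * n_bins
    width = 360.0 / n_bins
    for a in angles_deg:
        wrapped = (a + 180.0) % 360.0  # shift so that -180 maps to 0
        idx = min(int(wrapped // width), n_bins - 1)  # guard float edge cases
        counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one angle")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (base 2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical usage: compare a dihedral angle sampled from MD frames
# with the same angle measured on generated structures.
md_angles = [-60.0, -65.0, -58.0, 180.0, 175.0]
generated_angles = [-62.0, -61.0, -57.0, 178.0, -179.0]
jsd = js_divergence(angle_histogram(md_angles), angle_histogram(generated_angles))
```

Because the mixture is nonzero wherever either input is nonzero, the divergence stays finite even when the two histograms occupy different bins, which is one reason it is convenient for comparing sampled angle distributions.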

Name: neurips_arxiv (3).pdf
Type: Main Document
Version: Submitted version (Preprint)
Access type: openaccess
License Condition: CC BY
Size: 36 MB
Format: Adobe PDF
Checksum (MD5): 54cc523f55a039fe934776983a479cf4


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved