Infoscience

preprint

Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings

Sengar, Aditya; Hariri, Ali; Probst, Daniel; et al.
June 24, 2025

Generating diverse, all-atom conformational ensembles of dynamic proteins such as G-protein-coupled receptors (GPCRs) is critical for understanding their function, yet most generative models simplify atomic detail or ignore conformational diversity altogether. We present latent diffusion for full protein generation (LD-FPG), a framework that constructs complete all-atom protein structures, including every side-chain heavy atom, directly from molecular dynamics (MD) trajectories. LD-FPG employs a Chebyshev graph neural network (ChebNet) to obtain low-dimensional latent embeddings of protein conformations, which are processed using three pooling strategies: blind, sequential, and residue-based. A diffusion model trained on these latent representations generates new samples, which a decoder, optionally regularized by dihedral-angle losses, maps back to Cartesian coordinates. Using D2R-MD, a 2 µs MD trajectory (12 000 frames) of the human dopamine D2 receptor in a membrane environment, the sequential and residue-based pooling strategies reproduce the reference ensemble with high structural fidelity (all-atom lDDT ∼ 0.7; Cα-lDDT ∼ 0.8) and recover backbone and side-chain dihedral-angle distributions with a Jensen-Shannon divergence below 0.03 relative to the MD data. LD-FPG thereby offers a practical route to system-specific, all-atom ensemble generation for large proteins, providing a promising tool for structure-based therapeutic design on complex, dynamic targets. The D2R-MD dataset and our implementation are freely available to facilitate further research.
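The abstract reports agreement between generated and MD dihedral-angle distributions as a Jensen-Shannon divergence. As a rough illustration of that kind of comparison, the sketch below bins two sets of dihedral angles and computes the divergence between the histograms. It is not the authors' implementation: the function names, the 36-bin histogram, and the base-2 logarithm are all assumptions made here for illustration, and the numerical value of the divergence depends on the bin count and the log base.

```python
import math


def dihedral_histogram(angles_deg, n_bins=36):
    """Normalized histogram of dihedral angles (degrees) over [-180, 180).

    The bin count is an illustrative assumption, not a value from the paper.
    """
    width = 360.0 / n_bins
    counts = [0] * n_bins
    for angle in angles_deg:
        # Shift and wrap so that -180 maps to 0 and angles are periodic.
        wrapped = (angle + 180.0) % 360.0
        # Guard against a floating-point result of exactly 360.0.
        index = min(int(wrapped // width), n_bins - 1)
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Uses base-2 logarithms (an assumption), so the result lies in [0, 1].
    """
    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 wherever a > 0
        # because b is the mixture of both inputs.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mixture = [0.5 * (x + y) for x, y in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


# Hypothetical angle samples standing in for MD and generated ensembles.
md_phi = [-62.0, -58.5, -65.1, -120.3, -61.7, -59.9]
generated_phi = [-60.2, -57.8, -66.0, -118.9, -63.4, -61.0]

divergence = js_divergence(
    dihedral_histogram(md_phi),
    dihedral_histogram(generated_phi),
)
print(f"JS divergence between angle histograms: {divergence:.4f}")
```

Identical histograms give a divergence of 0, and histograms with no overlapping bins give 1 under the base-2 convention, so a value below 0.03 indicates closely matching distributions.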

Name: neurips_arxiv (3).pdf
Type: Main Document
Version: http://purl.org/coar/version/c_71e4c1898caa6e32
Access type: openaccess
License Condition: CC BY
Size: 36 MB
Format: Adobe PDF
Checksum (MD5): 54cc523f55a039fe934776983a479cf4


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved