Infoscience

preprint

Data-Efficient and Fast Machine Learning Molecular Dynamics through Integrated Active Learning and Knowledge Distillation

Lian, Xiliang • Pasquarello, Alfredo
May 11, 2026

We develop data-efficient machine learning interatomic potentials (MLIPs) for fast molecular dynamics simulations by combining DeePMD and MACE models within an active learning and knowledge distillation framework. Using liquid water as a case study, we first train DeePMD and MACE models independently from scratch through active learning. We find that MACE requires around 3.5 times less training data than DeePMD, but its inference speed is 10 times lower. We also show that starting from a pretrained foundation model based on the MACE architecture further reduces the training data by a factor of 7, resulting in a fine-tuned foundation model that needs about 25 times less data than DeePMD (3.5 × 7 ≈ 25). To overcome the lower inference speed of MACE potentials, we next develop a knowledge distillation scheme that trains a DeePMD potential from the fine-tuned foundation model through an inexpensive active learning workflow. The distilled model is generated with ~10 times less computer time than the DeePMD model trained from scratch, while showing the same fast inference speed. Comparison with ab initio calculations shows that all the models reach the same level of accuracy in reproducing structural, vibrational, and diffusive properties of liquid water. Our approach enables practical, data-efficient training of customized MLIPs with high speed and accuracy.
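
The distillation-through-active-learning idea in the abstract can be illustrated with a toy sketch. This is not the authors' workflow and uses neither DeePMD nor MACE: as assumptions for illustration only, a cheap analytic function (`teacher`) stands in for the accurate fine-tuned model, a piecewise-linear interpolator (`fit_student`) stands in for the fast student, and committee disagreement among students trained on random subsets decides which candidate points the teacher labels next. All names and parameters below are hypothetical.

```python
import math
import random
import statistics


def teacher(x):
    """Toy stand-in for the accurate but slow teacher model (assumption)."""
    return math.sin(3.0 * x) + 0.5 * x


def fit_student(points):
    """Toy 'fast' student: piecewise-linear interpolation through (x, y) pairs."""
    pts = sorted(points)

    def predict(x):
        # Clamp outside the range covered by the training points.
        if x <= pts[0][0]:
            return pts[0][1]
        if x >= pts[-1][0]:
            return pts[-1][1]
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        return pts[-1][1]

    return predict


def distill(n_rounds=8, batch=3, n_members=4, seed=0):
    """Grow a teacher-labeled set by querying where a student committee disagrees."""
    rng = random.Random(seed)
    pool = [i / 100.0 for i in range(-200, 201)]  # candidate "configurations"
    labeled = {x: teacher(x) for x in (-2.0, 0.0, 2.0)}  # small initial set

    for _ in range(n_rounds):
        items = list(labeled.items())
        k = max(2, int(0.8 * len(items)))
        # Each committee member sees a different random subset of the labels.
        committee = [fit_student(rng.sample(items, k)) for _ in range(n_members)]
        candidates = [x for x in pool if x not in labeled]
        # Rank unlabeled candidates by committee disagreement (largest first).
        candidates.sort(
            key=lambda x: statistics.pstdev([m(x) for m in committee]),
            reverse=True,
        )
        # Only the selected candidates are sent to the teacher for labeling.
        for x in candidates[:batch]:
            labeled[x] = teacher(x)

    return fit_student(list(labeled.items())), labeled


if __name__ == "__main__":
    student, labeled = distill()
    print(f"teacher calls: {len(labeled)}")
```

The point of the design is that the teacher is only evaluated on the few candidates the committee is least sure about, so the number of expensive labels stays small while the cheap student is the model that is ultimately deployed.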

Name: chemrxiv.15002964_v1.pdf
Type: Main Document
Version: Submitted version (Preprint)
Access type: openaccess
License Condition: CC BY
Size: 1.95 MB
Format: Adobe PDF
Checksum (MD5): 26f9ac0bef042c8d1ce148eba0912e16


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved