Infoscience

doctoral thesis

Physically-constrained machine learning models of effective electronic Hamiltonians

Suman, Divya  
2025

Accurate quantum mechanical (QM) simulations are central to understanding the electronic structure and properties of molecules and materials. Electronic structure methods solve the Schrödinger equation for the electronic Hamiltonian, from which all ground- and excited-state properties can be derived. However, their steep computational cost limits applications to small systems or short timescales. Machine learning (ML) offers a way forward by creating surrogate models that map structure to properties at much lower cost. Early ML approaches focus on specific observables (energies, forces, charges, dipole moments, polarizabilities) rather than the underlying electronic structure. While effective within their training domain, such models lack transferability and cannot predict properties beyond those in the training data. A more general solution is to learn fundamental electronic quantities, such as electron densities, density matrices, wave functions, or effective single-particle Hamiltonians, which provide access to many properties through inexpensive postprocessing. This thesis focuses on learning an effective single-particle Hamiltonian.

Accurate Hamiltonians require large basis sets, producing high-dimensional matrices that make direct learning difficult. To balance accuracy and efficiency, we introduce an indirect learning framework. Instead of using matrix elements as final targets, the Hamiltonian is treated as an intermediate representation, while learning targets are derived properties such as orbital energies, charges, or observables computed in either the model basis or a larger reference basis. The model remains parametrized in a compact minimal basis, reducing complexity while still guided by information from more complete calculations. This hybrid design improves efficiency without sacrificing accuracy and preserves access to a wide range of properties through postprocessing. Using automatic differentiation, we optimize the effective Hamiltonian to reproduce observables from either the same or a larger basis. Coupled with the Tamm-Dancoff approximation, ML-predicted Hamiltonians can predict singlet excited states across molecules. The models generalize well to unseen, larger systems while being orders of magnitude faster than reference methods, enabling applications such as computing spectral densities from molecular dynamics.
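The indirect learning idea can be illustrated with a deliberately small NumPy sketch. It is not the thesis implementation: the matrix entries stand in directly for model parameters (the actual models map atomic structure to the Hamiltonian), the "reference" is a random symmetric matrix rather than a real calculation, and the gradient is written by hand using the Hellmann-Feynman relation instead of automatic differentiation. All function names (`loss_and_grad`, `fit_hamiltonian`, `make_example`) are hypothetical.

```python
import numpy as np


def loss_and_grad(h, target):
    """Squared error on orbital energies and its gradient w.r.t. H.

    Assumes a symmetric H in an orthonormal basis with non-degenerate
    eigenvalues, so that d(eps_i)/dH = v_i v_i^T (Hellmann-Feynman).
    """
    eps, vecs = np.linalg.eigh(h)      # eigenvalues in ascending order
    resid = eps - target
    loss = 0.5 * np.sum(resid ** 2)
    # sum_i resid_i * v_i v_i^T, written as V diag(resid) V^T
    grad = (vecs * resid) @ vecs.T
    return loss, grad


def fit_hamiltonian(h0, target, lr=0.2, steps=200):
    """Plain gradient descent on the entries of a small effective H."""
    h = np.array(h0, dtype=float)
    for _ in range(steps):
        _, grad = loss_and_grad(h, target)
        h -= lr * grad
    return h


def make_example(seed=0, n_ref=6, n_model=3):
    """Toy data: targets are the lowest eigenvalues of a larger matrix.

    The larger random matrix plays the role of a reference calculation
    in a bigger basis; the model only has n_model basis functions.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n_ref, n_ref))
    h_ref = 0.5 * (a + a.T)
    target = np.linalg.eigvalsh(h_ref)[:n_model]
    b = rng.normal(size=(n_model, n_model))
    h0 = 0.5 * (b + b.T)               # arbitrary symmetric starting guess
    return h0, target


if __name__ == "__main__":
    h0, target = make_example()
    h_fit = fit_hamiltonian(h0, target)
    print("target energies:", target)
    print("fitted energies:", np.linalg.eigvalsh(h_fit))
```

The point of the sketch is that the small matrix is never compared element by element with the reference: only a derived quantity (here the lowest orbital energies) enters the loss, so the compact Hamiltonian is free to differ from any projection of the larger one.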

We extend this framework by interfacing ML Hamiltonians with PySCFAD, an auto-differentiable electronic structure code supporting density matrix construction and linear response calculations. This greatly expands the scope of indirect models, providing access to many observables without reimplementing differentiable routines. We analyze how design choices, such as adding physical constraints or basis set parametrization, affect accuracy and transferability. Well-regularized models extrapolate reliably to larger molecules, and for properties like dipole moments and polarizabilities, Hamiltonian-based models outperform property-specific ones. We also extend the framework to periodic systems for predicting band energies.
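As a rough illustration of the postprocessing step that turns a predicted Hamiltonian into observables, the sketch below builds a closed-shell density matrix and evaluates a one-electron expectation value with plain NumPy. It does not use or reproduce the PySCFAD API; the function names are hypothetical, and it assumes an orthonormal basis (no overlap matrix) and an even number of electrons.

```python
import numpy as np


def density_matrix(h, n_electrons):
    """Closed-shell density matrix P = 2 * C_occ C_occ^T.

    Assumes a symmetric H in an orthonormal basis and doubly occupied
    orbitals, filled from the lowest eigenvalue upward.
    """
    if n_electrons % 2 != 0:
        raise ValueError("this sketch only handles closed-shell systems")
    n_occ = n_electrons // 2
    _, coeffs = np.linalg.eigh(h)      # columns are orbitals, ascending energy
    c_occ = coeffs[:, :n_occ]
    return 2.0 * c_occ @ c_occ.T


def expectation(p, op):
    """Expectation value Tr(P O) of a one-electron operator O."""
    return float(np.trace(p @ op))


if __name__ == "__main__":
    # Two-site, two-electron toy Hamiltonian (arbitrary units).
    h = np.array([[-1.0, -0.5],
                  [-0.5, -1.0]])
    p = density_matrix(h, n_electrons=2)
    # A diagonal, position-like operator standing in for a dipole integral.
    x = np.diag([-1.0, 1.0])
    print("electron count:", np.trace(p))
    print("band energy   :", expectation(p, h))
    print("<x>           :", expectation(p, x))
```

Because every step is a smooth function of the Hamiltonian (away from degeneracies between occupied and virtual orbitals), a differentiable implementation of the same pipeline lets a loss on the observable be propagated back to the model that predicts the Hamiltonian.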

This thesis shows that ML Hamiltonians offer a powerful and generalizable bridge between QM and ML. By targeting an operator central to electronic structure rather than isolated properties, these models deliver efficient surrogates capable of predicting diverse observables with high accuracy. This work points toward hybrid ML-QM approaches that unify accuracy and efficiency.

Type
doctoral thesis
DOI
10.5075/epfl-thesis-11666
Author(s)
Suman, Divya  

École Polytechnique Fédérale de Lausanne

Advisors
Ceriotti, Michele  
Jury

Prof. Andreas Mortensen (president); Prof. Michele Ceriotti (thesis director); Prof. Nicola Marzari, Prof. Laura Gagliardi, Prof. Reinhard Maurer (examiners)

Date Issued

2025

Publisher

EPFL

Publisher place

Lausanne

Public defense date

2025-11-21

Thesis number

11666

Total pages

124

Subjects

  • electronic structure
  • one-electron Hamiltonian
  • machine learning
  • physical constraints
  • auto-differentiation

EPFL units
COSMO  
Faculty
STI  
School
IMX  
Doctoral School
EDMX  
Available on Infoscience
November 24, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/256291
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved