The molecular dipole moment (μ) is a central quantity in chemistry. It is essential in predicting infrared and sum-frequency generation spectra as well as induction and long-range electrostatic interactions. Furthermore, it can be extracted directly, via the ground-state electron density, from high-level quantum mechanical calculations, making it an ideal target for machine learning (ML). In this work, we choose to represent this quantity with a physically inspired ML model that captures two distinct physical effects: local atomic polarization is captured within the symmetry-adapted Gaussian process regression framework, which assigns a (vector) dipole moment to each atom, while the movement of charge across the entire molecule is captured by assigning a partial (scalar) charge to each atom. The resulting "MuML" models are fitted together to reproduce molecular μ computed using high-level coupled-cluster theory and density functional theory (DFT) on the QM7b dataset, achieving more accurate results due to the physics-based combination of these complementary terms. The combined model shows excellent transferability when applied to a showcase dataset of larger and more complex molecules, approaching the accuracy of DFT at a small fraction of the computational cost. We also demonstrate that the uncertainty in the predictions can be estimated reliably using a calibrated committee model. However, the ultimate performance of the models (and the optimal weighting of their combination) depends on the details of the system at hand, with the scalar model being clearly superior when describing large molecules whose dipole is almost entirely generated by charge separation. These observations point to the importance of simultaneously accounting for the local and non-local effects that contribute to μ; furthermore, they define a challenging task to benchmark future models, particularly those aimed at the description of condensed phases. Published under license by AIP Publishing.
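
As an illustration of how the two contributions described above combine, the following minimal sketch assembles a molecular dipole from per-atom partial charges and per-atom vector dipoles. It is not the MuML implementation: the function name `combined_dipole`, the unweighted sum of the two terms, and the enforcement of charge neutrality by subtracting the mean charge are all assumptions made for illustration. In practice the per-atom quantities would come from the fitted models, whereas here they are simply passed in as arrays.

```python
import numpy as np


def combined_dipole(positions, charges, atomic_dipoles):
    """Hypothetical helper: total dipole from partial charges and atomic dipoles.

    positions      : (N, 3) atomic coordinates
    charges        : (N,)   predicted partial (scalar) charges
    atomic_dipoles : (N, 3) predicted local (vector) atomic dipoles
    Returns (total, charge_term, local_term), each of shape (3,).
    """
    positions = np.asarray(positions, dtype=float)
    charges = np.asarray(charges, dtype=float)
    atomic_dipoles = np.asarray(atomic_dipoles, dtype=float)

    # Assumption: the molecule is neutral. Shifting the charges so they sum
    # to zero makes the charge term independent of the choice of origin.
    charges = charges - charges.mean()

    # Non-local term: charge separation across the molecule, sum_i q_i * r_i.
    charge_term = charges @ positions

    # Local term: sum of the per-atom polarization dipoles, sum_i mu_i.
    local_term = atomic_dipoles.sum(axis=0)

    # Assumption: plain unweighted sum of the two contributions.
    return charge_term + local_term, charge_term, local_term


# Toy diatomic with made-up numbers (not taken from any dataset).
positions = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
charges = np.array([0.5, -0.5])
atomic_dipoles = np.array([[0.0, 0.0, 0.1], [0.0, 0.0, 0.1]])

total, charge_term, local_term = combined_dipole(positions, charges, atomic_dipoles)
print("charge term:", charge_term)
print("local term: ", local_term)
print("total:      ", total)
```

With charges in units of e and positions in Å, the result is in e·Å. For a large molecule whose dipole is dominated by charge separation, the charge term would be expected to carry most of the total.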