Incompleteness of Atomic Structure Representations
Many-body descriptors are widely used to represent atomic environments in the construction of machine-learned interatomic potentials and more broadly for fitting, classification, and embedding tasks on atomic structures. There is a widespread belief in the community that three-body correlations are likely to provide an overcomplete description of the environment of an atom. We produce several counterexamples to this belief, with the consequence that any classifier, regression, or embedding model for atom-centered properties that uses three- (or four)-body features will incorrectly give identical results for different configurations. Writing global properties (such as total energies) as a sum of many atom-centered contributions mitigates the impact of this fundamental deficiency, which explains the success of current "machine-learning" force fields. We anticipate the issues that will arise as the desired accuracy increases, and suggest potential solutions.
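The kind of degeneracy described in the abstract can be illustrated with a small sketch. It does not reproduce the counterexamples of the paper. It assumes a textbook homometric pair instead: two sets of four neighbors placed on the vertices of a regular octagon of unit radius around the central atom. The function names (`positions`, `three_body_descriptor`, `congruent_on_ring`) and the choice of descriptor, the multiset of (r_ij, r_ik, cos θ_jik) triples over neighbor pairs, are illustrative assumptions.

```python
import math
from itertools import combinations

def positions(slots, n=8):
    """Place neighbors on vertices of a regular n-gon of unit radius (central atom at origin)."""
    return [(math.cos(2 * math.pi * s / n), math.sin(2 * math.pi * s / n)) for s in slots]

def three_body_descriptor(neighbors):
    """Sorted multiset of (r_ij, r_ik, cos theta_jik) over all neighbor pairs (j, k).

    This is an illustrative stand-in for a three-body feature, not a library API.
    """
    triples = []
    for (xj, yj), (xk, yk) in combinations(neighbors, 2):
        rj, rk = math.hypot(xj, yj), math.hypot(xk, yk)
        cos_t = (xj * xk + yj * yk) / (rj * rk)
        lo, hi = sorted((round(rj, 9), round(rk, 9)))
        # Adding 0.0 turns a possible -0.0 into 0.0 after rounding.
        triples.append((lo, hi, round(cos_t, 9) + 0.0))
    return sorted(triples)

def congruent_on_ring(a, b, n=8):
    """True if some rotation or reflection of the n-gon maps slot set a onto slot set b.

    Both sets lie on the same n-gon, so any rotation or reflection about the
    center that maps one onto the other must map vertices to vertices.
    """
    target = frozenset(b)
    for shift in range(n):
        for sign in (1, -1):
            if frozenset((sign * s + shift) % n for s in a) == target:
                return True
    return False

# Assumed example: a homometric pair of vertex subsets of the octagon.
env_a = (0, 1, 3, 4)
env_b = (0, 1, 2, 5)

same_features = three_body_descriptor(positions(env_a)) == three_body_descriptor(positions(env_b))
same_structure = congruent_on_ring(env_a, env_b)

print(same_features)   # True: the three-body features coincide
print(same_structure)  # False: the environments are not related by rotation or reflection
```

Any model that takes only these three-body features as input would assign the same atom-centered value to both environments, even though they are geometrically distinct.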
WOS:000576896200009
2020-10-12
125
16
166001
REVIEWED