On the Difficulty of Selecting Ising Models with Approximate Recovery
In this paper, we consider the problem of estimating the underlying graph associated with an Ising model given a number of independent and identically distributed samples. We adopt an approximate recovery criterion that allows for a number of missed edges or incorrectly included edges, in contrast with the widely studied exact recovery problem. Our main results provide information-theoretic lower bounds on the sample complexity for graph classes imposing constraints on the number of edges, maximal degree, and other properties. We identify a broad range of scenarios where, up to either constant or logarithmic factors, our lower bounds match the best known lower bounds for the exact recovery criterion, several of which are known to be tight or near-tight. Hence, in these cases, approximate recovery has a similar difficulty to exact recovery in the minimax sense. Our bounds are obtained via a modification of Fano's inequality for handling the approximate recovery criterion, along with suitably designed ensembles of graphs that can broadly be classed into two categories: 1) those whose graphs contain several isolated edges or cliques and are thus difficult to distinguish from the empty graph; 2) those whose graphs have certain groups of highly correlated nodes, thus making it difficult to determine precisely which edges connect them. We support our theoretical results on these ensembles with numerical experiments.
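To make the approximate recovery criterion concrete, the following is a minimal sketch, not taken from the paper. It assumes the criterion is measured as the number of missed edges plus the number of incorrectly included edges, compared against a tolerance. The names `edit_distance`, `approx_recovered`, and `q_max` are hypothetical and chosen only for illustration.

```python
def edge_set(edges):
    """Normalize an iterable of undirected edges to a set of frozensets."""
    return {frozenset(e) for e in edges}


def edit_distance(true_edges, est_edges):
    """Count missed edges plus incorrectly included edges."""
    g, g_hat = edge_set(true_edges), edge_set(est_edges)
    # The symmetric difference holds edges present in exactly one graph.
    return len(g ^ g_hat)


def approx_recovered(true_edges, est_edges, q_max):
    """Return True if the estimate is within q_max edge errors.

    q_max is an assumed tolerance; q_max = 0 corresponds to exact recovery.
    """
    return edit_distance(true_edges, est_edges) <= q_max


# Toy example: one missed edge (2, 3) and one spurious edge (3, 4).
true_graph = [(0, 1), (1, 2), (2, 3)]
estimate = [(1, 0), (1, 2), (3, 4)]

print(edit_distance(true_graph, estimate))
print(approx_recovered(true_graph, estimate, q_max=2))
print(approx_recovered(true_graph, estimate, q_max=0))
```

In this toy example the estimate fails under exact recovery (`q_max=0`) but succeeds once two edge errors are tolerated.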
Record created on 2016-01-11, modified on 2017-01-25