We consider the group testing problem, in which one seeks to identify a subset of defective items within a larger set of items based on a number of tests. We characterize the information-theoretic performance limits in the presence of list decoding, in which the decoder may output a list containing more elements than the number of defectives, and the only requirement is that the true defective set is a subset of the list, or more generally, that their overlap exceeds a given threshold. We show that even under this highly relaxed criterion, in several scaling regimes the asymptotic number of tests is no smaller than the exact recovery setting. However, we also provide examples where a reduction is provably attained. We support our theoretical findings with numerical experiments.