We consider the role of contextual guidance in learning and processing within multi-stream neural networks. Earlier work (Kay & Phillips, 1994, 1996; Phillips et al., 1995) showed how the goals of feature discovery and associative learning could be fused within a single objective, made precise using information theory, in such a way that local binary processors could extract a single feature that is coherent across streams. In this paper we consider multi-unit local processors with multivariate binary outputs, which enable a greater number of coherent features to be extracted. Using the Ising model, we define a class of information-theoretic objective functions, together with local approximations to them, and derive the learning rules in both cases. These rules have similarities to, and differences from, the celebrated BCM rule. Local and global versions of Infomax emerge as by-products of the general approach, as do multivariate versions of Coherent Infomax. Focusing on the more biologically plausible local rules, we describe computational experiments designed to investigate specific properties of the processors. The main conclusions are:

1. The local methodology introduced in the paper has the required functionality.
2. Different units within the multi-unit processors learned to respond to different aspects of their receptive fields.
3. The units within each processor generally produced a distributed code in which the outputs were correlated, and which was robust to damage; in the special case where the number of units available was only just sufficient to transmit the relevant information, a form of competitive learning was produced.
4. The contextual connections enabled the information correlated across streams to be extracted and, by improving feature detection with weak or noisy inputs, played a useful role in short-term processing and in improving generalization.
5. The methodology allows the statistical associations between distributed, self-organizing population codes to be learned.
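As a point of comparison for the learning rules mentioned above, the classical BCM rule (not the paper's information-theoretic rules) can be sketched as follows. The weight change for a linear unit is dw_i = eta * y * (y - theta) * x_i, where the threshold theta slides to track the running average of y^2; the learning rates eta and tau and the two-pattern input set here are illustrative choices, not values from the paper:

```python
import random

random.seed(0)

def bcm_step(w, x, theta, eta=0.05, tau=0.1):
    """One BCM update for a linear unit.

    dw_i = eta * y * (y - theta) * x_i, with the sliding
    threshold theta tracking a running estimate of E[y^2].
    """
    y = sum(wi * xi for wi, xi in zip(w, x))              # unit output
    w = [wi + eta * y * (y - theta) * xi for wi, xi in zip(w, x)]
    theta = theta + tau * (y * y - theta)                 # slide threshold toward y^2
    return w, theta

# Drive a 2-input unit with two orthogonal patterns; BCM tends to
# potentiate the response to one pattern and depress the other,
# yielding a selective (competitive-like) unit.
w, theta = [0.5, 0.5], 0.1
patterns = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(200):
    w, theta = bcm_step(w, random.choice(patterns), theta)
```

The sliding threshold is what distinguishes BCM from plain Hebbian learning: weights grow only while the response exceeds the unit's own recent average squared activity, which stabilizes the dynamics and produces selectivity.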