Whether it occurs in artificial or biological substrates, {\it learning} is a {\it distributed} phenomenon in at least two respects. First, meaningful data and experiences are rarely found in a single location, so {\it learners} have a strong incentive to work together through {\it distributed learning}. Second, a learner is itself a distributed system, a {\it learning machine}, made of more basic processes; it is the change in the connections between these basic processes that enables learning. This high-level abstraction encompasses a wide range of learning situations, from nervous systems, to metabolic networks in an organism, to data centers where several machines cooperate to recommend personalized content on a billion-user social-media platform. At both levels of distribution, a system's ability to cope with the failure of some of its components is crucial. My research explores the robustness of learning systems from these two perspectives. The first is {\it coarse-grained}, taking the unit of failure to be a whole learner; the second is {\it fine-grained}, taking the unit of failure to be a basic component of the learner (e.g., a neuron or a synapse).