Problems of pattern formation have been extensively studied in distributed computing. One of this problems is the gathering problem: agents must gather at a same position in a distributed manner. When gathering is not possible, a close problem is the convergence problem. In this article, we investigate the two following questions: (1) Can processes gather when each process cannot see more that one other process at the same time? (2) Can a gathering behavior be learned by processes? Regarding the first point, we introduce a new model with an extremely restricted visibility: each process can only see one other process (its closest neighbor). Our goal is to see if (and to what extent) the gathering and convergence problems can be solved in this setting. We first show that, surprisingly, the problem can be solved for a small number of processes (at most 5), but not beyond. This is due to indeterminacy in the case where there are several "closest neighbors" for a same process. By removing this indeterminacy with an additional hypothesis (choosing the closest neighbor according to an order on the positions of processes), we then show that the problem can be solved for any number of processes. We also show that up to one crash failure can be tolerated for the convergence problem. Regarding the second point, we present the first experimental evidence that a gathering behavior can be learned without explicit communication in a partially observable environment. The learned behavior has the same properties as a self-stabilizing distributed algorithm, as processes can gather from any initial state (and thus tolerate any transient failure). Besides, we show that it is possible to scale and then tolerate the brutal loss of up to 90% of agents without significant impact on the behavior.