The year 2016, in which I am writing these words, marks the centenary of Claude Shannon, the father of information theory. In his landmark 1948 paper "A Mathematical Theory of Communication", Shannon established the largest rate at which reliable communication is possible, and he referred to it as the channel capacity. Since then, researchers have focused on the design of practical coding schemes that could approach such a limit. The road to channel capacity has been almost 70 years long and, after many ideas, occasional detours, and some rediscoveries, it has culminated in the description of low-complexity and provably capacity-achieving coding schemes, namely, polar codes and iterative codes based on sparse graphs. However, next-generation communication systems require an unprecedented performance improvement and the number of transmission settings relevant in applications is rapidly increasing. Hence, although Shannon's limit seems finally close at hand, new challenges are just around the corner. In this thesis, we trace a road that goes from polar to Reed-Muller codes and, by doing so, we investigate three main topics: unified scaling, non-standard channels, and capacity via symmetry. First, we consider unified scaling. A coding scheme is capacity-achieving when, for any rate smaller than capacity, the error probability tends to 0 as the block length becomes increasingly larger. However, the practitioner is often interested in more specific questions such as, "How much do we need to increase the block length in order to halve the gap between rate and capacity?". We focus our analysis on polar codes and develop a unified framework to rigorously analyze the scaling of the main parameters, i.e., block length, rate, error probability, and channel quality. Furthermore, in light of the recent success of a list decoding algorithm for polar codes, we provide scaling results on the performance of list decoders. Next, we deal with non-standard channels. When we say that a coding scheme achieves capacity, we typically consider binary memoryless symmetric channels. However, practical transmission scenarios often involve more complicated settings. For example, the downlink of a cellular system is modeled as a broadcast channel, and the communication on fiber links is inherently asymmetric. We propose provably optimal low-complexity solutions for these settings. In particular, we present a polar coding scheme that achieves the best known rate region for the broadcast channel, and we describe three paradigms to achieve the capacity of asymmetric channels. To do so, we develop general coding "primitives", such as the chaining construction that has already proved to be useful in a variety of communication problems. Finally, we show how to achieve capacity via symmetry. In the early days of coding theory, a popular paradigm consisted in exploiting the structure of algebraic codes to devise practical decoding algorithms. However, proving the optimality of such coding schemes remained an elusive goal. In particular, the conjecture that Reed-Muller codes achieve capacity dates back to the 1960s. We solve this open problem by showing that Reed-Muller codes and, in general, codes with sufficient symmetry are capacity-achieving over erasure channels under optimal MAP decoding. As the proof does not rely on the precise structure of the codes, we are able to show that symmetry alone guarantees optimal performance.