In his seminal work \cite{paper:shannon}, Shannon formalized the problems of digital communication and storage of information, and quantified the fundamental limits of compression and transmission rates. He coined the term \textit{channel capacity} to give an operational meaning to the ultimate limit of information transfer between two entities, a transmitter and a receiver. Using a random coding argument, he showed that reliable communication is achievable over any noisy channel as long as the rate is less than the channel capacity. More precisely, he argued that, using a fixed composition random codebook at the sender and a maximum likelihood decoder at the receiver, one can communicate with vanishing probability of error. Shannon's result was a promise on the amount of data one can reliably transfer over a medium, and realizing that promise has spurred scientific interest since 1948. Significant effort has been spent on finding practical codes which get close to the Shannon limits, and for some classes of channels capacity achieving codes have since been found. Low Density Parity Check (LDPC) codes are one such capacity achieving family for the class of binary erasure channels. More recently, Arikan proposed a family of codes known as Polar codes, which are capacity achieving for binary input memoryless channels. Slowly but steadily, the longstanding problem of achieving the Shannon limit is thus getting settled. In order to realize a practical system which achieves the Shannon limit, two ingredients are necessary: an efficient capacity achieving code sequence and an implementable decoding rule. Both require knowledge of the probabilistic law of the channel through which communication takes place. At the transmitter one needs to know the channel capacity, whereas at the receiver perfect knowledge of the channel is required to build the optimum decoder.
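As a concrete illustration (a side note, not part of the development above), the capacities of two standard channels mentioned in this report have simple closed forms: $1-\epsilon$ for the binary erasure channel with erasure probability $\epsilon$, and $1-h(p)$ for the binary symmetric channel with crossover probability $p$, where $h$ is the binary entropy function. A minimal Python sketch evaluating these closed forms:

```python
import math

def binary_entropy(p):
    """Binary entropy h(p) in bits, with the convention h(0) = h(1) = 0."""
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p:
    C = 1 - h(p) bits per channel use."""
    return 1.0 - binary_entropy(p)

def bec_capacity(eps):
    """Capacity of a binary erasure channel with erasure probability eps:
    C = 1 - eps bits per channel use."""
    return 1.0 - eps

print(bsc_capacity(0.11))   # close to 0.5 bits per channel use
print(bec_capacity(0.25))   # 0.75 bits per channel use
```

Note that a noiseless binary channel ($p=0$ or $\epsilon=0$) has capacity one bit per use, while a BSC with $p=1/2$ carries no information at all.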
This raises the question of whether a decoder can be designed without knowledge of the underlying channel while still guaranteeing reliable communication. This is the subject of discussion in this report. Consider a situation where a communication system is to be designed without explicit knowledge of the channel: neither the transmitter nor the receiver knows the exact channel law, but a list of possible channels is made available upfront to both entities. Can we devise a reliable communication scheme irrespective of which channel from the list is actually in effect, with neither transmitter nor receiver being aware of it? This is the central theme of what is known\footnote{Also referred to as \textit{coding for a class of channels}, as discussed in \cite{paper:blackwell}, \cite{book:Wolfowitz}.} as \textit{coding for compound channels} \cite{book:Csiszar}. Designing a reliable communication system for compound channels essentially involves building a codebook at the transmitter and a decoding rule at the receiver; the encoder-decoder pair together must guarantee reliable communication over the entire set of channels. In the general setting, no feedback is assumed from the receiver to the transmitter. Thus, a single coding strategy (codebook) must be devised and fixed prior to transmission. At the receiver end, the decoding strategy must likewise be devised independent of any knowledge of the channel. Of course, the codebook devised at the sender is perfectly known to the receiver. The answer to the question of whether such a communication system can be built is in the affirmative. In order to talk about the existence of reliable communication strategies, one must first identify the maximum possible rate, the capacity. The highest achievable rate in the compound setting is known as the \textit{compound capacity} or capacity of the family. This is analogous to the famous Shannon capacity being the ultimate limit on the achievable rate for a single channel.
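To make the notion precise, for a compound set $\mathcal{W}$ of discrete memoryless channels the compound capacity admits the well known single letter characterization (see, e.g., \cite{book:Csiszar})
\begin{equation*}
C(\mathcal{W}) \;=\; \max_{P} \, \inf_{W \in \mathcal{W}} I(P,W),
\end{equation*}
where the maximization is over input distributions $P$ and $I(P,W)$ denotes the mutual information between the input and output of channel $W$ under input distribution $P$. The single input distribution $P$ must be simultaneously good for every channel in the family, which is why $C(\mathcal{W})$ can be strictly smaller than the capacity of each individual channel in $\mathcal{W}$.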
The capacity of a compound set of memoryless channels was studied by Blackwell et al.\ in \cite{paper:blackwell}, and that of linear dispersive channels was investigated by Root and Varaiya in \cite{paper:root:varaiya}. Lapidoth and Telatar looked at the compound capacity for families of channels with memory, more specifically finite state channels \cite{paper:lapidoth-telatar1998}; as a special case, they also derived the compound channel capacity of a class of Gilbert-Elliot channels. A decoder which operates without knowledge of the channel in this setup is called a universal decoder. It is known that the Maximum Mutual Information (MMI) decoder proposed by Goppa is universal. An MMI decoder computes the empirical mutual information between the received word and each codeword in the codebook, and declares the best matching codeword as its estimate. The complexity of the MMI decoder remains unchanged even if structured codes are used. This motivates the question of whether we can build a universal decoder with more structure. Decoding rules of an additive nature have been considered in the literature as a potential scheme, and our work has been driven by this line of thought. In this work, we focus on the class of discrete memoryless channels (DMC), and more specifically on binary memoryless channels. We show that it is possible to build a linear universal decoder for any compound set of binary memoryless channels. The recent introduction of Polar codes motivated us to look into their suitability for the compound channel setup, and we have carried out a preliminary investigation in this direction. While it is not clear whether Polar codes are universal under the optimum universal decoding, we find that they are universal for certain restricted classes of compound BMC sets.
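The MMI rule described above can be sketched in a few lines of Python. This is only an illustration on a toy codebook of our choosing, not an implementation from the literature: for each codeword, the decoder computes the mutual information of the joint empirical distribution (joint type) of the codeword and the received word, and returns the index of the maximizer.

```python
import math
from collections import Counter

def empirical_mi(x, y):
    """Mutual information (in bits) of the joint type of the pair (x, y)."""
    n = len(x)
    joint = Counter(zip(x, y))   # joint symbol counts
    px = Counter(x)              # marginal counts of x
    py = Counter(y)              # marginal counts of y
    mi = 0.0
    for (a, b), c in joint.items():
        # (c/n) * log2( (c/n) / ((px[a]/n) * (py[b]/n)) )
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi

def mmi_decode(codebook, y):
    """MMI decoder: index of the codeword maximizing empirical MI with y."""
    return max(range(len(codebook)), key=lambda i: empirical_mi(codebook[i], y))

# Toy codebook of balanced length-6 binary codewords (chosen for illustration).
codebook = [(0, 0, 1, 1, 0, 1),
            (1, 1, 0, 1, 0, 0),
            (0, 1, 0, 0, 1, 1)]
print(mmi_decode(codebook, (0, 0, 1, 1, 0, 1)))  # → 0
```

Note that the rule never consults a channel law; it only compares empirical statistics, which is precisely what makes it universal. The drawback visible in the sketch is the exhaustive loop over the codebook, which is what motivates the search for universal decoders with more structure.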