We consider the problem of multicasting information from a source to a set of receivers over a network where intermediate network nodes perform randomized linear network coding operations on the source packets. We propose a channel model for the noncoherent network coding introduced by Koetter and Kschischang in [6], that captures the essence of such a network operation, and calculate the capacity as a function of network parameters. We prove that use of subspace coding is optimal, and show that, in some cases, the capacity-achieving distribution uses subspaces of several dimensions, where the employed dimensions depend on the packet length. This model and the results also allow us to give guidelines on when subspace coding is beneficial for the proposed model and by how much, in comparison to a coding vector approach, from a capacity viewpoint. We extend our results to the case of multiple source multicast that creates a virtual multiple access channel.