Orthographic images are an efficient and economical way to represent aerial imagery. They make it possible to measure two-dimensional objects and to relate these measurements to Geographic Information Systems. This paper deals with the computation of a true orthographic image from a set of overlapping perspective images. These images, together with their internal and external calibration, are the only input to our approach. Requiring so little input is a major advantage over systems that need a digital surface model (DSM), e.g. one provided by LIDAR data. We use a Bayesian approach and define a generative model of the input images. In this model, the input images are regarded as noisy measurements of an underlying, unknown true orthoimage. These measurements are obtained by an image formation process (the generative model) that involves, apart from the true orthoimage, several additional parameters. Our goal is to invert the image formation process by estimating the parameters that make our input images most likely. We present results on aerial images of a complex urban environment.
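
As a rough illustration of the generative model described above, the estimation problem can be sketched as follows. The notation is assumed here and is not taken from the paper: $O$ denotes the unknown true orthoimage, $I_1, \dots, I_n$ the input perspective images, $\theta$ the additional parameters of the image formation process, and $W_i$ a hypothetical mapping that renders $O$ into view $i$ using the given calibration. The additive Gaussian pixel noise and the conditional independence of the input images given $O$ and $\theta$ are simplifying assumptions made only for this sketch.

```latex
% Sketch only: the symbols, the warp W_i and the noise model are assumptions.
\begin{align}
  I_i(x) &= W_i(O; \theta)(x) + \varepsilon_i(x),
  \qquad \varepsilon_i(x) \sim \mathcal{N}(0, \sigma^2),
  \qquad i = 1, \dots, n, \\
  (\hat{O}, \hat{\theta}) &= \arg\max_{O,\, \theta} \;
  p(O, \theta) \prod_{i=1}^{n} p(I_i \mid O, \theta).
\end{align}
```

With a flat prior $p(O, \theta)$, the second line reduces to choosing the orthoimage and parameters under which the input images are most likely.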