In modern traffic management, Video Image Detection Systems (VIDS) are becoming increasingly important as traffic sensors. They are becoming more affordable and, unlike the commonly used induction loops, require no road construction. Furthermore, because they monitor a wide area, they potentially allow a whole new set of traffic parameters to be derived: source-destination relations, queue lengths, travel times, and general event detection such as atypical movements, accidents, blockages and congestion. Additionally, using more than one camera either enlarges the surveillance area or increases detection accuracy through redundant observations. However, to take advantage of a multi-camera system, the observations from the different cameras have to be fused. In the setup presented here, a geometric fusion is proposed: the observations are projected into a common geo-referenced coordinate frame. The basic requirement for this transformation is knowledge of the interior and exterior orientation of every camera. Three approaches for determining the exterior orientation have been implemented, namely a Newton method, a least squares adjustment based on ground control points, and a method based on line features. In addition, the direct linear transformation and minimal space resection are applied to compute initial estimates. These algorithms are evaluated in depth with respect to their application in traffic monitoring.
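The geometric fusion described above amounts to mapping each image observation into a common ground frame using the camera's interior and exterior orientation. As a minimal sketch (not the paper's implementation; the function name, the pinhole model, and the flat ground plane Z = 0 are assumptions introduced here), the following intersects the viewing ray of a pixel with the ground plane, given the calibration matrix K and the exterior orientation as a rotation R (world to camera) and projection centre C:

```python
def pixel_to_ground(u, v, K, R, C):
    """Project pixel (u, v) onto the ground plane Z = 0.

    K: 3x3 calibration matrix (interior orientation).
    R: 3x3 rotation mapping world coordinates to camera coordinates.
    C: projection centre in world (geo-referenced) coordinates.
    Returns the [X, Y] ground coordinates of the observation.
    """
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]
    # Viewing-ray direction in the camera frame: K^-1 * [u, v, 1]^T
    d_cam = [(u - cx) / fx, (v - cy) / fy, 1.0]
    # Rotate the direction into the world frame: d = R^T * d_cam
    d = [sum(R[k][i] * d_cam[k] for k in range(3)) for i in range(3)]
    # Ray X = C + s * d; solve C_z + s * d_z = 0 for the ground intersection
    s = -C[2] / d[2]
    return [C[0] + s * d[0], C[1] + s * d[1]]
```

In a multi-camera setup, applying this transformation per camera places all detections in the same geo-referenced frame, so observations of the same vehicle can be associated and fused geometrically.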