Vision-based inertial-aided navigation is gaining ground due to its many potential applications. In previous decades, the integration of vision and inertial sensors was monopolised by the defence industry because of its complexity and prohibitive cost. With subsequent technological advances, high-quality hardware and computing power became accessible for the investigation and realisation of various applications. In this thesis, a mapping system based on vision-aided inertial navigation was developed for areas where GNSS signals are unavailable, for example indoors, in tunnels, urban canyons, and forests. Within this framework, a methodology for the integration of vision and inertial sensors was presented, analysed and tested for the case where the only information available at the start is a set of features with known coordinates (and no access to GNSS signals), thus employing the method of Simultaneous Localisation And Mapping (SLAM). SLAM is a term used in the robotics community to describe the problem of mapping the environment while simultaneously using this map to determine (or help determine) the location of the mapping device. In addition, a link between the robotics and geomatics communities was established, briefly outlining the similarities and differences in how each handles the navigation and mapping problem. Despite the many differences, the goal is common: developing a navigation and mapping system that is not bound by the limits imposed by the sensors used. Classically, terrestrial robotics SLAM is approached using laser scanners to locate the robot relative to a structured environment and to map that environment at the same time. However, outdoor robotics SLAM is not feasible with laser scanners alone because of the roughness of the environment and the absence of simple geometric features. Recently, the use of visual methods integrated with inertial sensors has gained interest in the robotics community.
These visual methods rely on one or more cameras (or video) and make use of a single Kalman filter with a state vector containing both the map and the robot coordinates. This concept introduces high non-linearity and complications into the filter, which then needs to run at high rates (more than 20 Hz) with simplified navigation and mapping models. In this study, SLAM is developed using the geomatics engineering approach. Two filters are used in parallel: a Least-Squares Adjustment (LSA) for determining feature coordinates and a Kalman Filter (KF) for navigation correction. To this end, a mobile mapping system independent of GNSS is introduced, employing two CCD cameras (one metre apart) and one IMU. Conceptually, the outputs of the LSA photogrammetric resection (position and orientation) are used as the external measurements for the inertial KF. The filtered position and orientation are subsequently employed in the photogrammetric intersection to map the surrounding features, which serve as control points for the resection at the next epoch. In this manner, the KF takes the form of a navigation-only filter, with a state vector containing the corrections to the navigation parameters. The mapping and localisation can therefore be updated at low rates (1 to 2 Hz) with more complete modelling. Results show that the method is feasible, with limitations arising from the quality of the images and the number of features used. Although simulation showed that (depending on the image geometry) the features' coordinates can be determined with an accuracy of 5-10 cm for objects at distances of up to 10 metres, in practice this was not achieved with the employed hardware and pixel measurement techniques. Navigational accuracy likewise depends on the quality of the images and on the number and accuracy of the points used in the resection.
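One epoch of the resection–KF–intersection loop described above can be sketched as follows. This is a minimal illustration under simplifying assumptions: the resection and intersection steps are treated as black boxes, the error state is taken as six pose corrections (three position, three orientation) observed directly, and the function name `kf_update` and all numerical values are hypothetical, not the thesis implementation.

```python
import numpy as np

def kf_update(x, P, z, H, R):
    """Kalman filter measurement update for the navigation-only filter.

    x : error state -- corrections to the navigation parameters
    P : error-state covariance
    z : measurement -- difference between the LSA resection pose
        (position and orientation) and the INS-predicted pose
    H : measurement matrix mapping the error state to the measurement
    R : measurement noise covariance (from the resection LSA)
    """
    S = H @ P @ H.T + R               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x = x + K @ (z - H @ x)           # corrected error state
    P = (np.eye(len(x)) - K @ H) @ P  # updated covariance
    return x, P

# One epoch of the two-filter loop (hypothetical flow):
# 1. LSA resection: camera pose from the known control points (not shown).
# 2. KF update: correct the INS prediction with the resection pose.
# 3. LSA intersection with the corrected pose: coordinates of new
#    features, used as control points for the resection at the next epoch.
n = 6                                  # 3 position + 3 orientation corrections
x = np.zeros(n)                        # predicted error state
P = np.eye(n)                          # predicted covariance
z = np.array([0.1, 0.0, 0.0, 0.0, 0.0, 0.0])  # resection pose minus INS pose
H = np.eye(n)                          # pose observed directly
R = 0.01 * np.eye(n)                   # resection measurement noise
x, P = kf_update(x, P, z, H, R)
```

Because the KF here carries only navigation corrections (not the map), its state stays small and the update can run at the low 1-2 Hz rate mentioned above, while the map itself lives in the LSA.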
More than 25 points are needed to achieve centimetre accuracy from resection, and they must lie within 10 metres of the cameras; otherwise the resection output will be of insufficient accuracy and the quality of the subsequent integration deteriorates. SLAM performance is strongly affected by the initial conditions, namely the method of IMU initialisation and the a priori assumptions on the error distribution. The geometry of the system furthermore constrains the possible applications. To conclude, the development consisted of establishing a mathematical framework and implementing methods and algorithms for a novel integration methodology between vision and inertial sensors. The implementation and validation of the software presented the main challenges, and the system can be considered the first of its kind in that all components were developed from scratch, with no pre-existing modules. Finally, simulations and practical tests were carried out, from which initial conclusions and recommendations were drawn to build upon. It is the author's hope that this work will stimulate others to investigate this interesting problem further, taking into account the conclusions and recommendations sketched herein.