Understanding what elements of our visual environment guide our attention would be a crucial asset for design. In architecture, this knowledge could influence the position and size of various components (e.g., windows) to support ergonomic, safe, and emotionally compelling spaces. Saliency maps are theoretical predictions of human viewing patterns: they tell us where people devote their perceptual and cognitive resources when they look at a given scene. These maps and their underlying models have been developed and improved over the past 20 years as a result of increasing capabilities in eye-tracking technologies and advances in vision science, neuroscience, and computer science. We can distinguish two broad families of models corresponding to different approaches to visual attention: (1) ‘bottom-up’ models that focus on the most conspicuous image regions (e.g., contrasts, colors, orientations), and (2) ‘top-down’ models that include feature recognition and information seeking relevant to an ongoing behavior, task, or goal. Top-down models are the most recent, and many of them rely on deep learning; they have proven to be powerful in terms of prediction (Borji, 2018). In parallel, experimental procedures have moved from pictures to virtual reality (VR) scenes, video, and real environments. These developments offer new ways to understand how people experience and decode space. Yet the application of saliency mapping to architectural design and daylight assessment remains largely unexplored. The objective of this study is to bridge this gap and to test, through a pilot study, the applicability of saliency models to architectural daylit scenes. Our study is based on a dataset of head-tracking logs originally collected for a study on visual pleasantness, interest, and excitement (Rockcastle et al., 2017).
We used this dataset because of the richness of the scenes explored, and because eye and head movements have been shown to correlate well with each other in VR (Sitzmann et al., 2018). We obtained ground-truth fixation maps using a method developed for head-mounted VR logs (Upenik and Ebrahimi, 2017), and compared these maps against a saliency model developed for VR (Monroy et al., 2018). Findings reveal an equator bias in both the head-tracking logs and the predictions, as well as common fixation regions but also discrepancies between ground truth (actual fixations) and predictions. Limitations include the use of computer-generated black-and-white scenes with no objects present in them, and a testing procedure that did not match the conventional experimental set-up of saliency studies. Despite these differences, saliency mapping was found to be a promising tool to aid architecture and daylight design. The study ends with a set of guidelines for the design of future experiments.
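The comparison between ground-truth fixation maps and predicted saliency maps is commonly quantified with standard similarity metrics; one widely used option is the Pearson correlation coefficient (CC), computed after normalizing both maps. The study above does not specify its exact metric, so the following is a hedged Python sketch of how such a map-to-map comparison could be carried out; the map shapes and data are purely illustrative.

```python
import numpy as np

def correlation_coefficient(ground_truth, prediction):
    """Pearson correlation (CC) between two saliency/fixation maps.

    Both maps are normalized to zero mean and unit variance before
    comparison, a common convention in saliency benchmarking.
    Returns a value in [-1, 1]; higher means better agreement.
    """
    gt = (ground_truth - ground_truth.mean()) / ground_truth.std()
    pred = (prediction - prediction.mean()) / prediction.std()
    return float((gt * pred).mean())

# Illustrative example with synthetic maps (hypothetical data, not
# the study's dataset). The 64x128 shape mimics a low-resolution
# equirectangular panorama, where an equator bias would appear as
# density concentrated in the middle rows.
rng = np.random.default_rng(0)
gt_map = rng.random((64, 128))
pred_map = gt_map + 0.5 * rng.random((64, 128))  # a partially correlated "prediction"
print(correlation_coefficient(gt_map, pred_map))
```

A perfect prediction yields a CC of 1.0, while an unrelated map yields a value near 0; benchmarks typically report CC alongside other metrics (e.g., AUC or NSS) because each is sensitive to different kinds of prediction error.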