Recent progress in computational photography has shown that near-infrared (NIR) information can be acquired, in addition to the normal visible (RGB) information, with only a slight modification to a standard digital camera. In this thesis, we study whether this extra channel can improve one of the more difficult computer vision tasks: scene understanding. Because NIR radiation lies close to the visible spectrum, NIR images share many properties with visible images. However, since reflection in the NIR part of the spectrum is material dependent, such images reveal different characteristics of the scene. In this work, we study how to effectively exploit these differences to improve scene recognition and semantic segmentation performance. An initial psychophysical test that we carried out gave promising evidence that humans understand the content of a scene more effectively when presented with the NIR image rather than the visible image. Motivated by this, we first formulate a novel framework that incorporates NIR information into a low-level segmentation algorithm to better detect the material boundaries of an object. This goal is achieved by first forming an illumination-invariant representation, i.e., the intrinsic image, and then exploiting the material-dependent properties of NIR images. Secondly, by leveraging state-of-the-art segmentation frameworks and a novel manually segmented image database, we study how best to incorporate the specific characteristics of the NIR response into high-level semantic segmentation tasks. We show through extensive experimentation that introducing NIR information significantly improves automatic labeling performance for certain object classes, such as fabrics and water, whose response in the NIR domain is particularly discriminative. We then thoroughly discuss the results with respect to both the physical properties of the NIR response and the characteristics of the segmentation framework.