This cross-validation study evaluates the performance and robustness of twenty-two established and newly proposed glare prediction metrics. Experimental datasets of daylight-dominated workplaces in office-like test rooms were collected from studies by seven research groups in six different locations (Argentina, Germany, Denmark, Israel, Japan and the USA). The variability in experimental setups, locations and research teams allowed a reliable evaluation of the performance and robustness of glare metrics for daylight-dominated workplaces. The results show that several metrics are reliable, but also that purely empirically derived metrics are less robust than metrics that account for the saturation effect. In this study, the Daylight Glare Probability (DGP) delivered the highest performance among the tested metrics and was also found to be the most robust. This is the first study in the domain of daylight glare research that combines and evaluates experimental data from independent research groups.