This work studies and characterizes topology changes between resulting segmentation masks and their reference masks in video sequences. In particular, we examine the impact of artifacts commonly found in video object segmentation applications, such as added regions and holes, both individually and in combination. Added-region and hole artifacts are synthetically generated and inserted into a segmentation mask. We performed a psychophysical experiment in which human subjects rated the annoyance of the generated artifacts when presented alone or in combination. The results show how objective metrics can be derived for the individual artifacts and how, for a specific video content, an overall objective metric can be predicted by linearly combining the individual segmentation errors.
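
As an illustration of the linear combination mentioned above, the following minimal sketch fits one weight per artifact type by ordinary least squares, so that the overall annoyance is approximated by a weighted sum of the individual errors. It is only a sketch under stated assumptions: the function names (`fit_linear_combination`, `predict_overall`), the absence of an intercept term, and the numbers in `hypothetical_errors` and `hypothetical_overall` are invented for illustration and are not taken from the experiment.

```python
def fit_linear_combination(individual_errors, overall_scores):
    """Return least-squares weights w with overall ~ sum_j w[j] * error[j].

    individual_errors: list of rows, one per test sequence; each row holds
        the individual error values (e.g. [added_region, hole]).
    overall_scores: list of overall annoyance values, one per row.
    Assumption: no intercept term; the source does not specify the model form.
    """
    n = len(individual_errors[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(row[i] * row[j] for row in individual_errors)
            for j in range(n)] for i in range(n)]
    xty = [sum(row[i] * y for row, y in zip(individual_errors, overall_scores))
           for i in range(n)]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("individual errors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def predict_overall(weights, errors):
    """Predict overall annoyance as the weighted sum of individual errors."""
    return sum(w * e for w, e in zip(weights, errors))


# Hypothetical data (NOT from the experiment): each row is
# [added-region error, hole error] for one test sequence.
hypothetical_errors = [[10.0, 0.0], [0.0, 20.0], [10.0, 20.0], [30.0, 10.0]]
hypothetical_overall = [6.0, 6.0, 12.0, 21.0]

weights = fit_linear_combination(hypothetical_errors, hypothetical_overall)
prediction = predict_overall(weights, [20.0, 10.0])
```

Because the abstract states that the combination holds for a specific video content, the weights in such a sketch would be fitted separately for each content. With real subjective data, a library routine such as `numpy.linalg.lstsq` would be a more robust choice than the hand-written solver used here.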