A method and device for predicting faults in a distributed heterogeneous IT system (100), the method comprising: creating a local checkpoint (19) in an explorer node (10) of said system (100), said local checkpoint (19) reflecting the state of said explorer node (10); and running a path exploration engine (14) on said local checkpoint (19) in order to predict faults, wherein a plurality of possible inputs (71) are used by said path exploration engine (14) in order to explore different paths, wherein path exploration comprises sending messages to remote client nodes (20) and receiving messages from said remote client nodes (20), and wherein said received messages do not reveal checkpoints of said remote client nodes (20), so as to avoid leakage of any confidential information.
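
The steps of the method can be illustrated with a minimal sketch. It is not the claimed implementation: the class names `ExplorerNode` and `RemoteClient`, the message fields, the numeric inputs and the toy fault condition (a rejected request) are all assumptions made for illustration. The sketch shows only the flow described above: the explorer node copies its own state into a local checkpoint, each possible input is explored on a fresh copy of that checkpoint, and the remote client answers with plain messages that carry no checkpoint or internal state.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class RemoteClient:
    """Hypothetical remote client node: its state stays private."""
    _balance: int = 10  # internal state, never sent to the explorer

    def handle(self, message: dict) -> dict:
        # Reply with a plain message; no checkpoint or state is included.
        accepted = message["amount"] <= self._balance
        return {"type": "reply", "accepted": accepted}


@dataclass
class ExplorerNode:
    """Hypothetical explorer node with a small local state."""
    state: dict = field(default_factory=lambda: {"pending": 0})

    def create_local_checkpoint(self) -> dict:
        # The checkpoint reflects the explorer's own state only.
        return copy.deepcopy(self.state)


def explore(checkpoint: dict, inputs: list, client: RemoteClient) -> list:
    """Explore one path per possible input, starting from the checkpoint."""
    predicted_faults = []
    for value in inputs:
        state = copy.deepcopy(checkpoint)  # isolate each explored path
        reply = client.handle({"type": "request", "amount": value})
        if reply["accepted"]:
            state["pending"] += value
        else:
            # Toy fault condition (assumption): a rejected request is a fault.
            predicted_faults.append(value)
    return predicted_faults


explorer = ExplorerNode()
checkpoint = explorer.create_local_checkpoint()
faults = explore(checkpoint, [5, 10, 15], RemoteClient())
```

Because every path runs on a copy of the checkpoint, the live state of the explorer node is left untouched by the exploration, and the explorer learns about the remote client only through the content of its replies.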