The design, study, and control of mixed animal-robot societies is a field of scientific exploration that opens new opportunities for studying and controlling groups of social insects and animals and, in particular, for improving the welfare and breeding conditions of domestic animals. Our long-term objective is to develop a mobile robot that is socially accepted by chickens and able to interact with them through appropriate communication channels. To interact, the robot has to know the positions of all birds in the experimental area and detect which of them are uttering calls. In this paper, we present an audio-visual approach to locating robots and animals in a scene and detecting their calling activity. Visual tracking is provided by a marker-based tracker using an overhead camera. Sound localization is achieved by beamforming with an array of sixteen microphones. Visual and sound information are combined probabilistically to detect calling activity. The experimental results demonstrate that our system is capable of detecting the sound emission activity of multiple moving robots with 90% probability.
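To make the audio-visual idea concrete, the following minimal sketch shows one way beamforming output could be combined with visually tracked positions. It is an illustration under stated assumptions, not the system described in this paper: it uses a plain frequency-domain delay-and-sum beamformer, a 2-D free-field model without attenuation or reverberation, and a simple normalization of steered power as the calling probability. All names (`steered_power`, `calling_probabilities`, the grid layout, the sampling rate) are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steered_power(signals, mic_xy, point_xy, fs):
    """Delay-and-sum output power when the array is steered at point_xy."""
    n_mics, n_samples = signals.shape
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    # Propagation delay from the candidate point to each microphone.
    delays = np.linalg.norm(mic_xy - point_xy, axis=1) / SPEED_OF_SOUND
    # Advance each channel by its delay so a source at point_xy adds coherently.
    aligned = spectra * np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    beam = aligned.sum(axis=0) / n_mics
    return float(np.sum(np.abs(beam) ** 2))


def calling_probabilities(signals, mic_xy, tracked_xy, fs, prior=None):
    """Normalize steered power over tracked agents, weighted by an optional prior."""
    powers = np.array([steered_power(signals, mic_xy, p, fs) for p in tracked_xy])
    if prior is None:
        prior = np.full(len(tracked_xy), 1.0 / len(tracked_xy))
    weighted = powers * np.asarray(prior)
    return weighted / weighted.sum()


# Synthetic check (all values are assumptions): 16 microphones on a 4 x 4 grid
# and three visually tracked agents, of which agent 1 emits a broadband call.
rng = np.random.default_rng(0)
fs = 16000
gx, gy = np.meshgrid(np.linspace(0.0, 3.0, 4), np.linspace(0.0, 3.0, 4))
mic_xy = np.column_stack([gx.ravel(), gy.ravel()])  # shape (16, 2)
tracked_xy = np.array([[0.5, 0.7], [2.2, 1.1], [1.4, 2.6]])

source = rng.standard_normal(2048)
freqs = np.fft.rfftfreq(source.size, d=1.0 / fs)
true_delays = np.linalg.norm(mic_xy - tracked_xy[1], axis=1) / SPEED_OF_SOUND
# Delay the source per microphone with a (circular) frequency-domain shift.
shifted = np.fft.rfft(source)[None, :] * np.exp(
    -2j * np.pi * freqs[None, :] * true_delays[:, None]
)
signals = np.fft.irfft(shifted, n=source.size, axis=1)

probs = calling_probabilities(signals, mic_xy, tracked_xy, fs)
caller = int(np.argmax(probs))  # index of the most likely calling agent
```

Steering the beamformer only at the visually tracked positions, rather than scanning the whole area, keeps the computation small and ties each acoustic score directly to a known agent. A real deployment would also need a silence threshold and a noise model, which this sketch omits.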