This work tackles the problem of detecting and matching objects in scenes observed simultaneously by fixed and mobile cameras. No calibration between the cameras is needed, and no training data is used. A fully automated system is presented to detect whether an object observed by a fixed camera is seen by a mobile camera, and to localize it in the mobile camera's image plane, using only the observations from the fixed camera to describe the object. An object descriptor based on grids of region descriptors is applied in a cascaded manner. The fixed and mobile cameras collaborate to confirm detections: regions detected in the mobile camera are validated by solving the dual problem, i.e., finding their most similar regions in the fixed camera and checking whether these coincide with the object of interest. Experiments show that objects are successfully detected even under significant differences in image quality, illumination, and viewpoint between the cameras. Qualitative and quantitative results are presented for indoor and outdoor urban scenes.
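To make the cascaded grid-of-region-descriptors idea concrete, the sketch below is a minimal, hypothetical illustration (not the paper's actual descriptor or matching pipeline, whose details are not given here). It assumes a simple per-cell grayscale-histogram descriptor and an L1 distance, and shows the cascade principle: per-cell distances are accumulated in order, and a candidate region is rejected as soon as the partial distance exceeds a threshold, so dissimilar candidates are discarded cheaply. All function names and parameters are illustrative choices.

```python
import numpy as np

def grid_descriptor(patch, grid=(4, 4), bins=8):
    """Illustrative descriptor: per-cell grayscale histograms over a grid of regions."""
    h, w = patch.shape
    gh, gw = grid
    cells = []
    for i in range(gh):
        for j in range(gw):
            cell = patch[i * h // gh:(i + 1) * h // gh,
                         j * w // gw:(j + 1) * w // gw]
            hist, _ = np.histogram(cell, bins=bins, range=(0, 256))
            cells.append(hist / max(hist.sum(), 1))  # normalized histogram
    return cells  # list of per-cell descriptors, compared in cascade order

def cascade_distance(desc_a, desc_b, reject_thresh):
    """Accumulate per-cell L1 distances; reject early once the threshold is exceeded."""
    total = 0.0
    for ca, cb in zip(desc_a, desc_b):
        total += np.abs(ca - cb).sum()
        if total > reject_thresh:  # cascade rejection: stop comparing remaining cells
            return None
    return total

def best_match(query_desc, candidate_descs, reject_thresh=10.0):
    """Return the index of the most similar candidate surviving the cascade, or None."""
    best, best_d = None, float("inf")
    for k, cand in enumerate(candidate_descs):
        d = cascade_distance(query_desc, cand, reject_thresh)
        if d is not None and d < best_d:
            best, best_d = k, d
    return best
```

For example, a horizontal-gradient patch matched against a noisy copy of itself and a flat gray patch: the flat patch is rejected partway through the cascade, while the noisy copy survives and is returned as the best match. The dual-problem validation described above would then repeat the same search in the opposite direction, from the matched region back to the fixed camera.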