Object tracking in image sequences is a key challenge in computer vision. Its goal is to follow objects that move or evolve over time while preserving the identity of each object. However, most existing approaches focus on a single class of objects and model only very simple interactions, such as the fact that different objects cannot occupy the same spatial location at the same time. They ignore that objects may interact in more complex ways. For example, in a parking lot, a person may get into a car and thus disappear from view. In this thesis, we focus on tracking interacting objects in image sequences. We show that exploiting the relationships between different objects yields more reliable tracking results. We explore a wide range of applications: tracking players and the ball in team sports, tracking cars and people in a parking lot, and tracking dividing cells in biomedical imagery.

We start by tracking the ball in team sports, which is a very challenging task because the ball is often occluded by the players. We propose a sequential approach that tracks the players first, and then tracks the ball by deciding which player, if any, is in possession of the ball at any given time. This is very different from standard approaches, which first attempt to track the ball and only then assign possession. We show that our method substantially increases performance when applied to long basketball and soccer sequences.

We then focus on simultaneously tracking interacting objects. We achieve this by formulating the tracking problem as a network-flow Mixed Integer Program, and by expressing the fact that one object can appear or disappear at the location of another in terms of linear flow constraints. We demonstrate our method on scenes involving cars and passengers, bags being carried and dropped by people, and balls being passed from one player to the next in team sports. In particular, we show that by estimating the trajectories of different types of objects jointly and globally, the presence of objects that were not initially detected from image evidence alone can be inferred from the detections of the others.

Finally, we extend our approach to dividing cells in biomedical imagery. In this case, cells interact by overlapping with each other and by giving birth to daughter cells. We propose a novel approach to automatically detecting and tracking cell populations in time-lapse images. Unlike earlier approaches that rely on linking a predetermined and potentially incomplete set of detections, we generate an overcomplete set of competing detection hypotheses. We then perform detection and tracking simultaneously by solving an integer program to find the optimal consistent subset of these hypotheses. This eliminates the need for heuristics to handle detections missed because of occlusions or complex morphology. We demonstrate the effectiveness of our approach on a range of challenging image sequences containing clumped cells and show that it outperforms state-of-the-art techniques.
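
The idea of selecting an optimal consistent subset from competing detection hypotheses can be illustrated on a toy problem. The following is a minimal sketch and not the thesis implementation: it covers a single frame only, ignores temporal linking and cell division, and replaces a proper integer-programming solver with exhaustive enumeration, which is feasible only for a handful of hypotheses. The function name `select_hypotheses`, the scores, and the conflict pairs are all hypothetical and chosen purely for illustration.

```python
from itertools import product


def select_hypotheses(scores, conflicts):
    """Solve a tiny 0-1 program by enumeration (illustrative only).

    maximize   sum_i scores[i] * x[i]
    subject to x[i] + x[j] <= 1  for every (i, j) in conflicts
               x[i] in {0, 1}
    """
    n = len(scores)
    best_value, best_x = 0.0, (0,) * n  # the empty selection is always feasible
    for x in product((0, 1), repeat=n):
        # Reject assignments that keep two mutually exclusive hypotheses.
        if any(x[i] and x[j] for i, j in conflicts):
            continue
        value = sum(s for s, keep in zip(scores, x) if keep)
        if value > best_value:
            best_value, best_x = value, x
    return [i for i, keep in enumerate(best_x) if keep], best_value


# Hypothetical toy data: hypothesis 0 explains a clump as one large cell,
# hypotheses 1 and 2 explain the same clump as two smaller cells, and
# hypothesis 3 is an unrelated detection with a negative score.
scores = [1.0, 0.75, 0.5, -0.5]
conflicts = [(0, 1), (0, 2)]  # the one-cell and two-cell readings exclude each other

selected, value = select_hypotheses(scores, conflicts)
print(selected, value)
```

In this toy instance the two-cell reading has a higher total score than the one-cell reading, so hypotheses 1 and 2 are kept and the negatively scored hypothesis 3 is dropped. In a realistic setting the same kind of objective and constraints would be handed to an integer-programming solver rather than enumerated.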