We propose a framework for tracking multiple targets, where the input is a set of candidate regions in each frame, as obtained from a state-of-the-art background segmentation module, and the goal is to recover the trajectories of the targets over time. Because of occlusions by other targets and by static objects, as well as noisy segmentation and false alarms, a foreground region may not correspond faithfully to a single target. The one-to-one assumption used in most data association algorithms is therefore not always satisfied. Our method relaxes this assumption by formulating visual tracking as the problem of finding the best spatial and temporal association of observations, namely the one that maximizes the consistency of both motion and appearance along trajectories. To avoid enumerating all possible solutions, we take a data-driven Markov Chain Monte Carlo (DD-MCMC) approach to sample the solution space efficiently. The sampling is driven by an informed proposal scheme controlled by a joint probability model combining motion and appearance. Comparative experiments with quantitative evaluations are provided.
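To make the sampling idea concrete, the following is a minimal sketch of a plain Metropolis-Hastings sampler over track labelings, not the DD-MCMC scheme proposed here. Everything in it is an assumption made for illustration: the toy 1-D observations `OBS`, the names `log_score` and `sample`, the motion-only score with a fixed same-frame penalty, and the uniform single-relabel proposal. It also keeps the one-to-one simplification (each observation receives exactly one label) and ignores appearance.

```python
import math
import random

# Hypothetical toy data: (frame, position) for two targets moving in 1-D.
OBS = [(0, 0.0), (0, 10.0), (1, 1.0), (1, 9.0), (2, 2.0), (2, 8.0)]
NUM_TRACKS = 2
SAME_FRAME_PENALTY = 100.0  # assumed cost for two observations of one track in one frame


def log_score(labels):
    """Unnormalized log-probability of a labeling (motion consistency only)."""
    total = 0.0
    for k in range(NUM_TRACKS):
        # Observations assigned to track k, ordered by frame.
        track = sorted(OBS[i] for i, lab in enumerate(labels) if lab == k)
        for (f0, x0), (f1, x1) in zip(track, track[1:]):
            if f0 == f1:
                total -= SAME_FRAME_PENALTY
            else:
                # Penalize squared velocity between consecutive observations.
                total -= ((x1 - x0) / (f1 - f0)) ** 2
    return total


def sample(num_iters=2000, seed=0):
    """Metropolis-Hastings with a symmetric proposal: relabel one random observation."""
    rng = random.Random(seed)
    labels = [rng.randrange(NUM_TRACKS) for _ in OBS]
    current = log_score(labels)
    best, best_score = list(labels), current
    for _ in range(num_iters):
        proposal = list(labels)
        proposal[rng.randrange(len(OBS))] = rng.randrange(NUM_TRACKS)
        new = log_score(proposal)
        # Symmetric proposal, so the acceptance probability is min(1, exp(new - current)).
        if new >= current or rng.random() < math.exp(new - current):
            labels, current = proposal, new
            if current > best_score:
                best, best_score = list(labels), current
    return best, best_score


if __name__ == "__main__":
    print(sample())
```

With a blind proposal of this kind the chain can stay in a poor local mode for a long time, because swapping two labels within a frame requires passing through a heavily penalized intermediate state. This is the kind of inefficiency that an informed, data-driven proposal is meant to reduce.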