To take advantage of both stereo cameras and radar, this paper proposes a fusion approach that accurately estimates the location, size, pose, and motion of a threat vehicle relative to a host vehicle from observations obtained by both sensors. We first fit the contour of the threat vehicle from stereo depth information and find the point on the contour closest to the vision sensor. The fused closest point is then obtained by fusing the radar observations with the vision closest point. Next, the fitted contour is translated to the fused closest point to obtain the fused contour. Finally, the fused contour is tracked under rigid-body constraints to estimate the location, size, pose, and motion of the threat vehicle. Experimental results on both synthetic data and real-world road test data demonstrate the effectiveness of the proposed algorithm.
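
The closest-point, fusion, and translation steps can be illustrated with a minimal sketch. It is not the paper's implementation: the polyline contour representation, the per-axis inverse-variance fusion rule, the variance values, and all function names (`closest_point_on_contour`, `fuse_points`, `translate_contour`) are assumptions chosen for illustration, and the tracking step is omitted.

```python
from math import hypot


def closest_point_on_contour(contour, sensor=(0.0, 0.0)):
    """Return the point on a polyline contour nearest to the sensor position."""
    best, best_dist = None, float("inf")
    for (x1, y1), (x2, y2) in zip(contour, contour[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len2 = dx * dx + dy * dy
        if seg_len2 == 0.0:
            t = 0.0  # degenerate segment: use its single point
        else:
            # Project the sensor onto the segment and clamp to its endpoints.
            t = ((sensor[0] - x1) * dx + (sensor[1] - y1) * dy) / seg_len2
            t = max(0.0, min(1.0, t))
        px, py = x1 + t * dx, y1 + t * dy
        dist = hypot(px - sensor[0], py - sensor[1])
        if dist < best_dist:
            best, best_dist = (px, py), dist
    return best


def fuse_points(vision_pt, radar_pt, vision_var, radar_var):
    """Fuse two 2-D points per axis by inverse-variance weighting (assumed rule)."""
    fused = []
    for v, r, var_v, var_r in zip(vision_pt, radar_pt, vision_var, radar_var):
        w_vision = var_r / (var_v + var_r)  # lower variance gets higher weight
        fused.append(w_vision * v + (1.0 - w_vision) * r)
    return tuple(fused)


def translate_contour(contour, from_pt, to_pt):
    """Rigidly shift the contour so that from_pt lands on to_pt."""
    dx, dy = to_pt[0] - from_pt[0], to_pt[1] - from_pt[1]
    return [(x + dx, y + dy) for x, y in contour]


# Hypothetical L-shaped contour (rear and side of a vehicle), host sensor at origin.
vision_contour = [(-1.0, 10.0), (1.0, 10.0), (1.0, 14.0)]
vision_closest = closest_point_on_contour(vision_contour)
# Hypothetical radar point and variances (x = lateral, y = range); here the
# vision is assumed more certain laterally and the radar more certain in range.
fused_closest = fuse_points(vision_closest, (0.4, 9.0), (1.0, 3.0), (3.0, 1.0))
fused_contour = translate_contour(vision_contour, vision_closest, fused_closest)
```

In this sketch the contour keeps the shape estimated from stereo depth, while its position is corrected by the fused closest point; the choice of variances determines how far each axis is pulled toward the radar observation.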