For distributed smart camera networks to perform vision-based tasks such as subject recognition and tracking, every camera's position and orientation relative to a single 3-D coordinate frame must be accurately determined. In this paper, we present a new camera network localization solution that requires successively showing a 3-D feature point-rich target to all cameras. Using the known geometry of the 3-D target, cameras then estimate and decompose projection matrices to compute their position and orientation relative to the coordinate frame defined by the 3-D target's feature points. As each 3-D target position establishes a distinct coordinate frame, cameras that view more than one 3-D target position compute the translations and rotations relating those positions' coordinate frames and share the transform data with neighbors to facilitate realignment of all cameras to a single coordinate frame. Compared to other localization solutions that use opportunistically found visual data, our solution is more suitable for battery-powered, processing-constrained camera networks because it requires communication only to determine simultaneous target viewings and to pass transform data. Additionally, our solution requires only pairwise view overlaps of sufficient size to see the 3-D target and detect its feature points, and it gives camera positions in meaningful units. We evaluate our algorithm in both real and simulated smart camera networks. In the real network, position error is less than 1 inch when the 3-D target's feature points fill only 2.9% of the frame area.
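The two geometric steps named above, estimating and decomposing a projection matrix and relating the coordinate frames of two target positions, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a plain direct linear transform (DLT) without coordinate normalization, an RQ decomposition built from NumPy's QR routine, and at least six non-coplanar target points with known 3-D coordinates. The function names `estimate_projection_matrix`, `decompose_projection_matrix`, and `relate_frames` are hypothetical.

```python
import numpy as np

def estimate_projection_matrix(X, x):
    """Assumed DLT step: X is (n, 3) target points, x is (n, 2) pixels, n >= 6."""
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])        # homogeneous 3-D points
    A = np.zeros((2 * n, 12))
    for i in range(n):
        u, v = x[i]
        A[2 * i, 0:4] = Xh[i]                   # p1.X - u * p3.X = 0
        A[2 * i, 8:12] = -u * Xh[i]
        A[2 * i + 1, 4:8] = Xh[i]               # p2.X - v * p3.X = 0
        A[2 * i + 1, 8:12] = -v * Xh[i]
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)                 # null vector, defined up to scale

def decompose_projection_matrix(P):
    """Split P ~ K [R | -R C] into intrinsics K, rotation R, camera center C."""
    if np.linalg.det(P[:, :3]) < 0:             # fix the overall sign so det(R) = +1
        P = -P
    M = P[:, :3]
    J = np.fliplr(np.eye(3))                    # exchange matrix used to get RQ from QR
    Q, U = np.linalg.qr((J @ M).T)
    K = J @ U.T @ J                             # upper triangular factor
    R = J @ Q.T                                 # orthogonal factor
    D = np.diag(np.sign(np.diag(K)))            # force a positive diagonal on K
    K, R = K @ D, D @ R
    C = -np.linalg.solve(M, P[:, 3])            # camera center in the target's frame
    return K / K[2, 2], R, C

def relate_frames(R_a, C_a, R_b, C_b):
    """Map frame-B coordinates to frame A, given one camera's pose in both frames.

    Returns (R_ab, t_ab) such that X_a = R_ab @ X_b + t_ab.
    """
    R_ab = R_a.T @ R_b
    t_ab = C_a - R_ab @ C_b
    return R_ab, t_ab
```

Under these assumptions, a camera that sees the target at two positions would call `decompose_projection_matrix` once per position and pass both poses to `relate_frames`; a neighbor that knows its center `C_b` only in frame B could then be realigned with `R_ab @ C_b + t_ab`. Because the target's feature-point coordinates are given in physical units, the recovered centers are in those same units.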