We introduce a method that tracks an object's motion and estimates its pose directly from 2-D image sequences. The scale-invariant feature transform (SIFT) is used to extract corresponding feature points from the image sequences. We demonstrate that pose estimation from these corresponding feature points can be formulated as the solution of Sylvester's equation. We show that the proposed solution of Sylvester's equation is equivalent to the classical SVD method for 3D-3D pose estimation. However, whereas the classical SVD method cannot be used for pose estimation directly from 2-D image sequences, our method based on Sylvester's equation provides a new approach to pose estimation in this setting. Smooth video tracking and pose estimation are then obtained by using the solution of Sylvester's equation within the importance sampling density of the particle filtering framework. Finally, experiments on synthetic data and real-world videos demonstrate the effectiveness of our method, in terms of both robustness and speed, compared with other similar object tracking and pose estimation methods.
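For readers unfamiliar with it, Sylvester's equation in its standard form is AX + XB = C, with A, B, and C known and X unknown. The abstract does not state which matrices the method places in these roles, so the following is only a generic sketch with arbitrary example matrices, not the solver used in this work. The function names `gauss_solve` and `solve_sylvester` are hypothetical. The sketch flattens the equation into an ordinary linear system and solves it by Gaussian elimination, which is adequate for small matrices.

```python
def gauss_solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    size = len(rhs)
    aug = [row[:] + [b] for row, b in zip(M, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: A and -B share an eigenvalue")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            f = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (aug[r][size] - s) / aug[r][r]
    return x


def solve_sylvester(A, B, C):
    """Solve A X + X B = C for X (A is n x n, B is m x m, X and C are n x m)."""
    n, m = len(A), len(B)
    size = n * m
    # Row (i, j) of the flattened system encodes entry (i, j) of A X + X B,
    # with X stored row-major: X[k][l] sits at index k * m + l.
    M = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(m):
            row = i * m + j
            for k in range(n):
                M[row][k * m + j] += A[i][k]   # contribution of (A X)[i][j]
            for l in range(m):
                M[row][i * m + l] += B[l][j]   # contribution of (X B)[i][j]
    rhs = [C[i][j] for i in range(n) for j in range(m)]
    x = gauss_solve(M, rhs)
    return [x[i * m:(i + 1) * m] for i in range(n)]


def matmul(P, Q):
    """Plain nested-list matrix product."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


# Arbitrary example matrices (assumptions, not taken from the paper).
A = [[1.0, 2.0], [0.0, 3.0]]
B = [[4.0, 0.0], [1.0, 5.0]]
X_true = [[1.0, -1.0], [2.0, 0.5]]

# Build C from a known X, then check that the solver recovers X.
AX, XB = matmul(A, X_true), matmul(X_true, B)
C = [[AX[i][j] + XB[i][j] for j in range(2)] for i in range(2)]
X = solve_sylvester(A, B, C)
```

A unique solution exists only when A and -B have no eigenvalue in common, which is why the sketch raises an error on a singular system. For larger problems a dedicated routine such as `scipy.linalg.solve_sylvester` would be the usual choice.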