A novel method for object tracking in videos for drinking activity recognition is proposed. The query object is detected in the first video frame, from which a new query image is extracted. In each subsequent frame, the query image is compared with patches inside a region of interest defined around the position of the object detected in the previous frame. Local steering kernels are extracted from the query image and from each patch, and the similarity between the query image and each patch is measured by the matrix cosine similarity. The proposed method finds application in drinking activity recognition, where it tracks the object in use, i.e., the glass.
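The patch search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`matrix_cosine_similarity`, `crop`, `track_step`), the square search window given by `radius`, and the exhaustive scan are all assumptions made for illustration. Local steering kernel extraction is not shown; the hypothetical `features` argument stands in for it and defaults to the raw pixel values. The matrix cosine similarity is taken in its usual form, the Frobenius inner product of the two feature matrices divided by the product of their Frobenius norms.

```python
import math


def matrix_cosine_similarity(a, b):
    """Frobenius inner product of a and b over the product of their Frobenius norms."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature matrices must have the same shape")
    dot = sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    norm_a = math.sqrt(sum(x * x for row in a for x in row))
    norm_b = math.sqrt(sum(x * x for row in b for x in row))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero matrices
    return dot / (norm_a * norm_b)


def crop(frame, top, left, height, width):
    """Return the height x width patch whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def track_step(frame, query, prev_pos, radius, features=lambda patch: patch):
    """Find the patch most similar to the query near the previous position.

    frame, query: lists of lists of numbers (grayscale intensities).
    prev_pos: (top, left) corner of the object in the previous frame.
    radius: half-width of the search window (assumed parameter).
    features: placeholder for a descriptor such as local steering kernels.
    """
    h, w = len(query), len(query[0])
    rows, cols = len(frame), len(frame[0])
    q_feat = features(query)
    best_pos, best_score = prev_pos, -math.inf
    top0, left0 = prev_pos
    # Scan every patch position inside the region of interest, clipped to the frame.
    for top in range(max(0, top0 - radius), min(rows - h, top0 + radius) + 1):
        for left in range(max(0, left0 - radius), min(cols - w, left0 + radius) + 1):
            patch_feat = features(crop(frame, top, left, h, w))
            score = matrix_cosine_similarity(q_feat, patch_feat)
            if score > best_score:
                best_pos, best_score = (top, left), score
    return best_pos, best_score
```

Calling `track_step` once per frame, with the position it returned for the previous frame, gives a simple tracking loop. Restricting the scan to a window around the previous position keeps the number of similarity evaluations per frame small compared with a full-frame search.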