Abstract:
In behavior recognition, action recognition methods based on joint data achieve high overall accuracy, but their accuracy drops on similar actions that are easily confused. To address this, and because RGB video carries more detailed motion information, a hierarchical recognition framework is proposed that fuses joint data with RGB video data and is built on a spatio-temporal graph convolutional network. First, actions are classified from the joint data, with similar actions grouped together; for example, similar hand movements are assigned to a single class. Then, an optical flow network extracts the rich detail contained in the RGB video: optical flow slices centered on the joints are constructed so that they retain the structural characteristics of the human body. Finally, these data are fed into the spatio-temporal graph convolutional network to distinguish the similar actions. The advantage of the proposed framework is that action recognition is divided into two levels. The first level identifies the action quickly because the joint data contain only the coordinates of the joints. The second level separates similar actions by extracting local motion features from the RGB video with the optical flow network. Tests on the NTU dataset show that similar-action recognition reaches only 80.15% accuracy with joint data alone, compared with 87.78% with the proposed hierarchical framework. These results indicate that the framework is effective for similar-action recognition.
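The two-level flow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the patch size, the zero padding at image borders, and the rule that a coarse label without a second-level model is treated as final are all assumptions, and the two classifiers are stand-in callables rather than real graph convolutional networks.

```python
import numpy as np

def joint_centered_flow_patches(flow, joints, patch=8):
    """Crop a patch x patch optical-flow window around every joint.

    flow:   (T, H, W, 2) dense optical flow, one field per frame
    joints: (T, J, 2) joint positions in pixels, ordered (x, y)
    returns (T, J, patch, patch, 2)
    The patch size and zero padding are assumptions for illustration.
    """
    T, H, W, _ = flow.shape
    J = joints.shape[1]
    half = patch // 2
    # Zero-pad the spatial axes so windows near the border stay in bounds.
    padded = np.pad(flow, ((0, 0), (half, half), (half, half), (0, 0)))
    out = np.empty((T, J, patch, patch, 2), dtype=flow.dtype)
    for t in range(T):
        for j in range(J):
            x, y = np.clip(np.round(joints[t, j]).astype(int), 0, [W - 1, H - 1])
            # In padded coordinates the joint sits at (y + half, x + half).
            out[t, j] = padded[t, y:y + patch, x:x + patch]
    return out

def hierarchical_predict(skeleton, flow, joints, coarse_model, fine_models, patch=8):
    """First level: classify from joints; second level: refine confusable groups.

    coarse_model: callable, skeleton -> coarse label (hypothetical)
    fine_models:  dict mapping a coarse group label to a callable that takes
                  joint-centered flow patches and returns a fine label
    """
    group = coarse_model(skeleton)
    fine_model = fine_models.get(group)
    if fine_model is None:
        # Assumed: labels outside any confusable group are already final.
        return group
    return fine_model(joint_centered_flow_patches(flow, joints, patch))
```

Keeping the patches indexed by frame and joint preserves the skeleton layout, so a graph-based model could in principle consume them as per-node features.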
Published in: 2024 36th Chinese Control and Decision Conference (CCDC)
Date of Conference: 25-27 May 2024
Date Added to IEEE Xplore: 17 July 2024