Proposed model architecture, illustrating the major novel components and methodological flow
Abstract:
In action recognition from multisensory data, integrating RGB and signal-based modalities is a promising way to improve the accuracy of action classification systems. Our system was developed through experimentation on three benchmark datasets: UTD-MHAD (University of Texas at Dallas Multimodal Human Action Dataset), HWU-USP, and LaRa. The data is first preprocessed: a Gaussian filter is applied to the RGB data and a Butterworth filter to the signal data. Windowing/segmentation is then applied to both the signal and RGB data. Next, features are extracted from the signal data, including auto-regression, MFCC (Mel-frequency cepstral coefficients), and the transient detection principle, while the RGB (red, green, blue) data is processed as a combined input to extract features such as angles, velocity, full-body elliptical modeling, fiducial points, and a 2.5D point cloud of the entire body. These features are fused, and the Yeo-Johnson power optimizer is then applied to refine the data. The optimized data is subsequently classified with a Neurofuzzy classifier to recognize different actions. This classifier was chosen for its ability to adapt to the heterogeneous nature of multimodal data, where features are spread across different domains, which makes traditional classifiers less effective. The Neurofuzzy model is trained and tested with cross-validation to ensure reliable results. The results suggest that the proposed model achieves higher accuracy than existing models: a mean accuracy of 89% on the HWU-USP dataset, 91% on LaRa, and 88% on UTD-MHAD. The system effectively distinguishes related actions, but its efficiency is hindered by the complexity of individual actions and by increased noise in the dataset.
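To make two of the steps named in the abstract concrete, the following is a minimal sketch of windowing a signal and applying the Yeo-Johnson power transform to a fused feature vector. It is not the authors' implementation: the function names `sliding_windows` and `yeo_johnson`, the window size and step, the sample values, and the fixed `lmbda=0.5` are all illustrative assumptions. In practice the transform parameter is usually estimated from the data by maximum likelihood rather than fixed by hand.

```python
import math


def sliding_windows(samples, size, step):
    """Split a 1-D sequence into fixed-length, possibly overlapping windows.

    A trailing remainder shorter than `size` is dropped.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, step)]


def yeo_johnson(x, lmbda):
    """Yeo-Johnson power transform of a single value for a fixed lambda.

    Unlike Box-Cox, the transform is defined for negative inputs as well.
    """
    if x >= 0:
        if abs(lmbda) < 1e-12:
            return math.log1p(x)                      # lambda == 0 branch
        return ((x + 1.0) ** lmbda - 1.0) / lmbda
    if abs(lmbda - 2.0) < 1e-12:
        return -math.log1p(-x)                        # lambda == 2 branch
    return -((1.0 - x) ** (2.0 - lmbda) - 1.0) / (2.0 - lmbda)


# Hypothetical inertial signal (assumed values, not from any dataset).
signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
windows = sliding_windows(signal, size=4, step=2)     # 50% overlap

# Hypothetical fused feature vector; lambda is fixed here only for illustration.
fused = [3.0, 0.0, -3.0]
transformed = [yeo_johnson(v, lmbda=0.5) for v in fused]
```

With `lmbda=1.0` the transform reduces to the identity, which is a convenient sanity check before estimating the parameter from real features.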
Published in: IEEE Access (Volume: 13)