Human action recognition can be performed using multiscale salient features that encode local events in a video. Existing feature extraction methods rely on non-causal spatio-temporal filtering and are therefore not biologically plausible. To address this inconsistency, new features extracted from a biologically plausible perception model are introduced. In this model, opponent-based motion energy is computed using oriented motion filters constructed from bio-inspired time-causal filters. Salient features are then extracted from regions of interest in the motion energy map, and the resulting opponent-based motion features are used for action classification with a bag-of-words approach. Experiments on a publicly available data set (Weizmann) show 93.5% classification accuracy, an improvement over comparable methods.
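The opponent motion-energy stage described above can be sketched with a classic Adelson-Bergen-style energy model in which the temporal kernels are truncated exponentials defined only for t >= 0, so the filtering is time-causal. This is a minimal illustration under assumed parameters (kernel sizes, time constants, Gabor frequency), not the paper's exact filters.

```python
import numpy as np

def causal_temporal_filters(n_taps=15, tau=2.0):
    """Fast and slow gamma-shaped kernels, nonzero only for t >= 0
    (illustrative time-causal kernels, not the paper's exact ones)."""
    t = np.arange(n_taps, dtype=float)
    fast = (t / tau) ** 3 * np.exp(-t / tau)
    slow = (t / tau) ** 5 * np.exp(-t / tau)
    return fast / fast.sum(), slow / slow.sum()

def spatial_filters(n=21, sigma=3.0, freq=0.15):
    """Quadrature (even/odd) Gabor pair along one spatial axis."""
    x = np.arange(n) - n // 2
    env = np.exp(-x**2 / (2 * sigma**2))
    return env * np.cos(2 * np.pi * freq * x), env * np.sin(2 * np.pi * freq * x)

def opponent_motion_energy(signal_xt):
    """signal_xt: 2-D array (time, space). Returns the opponent
    (rightward minus leftward) motion energy at each (t, x) sample."""
    fast, slow = causal_temporal_filters()
    even, odd = spatial_filters()

    def filt(temporal, spatial):
        # Separable space-time filtering; the temporal pass uses a
        # 'full' convolution cropped to the past, so output at time t
        # depends only on inputs at times <= t (causality).
        tmp = np.apply_along_axis(
            lambda r: np.convolve(r, spatial, mode="same"), 1, signal_xt)
        return np.apply_along_axis(
            lambda c: np.convolve(c, temporal, mode="full")[:len(c)], 0, tmp)

    # Direction-selective quadrature pairs from sums/differences of the
    # separable responses, following the standard energy-model construction.
    a = filt(fast, even); b = filt(slow, odd)
    c = filt(fast, odd);  d = filt(slow, even)
    right = (a + b) ** 2 + (c - d) ** 2
    left  = (a - b) ** 2 + (c + d) ** 2
    return right - left

# Toy stimulus: a bar drifting rightward across 60 frames of a 64-pixel row.
frames = np.zeros((60, 64))
for t in range(60):
    frames[t, 5 + t // 2] = 1.0
energy = opponent_motion_energy(frames)
print(energy.shape)  # -> (60, 64)
```

Regions of interest would then be selected on this energy map and the features pooled into a bag-of-words histogram for classification; that downstream stage is omitted here.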