Direction-Independent Human Activity Recognition Using a Distributed MIMO Radar System and Deep Learning

Modern monostatic radar-based human activity recognition (HAR) systems perform very well as long as the direction of human activities is either toward or away from the radar. Monostatic single-input–single-output (SISO) and monostatic multiple-input–multiple-output (MIMO) radar systems cannot detect the motion of an object that moves perpendicularly to the radar’s boresight axis. Due to this physical layer limitation, today’s radar-based HAR systems fail to classify multidirectional human activities. In this article, we resolve this typical but critical physical layer problem of contemporary HAR systems. We propose a HAR system based on a distributed MIMO radar configuration, where multiple antennas of a millimeter wave (mm-wave) MIMO radar system (Ancortek SDR-KIT 2400T2R4) are distributed in an indoor environment. In our proposed HAR system, two independent and identical monostatic radar subsystems irradiate and capture the multidirectional human movement from two perspectives, which allows us to compute two distinct time-variant (TV) radial velocity distributions. A feature extraction network extracts numerous features from the measured TV radial velocity distributions, which are then fused by a multiclass classifier to detect five types of human activities. The proposed multiperspective MIMO-radar-based HAR system achieves a classification accuracy of 98.52%, which surpasses the accuracy of a SISO radar-based HAR system by more than 9 percentage points. Our approach resolves the physical layer limitations of modern HAR systems that are based on either monostatic SISO or monostatic MIMO radar systems.


I. INTRODUCTION
A. General Background
Studies have shown a considerable amount of progress in the area of human activity recognition (HAR) over the past few years [1], [2], [3], [4]. The steady interest in HAR is due to its extensive range of applications. Over the years, HAR systems have proven their usefulness in application areas such as social robotics [5], autonomous driving [6], sports [7], [8], home automation [9], healthcare [10], automated video analysis [11], and human-computer interaction [12].
The authors are with the Faculty of Engineering and Science, University of Agder, 4898 Grimstad, Norway (e-mail: sahil.waqar@uia.no; muhammad.muaaz@uia.no; matthias.paetzold@uia.no).
To date, numerous diverse sensing modalities have been adopted for the HAR task. However, each modality has its own strengths and limitations. For instance, due to the ongoing advancements in computer vision techniques, HAR systems based on vision sensors have produced remarkable results [14], [15]. However, vision sensors are often criticized because they are very susceptible to lighting conditions and occlusion, and they can violate user privacy. Wearable sensors [16], [17], [18], [19], on the other hand, despite being very effective HAR sensors, are generally criticized for being fragile, obtrusive, and vulnerable to user negligence. Also, the need to be worn indefinitely renders wearable sensors impractical and inconvenient, especially for elderly or infirm persons. In view of these shortcomings, HAR systems based on radio frequency (RF) sensing techniques have recently been preferred more and more, despite new challenges and hurdles.
Lately, many researchers have studied and eventually leaned toward Wi-Fi and radar systems for the HAR purpose [20], [21], [22], [23]. Unlike radar systems, commercial grade Wi-Fi routers have the channel frequency response with notably noisy phases [24], [25], [26], [27]. In contrast, commercial coherent radar systems conserve the phase information within their coherent processing interval (CPI) [28]. Thus, small phase variations corresponding to nonstationary scatterers in an environment can be easily processed by coherent signal processing techniques [29], [30]. This is one of the reasons why coherent radar systems have been preferred over Wi-Fi devices to capture the propagation phenomena caused by complex human activities. In the context of RF sensing, the recognition of human activity often relies on exploiting the micro-Doppler phenomenon [31], [32], [33], [34] to discern the specific type of activity being performed. Thanks to recent advancements in the areas of radar techniques and machine/deep learning, the classification and tracking of a wide range of human activities in complex environments will be within reach in a few years.

B. Problem Description
A major problem of radar-based HAR systems is their inability to generate an adequate micro-Doppler signature in a situation where a person moves perpendicularly to the radar boresight axis. Our article is a step in this direction, in which we propose a pragmatic solution to the problem of direction-independent HAR. Thus, we will look into the classification of five different types of human activities performed in different directions. Single monostatic radar-based HAR systems do not consider the direction of human motion and thus tend to fail in classifying human activities performed in different directions.
Erol and Amin [35] reported the average classification performance at different aspect angles for a human falling activity. For a human fall parallel to the radar boresight axis, i.e., at a 0° aspect angle, the classification accuracy was 96%; at a 60° aspect angle, the classification accuracy dropped to 85%; and at a 90° aspect angle (falling perpendicularly to the radar boresight axis), the classification accuracy plummeted to 45%, rendering the HAR system futile. Similarly, for six human activities, Ding et al. [36] reported a decrease in classification performance from 95.8% to 86.7% by changing the radar's viewing angle from 15° to 30°.

C. Related Work
Some of the approaches to mitigate the problem of the direction of human motion are discussed here along with their shortcomings. In [37] and [38], it was shown that by positioning a radar on the ceiling, a human falling in different directions can be detected, but the solution cannot be generalized to classify more complex human activities. To realize a direction-independent HAR, it is tempting to employ a monostatic beamforming multiple-input-multiple-output (MIMO) radar system with the capability of measuring the target's angle [39], [40]. But in practice, commercial beamforming radar systems have poor angular and cross-range resolutions due to their limited hardware resources. Thus, for applications such as short-range hand gesture sensing, where the cross-range resolution is not a concern, Molchanov et al. [41] rightly utilized the angular information of a single-input-multiple-output (SIMO) frequency-modulated continuous wave (FMCW) monopulse radar. Unfortunately, the approach cannot be extended to direction-independent HAR systems because of the radar's poor cross-range resolution. Recently, HAR systems have been realized by using three-dimensional (3-D) point cloud data generated by millimeter wave (mm-wave) monostatic MIMO radar systems [42], [43]. But 3-D point cloud data also suffer from the problem of poor cross-range resolution. For better angle estimation or, equivalently, cross-range resolution, more advanced signal processing techniques such as the "estimation of signal parameters via rotational invariance techniques (ESPRIT)" [44] and "multiple signal classification (MUSIC)" algorithms [45], [46] are usually employed, but these estimation techniques demand a high signal-to-noise ratio [47]. Alternatively, a single-input-single-output (SISO) bistatic radar system [48] is a good choice for HAR. However, an even better choice for the direction-independent HAR is multiperspective multistatic MIMO radar systems. They can provide the best multiview signatures of human activities, as we will see in this article.

D. Proposed Approach for HAR
To overcome the aforementioned issues and drawbacks of monostatic SISO, SIMO, and beamforming MIMO radar-based HAR systems, we develop a multiperspective 2 × 2 distributed MIMO radar system to realize a direction-independent HAR system. In our approach, two radar subsystems, each consisting of one transmit and one receive antenna and their own independent signal preprocessing units, are spatially distributed to irradiate the environment from different perspectives (see Section III). This multistatic MIMO radar framework enables us to detect and classify different types of human activities independent of their respective directions.
Human body segments can be modeled by N moving scatterers, which reflect back the radar signals to the radar receiver. The scatterers' distinct time-variant (TV) radial velocity components can be described by the so-called TV radial velocity distribution (see Section II). The TV radial velocity distributions at the output of the radar's signal preprocessor are in fact the input feature maps to our classifier, which is based on a deep convolutional neural network (DCNN). We use deep learning methods to automatically extract the features from the TV radial velocity distributions of the MIMO radar system to finally classify the type of human activity regardless of its direction of motion.
Conventionally, it was not uncommon to manually extract features in single-variable and joint-variable domains to classify human activities using machine learning techniques, such as support vector machine (SVM), with a well-documented classification accuracy of 90% [49]. Widely adopted conventional machine learning algorithms in conjunction with domain-based feature engineering usually have theoretical foundations and are computationally less expensive when compared to deep learning algorithms. However, manual feature engineering is quite cumbersome and requires specific expertise. Determining the relevance and significance of features for identifying specific motion artifacts is also a complicated task. Large differences in manually measured features were found in different individuals monitored for health status, body height, and habits [50]. Therefore, to account for the intricate attributes of human motion, and to overcome the aforementioned challenges associated with manual feature engineering, deep learning algorithms are preferred [51].
To train and test our SISO and MIMO radar-based direction-independent HAR classifiers (see Section V), we recorded a novel HAR dataset, where the human activities were performed in several directions in the two-dimensional (2-D) horizontal xy-plane. In this regard, we denote the recorded HAR dataset with the superscript "(2-D)" as HAR$^{(2\text{-D})}$ (see Section IV). For a conventional monostatic SISO radar-based HAR classifier, which considers human movement only along the one-dimensional (1-D) x-axis, i.e., along the monostatic radar's boresight, we denote the recorded HAR dataset accordingly by HAR$^{(1\text{-D})}$.

E. Contributions
The MIMO radar-based HAR system presented in this article is a stride forward toward actualizing more advanced RF-based HAR systems. The main contributions of the research are as follows.
1) For our direction-independent HAR system, we have addressed a critical physical layer problem of monostatic radar systems related to the target's aspect angle.
2) For a monostatic SISO and the multistatic MIMO radar configurations, we have analyzed the variations in measured channel characteristics for five types of human activities (falling, walking, standing, sitting, picking). We also studied the effects of different directions of human activities by analyzing the TV radial velocity distributions of the MIMO radar system (see Section III).
3) We composed a completely novel HAR dataset, denoted as HAR$^{(2\text{-D})}$, by using the multiperspective 2 × 2 MIMO radar configuration (see Section IV). We recorded real human activities by using a commercial mm-wave radar system known as Ancortek SDR-KIT 2400T2R4. The HAR$^{(2\text{-D})}$ dataset consists of five types of human activities performed by six different persons in several directions.
4) By using the HAR$^{(2\text{-D})}$ dataset and its subset denoted as HAR$^{(1\text{-D})}$, we have developed and analyzed three different HAR systems (see Section V): a) a SISO radar-based conventional HAR system; b) a SISO radar-based direction-independent HAR system; and c) a MIMO radar-based direction-independent HAR system. The proposed 2 × 2 MIMO radar-based HAR system is capable of recognizing human gross motor activities regardless of the aspect angle or direction of motion, and it is straightforwardly scalable to a higher number of antennas for a more complex human activity classification task.
5) For the three HAR systems, we accordingly designed three different DCNN-based multiclass classifiers. The DCNN classifiers extract features automatically from the radar's TV radial velocity distribution before classifying an activity. For the distributed MIMO radar-based classifier, feature level fusion has been adopted, which virtually combines the target's information from different aspect angles, and thereby eradicates the limitations that emerge due to the direction of motion.
6) The classification performances of the three HAR systems have been assessed and compared quantitatively. It is shown that the proposed HAR system, based on the multiperspective 2 × 2 MIMO radar framework, improves the classification accuracy of the monostatic SISO radar-based HAR system from 88.98% to 98.52%.

F. Article Organization
The article organization is as follows. Section II describes the MIMO radar system model and the deep learning methods that are utilized in this research. A critical problem of modern SISO and monostatic MIMO radar-based HAR systems and its solution are discussed in Section III. The data acquisition campaign is described in Section IV. In Section V, a conventional and a direction-independent SISO radar-based HAR system, as well as a direction-independent MIMO radar-based HAR system, are presented. Lastly, Section VI draws the conclusions.

II. SYSTEM OVERVIEW
A. MIMO Radar Signal Preprocessing
An FMCW 2 × 2 MIMO radar system periodically transmits a chirp waveform $c_i(t')$, which can be expressed as [52]
$$c_i(t') = \exp\left\{ j\left[ 2\pi\left( f_0 t' + \frac{\gamma}{2} t'^2 \right) + \phi_i \right] \right\}, \quad 0 \le t' < T_{\mathrm{sw}} \tag{1}$$
where $i = 1, 2$. The symbol $\phi_i$ is the initial phase term, $f_0$ is the initial frequency, and $\gamma$ is the slope of the chirp waveform. The symbols $t'$ and $T_{\mathrm{sw}}$ in (1) are the fast time and duration of the chirp, respectively. We adopted a time division multiple access (TDMA) scheme, where the transmitter antenna $A^{\mathrm{Tx}}_i$ periodically transmits the chirp waveform $c_i(t')$ in separate time windows, which are defined as $(2n + i - 1)T_{\mathrm{sw}} \le t' < (2n + i)T_{\mathrm{sw}}$ for $n = 0, 1, \ldots$ and $i = 1, 2$. With the help of the Dirac delta function $\delta(\cdot)$, we can express the transmit signal $s_i(t', t)$ in terms of fast time $t'$ and slow time $t$ as [53]
$$s_i(t', t) = \sum_{n=0}^{\infty} c_i(t')\, \delta(t - T_{n,i}). \tag{2}$$
The symbol $T_{n,i}$ in (2) is the discrete slow time that depends on the chirp duration $T_{\mathrm{sw}}$ according to $T_{n,i} = (2n + i - 1)T_{\mathrm{sw}}$. For a 2 × 2 MIMO radar, the notation $A^{\mathrm{Tx}}_i$-$A^{\mathrm{Rx}}_k$ describes the wireless link between the transmitter antenna $A^{\mathrm{Tx}}_i$ and the receiver antenna $A^{\mathrm{Rx}}_k$. The transmit signal $s_i(t', t)$ interacts with $L$ stationary and nonstationary scatterers present in the wireless link $A^{\mathrm{Tx}}_i$-$A^{\mathrm{Rx}}_k$, where $i, k \in \{1, 2\}$. Let the symbols $d^{(l)}_{ik}$, $c_0$, and $\lambda$ denote the propagation distance of the $l$th scatterer, the speed of light, and the radar's wavelength, respectively. Then, the beat frequency $f^{(l)}_{b,ik}$ and the phase $\phi^{(l)}_{ik}$ of the $l$th scatterer are given by $f^{(l)}_{b,ik} = 2 d^{(l)}_{ik} \gamma / c_0$ and $\phi^{(l)}_{ik} = 4\pi d^{(l)}_{ik} / \lambda$, respectively, where $l = 1, 2, \ldots, L$. For the wireless link $A^{\mathrm{Tx}}_i$-$A^{\mathrm{Rx}}_k$ and the $l$th scatterer, the received beat signal $s^{(l)}_{b,ik}(t', t)$ can be expressed as [53]
$$s^{(l)}_{b,ik}(t', t) = \sum_{n=0}^{\infty} a^{(l)}_{ik} \exp\left[ j\left( 2\pi f^{(l)}_{b,ik} t' + \phi^{(l)}_{ik} \right) \right] \delta\left( t - T_{n,i} - \tau^{(l)}_{ik} \right) \tag{3}$$
where $a^{(l)}_{ik}$ is the gain, which is assumed to be constant within the radar's CPI. The propagation delay $\tau^{(l)}_{ik}$ in (3) is related to
the beat frequency $f^{(l)}_{b,ik}$ by $\tau^{(l)}_{ik} = f^{(l)}_{b,ik} / \gamma$. At the radar receiver, the composite beat signal $s_{b,ik}(t', t)$ is simply the sum of all $L$ beat signals, i.e.,
$$s_{b,ik}(t', t) = \sum_{l=1}^{L} s^{(l)}_{b,ik}(t', t). \tag{4}$$
We obtain the beat frequency function $S_{b,ik}(f_b, t)$ by computing the Fourier transform of the beat signal $s_{b,ik}(t', t)$ over the fast time $t'$, i.e., [54]
$$S_{b,ik}(f_b, t) = \int_{0}^{T_{\mathrm{sw}}} s_{b,ik}(t', t)\, e^{-j 2\pi f_b t'}\, dt' \tag{5}$$
where $f_b$ is the beat frequency. The beat frequency function $S_{b,ik}(f_b, t)$ in (5) further undergoes a short-time Fourier transform (STFT) over the slow time $t$. Subsequently, the square of the STFT results in the TV micro-Doppler signature $S_{ik}(f, t)$, which is given as
$$S_{ik}(f, t) = \left| \int_{0}^{f_{b,\max}} \int_{-\infty}^{\infty} S_{b,ik}(f_b, t'')\, W_r(t'' - t)\, e^{-j 2\pi f t''}\, dt''\, df_b \right|^2 \tag{6}$$
where $f$ represents the Doppler frequency, $f_{b,\max}$ is the maximum beat frequency, $t''$ is the running time, and $W_r(\cdot)$ is a window function, which is in our case a rectangular function with a width of $64T_{\mathrm{sw}}$. Finally, the TV radial velocity distribution $p_{ik}(v, t)$ is obtained from the TV micro-Doppler signature $S_{ik}(f, t)$ according to [53]
$$p_{ik}(v, t) = \frac{S_{ik}\left( \frac{2 f_0}{c_0} v, t \right)}{\int_{-\infty}^{\infty} S_{ik}\left( \frac{2 f_0}{c_0} v, t \right) dv} \tag{7}$$
where $v$ represents the radial velocity. Note that the human body is composed of body segments and each body segment contains several scatterers that reflect back the RF signals to the radar. Each scatterer on a human body segment has a unique TV radial velocity component due to its spatially distinct motion. The TV radial velocity distribution $p_{ik}(v, t)$ contains the radial velocity components from all the scatterers on the human body. We use the expression in (7) to obtain the TV radial velocity distribution $p_{ik}(v, t)$ of the recorded human activities. The TV radial velocity distribution $p_{ik}(v, t)$ is converted into an image in the time-velocity domain, which is basically an input feature map to the DCNN, as described in Section II-B.
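To make the last two preprocessing steps more tangible, the following minimal sketch computes one time slice of a radial velocity distribution from slow-time samples of a single scatterer. It is an illustration under stated assumptions, not the authors' implementation: the carrier frequency `F0` and chirp duration `T_SW` are placeholder values, the fast-time transform is skipped (the samples are taken as already integrated over the beat frequency), the slow-time step is set to $2T_{\mathrm{sw}}$ because each transmitter is active in every second chirp slot under the TDMA scheme, and the normalization is a discrete analogue in which the velocity bins sum to one. All function names are hypothetical.

```python
import cmath
import math

C0 = 3.0e8      # speed of light in m/s
F0 = 24.0e9     # assumed carrier frequency in Hz (illustrative value only)
T_SW = 1.0e-3   # assumed chirp duration in s (illustrative value only)
WINDOW = 64     # rectangular window of 64 chirps, as stated in the text


def slow_time_samples(velocity, num_chirps, t_step):
    """Phase history of one scatterer moving at a constant radial velocity."""
    doppler = 2.0 * F0 * velocity / C0
    return [cmath.exp(2j * math.pi * doppler * n * t_step)
            for n in range(num_chirps)]


def doppler_power(samples, t_step):
    """Squared-magnitude DFT of one window: (Doppler frequencies, powers)."""
    n_fft = len(samples)
    freqs, powers = [], []
    for k in range(-n_fft // 2, n_fft // 2):
        acc = sum(s * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, s in enumerate(samples))
        freqs.append(k / (n_fft * t_step))
        powers.append(abs(acc) ** 2)
    return freqs, powers


def radial_velocity_distribution(samples, t_step):
    """Map Doppler to velocity via v = f * c0 / (2 * f0) and normalize."""
    freqs, powers = doppler_power(samples, t_step)
    velocities = [f * C0 / (2.0 * F0) for f in freqs]
    total = sum(powers)
    return velocities, [p / total for p in powers]
```

Calling `radial_velocity_distribution(slow_time_samples(v, WINDOW, 2 * T_SW), 2 * T_SW)` returns the velocity axis and one normalized slice; sliding the window along the slow time would yield the full time-velocity image. In practice an FFT routine from a numerical library would replace the explicit DFT loop.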

B. Deep Learning
In this section, a supervised learning-based multiclass classification method is delineated. Assume a $d$-dimensional $m$th feature vector $\mathbf{x}_m$ that belongs to a feature space $\mathcal{X}$. This feature space $\mathcal{X}$ is a proper subset of the real coordinate space $\mathbb{R}^d$, meaning that $\mathbf{x}_m \in \mathcal{X} \subset \mathbb{R}^d$. For the total number of classes $C$, the $m$th label $\mathbf{y}_m$ is an element of a label space $\mathcal{Y}$, and the labeled dataset is $\mathcal{D} = \{(\mathbf{x}_m, \mathbf{y}_m)\}_{m=1}^{M}$, where $M$ is the total number of labeled training samples.
We aim to design a DCNN-based classifier function $C_f$ that maps the input feature space $\mathcal{X}$ into the label space $\mathcal{Y}$, i.e., $C_f : \mathcal{X} \to \mathcal{Y}$. An empirical risk $R_J(C_f)$ corresponding to the categorical cross-entropy loss function $J_{\mathrm{CCE}}$ and the classifier function $C_f$ is given as [55], [56]
$$R_J(C_f) = E_{\mathcal{D}}\{J_{\mathrm{CCE}}\} = -\frac{1}{M} \sum_{m=1}^{M} \sum_{c=1}^{C} y^c_m \log C^c_f(\mathbf{x}_m; \boldsymbol{\theta}) \tag{8}$$
where $E_{\mathcal{D}}\{\cdot\}$ denotes the expectation operator that is performed over the empirical distribution, which can either be the dataset $\mathcal{D}$ or a mini-batch from the dataset $\mathcal{D}$. In (8), the symbol $\boldsymbol{\theta}$ is a vector of trainable parameters defined as $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_L)$, where $L$ depends on the complexity of the classifier. The symbol $y^c_m$ in (8) is the $c$th entry of the $m$th one-hot encoded label vector $\mathbf{y}_m$, which means $y^c_m \in \{0, 1\}$ such that $\mathbf{1}^{\top} \mathbf{y}_m = 1 \ \forall\, m$, where $\mathbf{1}$ is a $C$-dimensional vector of ones, and $(\cdot)^{\top}$ is the transpose operator. The symbol $C^c_f$ represents the $c$th element of the classifier function $C_f$. We have used the softmax layer as an output layer of the deep neural network (DNN), thus $\sum_{c=1}^{C} C^c_f(\mathbf{x}_m; \boldsymbol{\theta}) = 1$. The trainable parameters of the vector $\boldsymbol{\theta}$ corresponding to the classifier function $C_f$ can be obtained by minimizing the empirical risk $R_J(C_f)$.
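As a small numerical illustration of the empirical risk, the sketch below evaluates the mean categorical cross-entropy of softmax outputs over a mini-batch. It assumes that the network produces one raw score (logit) per class; the function names are hypothetical and the code is not taken from the article.

```python
import math


def softmax(logits):
    """Numerically stable softmax; the outputs are positive and sum to one."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def empirical_risk(batch_logits, batch_one_hot):
    """Mean categorical cross-entropy over a dataset or mini-batch."""
    loss = 0.0
    for logits, one_hot in zip(batch_logits, batch_one_hot):
        probs = softmax(logits)
        # Only the entry of the true class contributes, since y is one-hot.
        loss -= sum(y * math.log(p) for y, p in zip(one_hot, probs))
    return loss / len(batch_logits)
```

For five classes and an uninformed classifier that outputs equal scores, the risk equals $\log 5 \approx 1.609$, which is a useful reference value when inspecting training curves.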
The learning process of the DCNN and DNN is the same, but in the case of the DCNN, the number of trainable parameters is drastically reduced. In a DCNN, convolutional layers are employed to generate the feature maps from their inputs by means of multiple learnable filters. Assume a total number of $Q$ filters in a convolutional layer; then the $m$th input feature map $\mathbf{x}_m$ is convolved with the $q$th filter. The $q$th filter is characterized by its trainable weight vector $\mathbf{w}_q$ and bias $b_q$. Then, the $q$th output $\mathbf{y}_q$ of the convolutional layer is given by
$$\mathbf{y}_q = \sigma(\mathbf{w}_q * \mathbf{x}_m + b_q), \quad q = 1, 2, \ldots, Q \tag{9}$$
where the symbol $*$ denotes the convolutional operator. The function $\sigma(\cdot)$ in (9) is a rectified linear unit (ReLU) activation function [57] formulated as $\sigma(x) = \max(0, x)$, which mitigates the problems of slow convergence and gradient vanishing [58].
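The operation of a single filter can be sketched in a few lines. The example below is a hedged illustration for one input channel without padding; like most deep learning frameworks, it slides the kernel without flipping it (cross-correlation), which is an assumption about the implementation rather than something stated in the article. The function names are hypothetical.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def conv2d_relu(image, kernel, bias=0.0, stride=1):
    """Apply one filter to a single-channel 2-D input, then the ReLU."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = (len(image) - k_rows) // stride + 1
    out_cols = (len(image[0]) - k_cols) // stride + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            acc = bias
            for i in range(k_rows):
                for j in range(k_cols):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(relu(acc))
        output.append(row)
    return output
```

A layer with $Q$ filters simply repeats this computation $Q$ times with different weights and biases, producing $Q$ output feature maps.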
Pooling layers are generally utilized as an abstraction and downsampling tool to progressively reduce the spatial size and redundancies of the extracted feature maps to increase the network's computational efficiency. Moreover, dropout layers are added to the network to improve the network generalizability and to avoid the overfitting problem [59]. After several convolutional layers, the feature maps are flattened before feeding them to the fully connected dense layers or multilayer perceptron (MLP) layers.
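In this research, we use a stochastic optimization technique known as adaptive moment estimation (Adam) [60] to optimize or train the parameters of the vector $\boldsymbol{\theta}$. The Adam algorithm applies adaptive learning rates that are based on the estimates of the first-order moment $\mathbf{m}_\kappa$ and second-order moment $\mathbf{v}_\kappa$ of the gradient $\mathbf{g}_\kappa$ according to
$$\mathbf{m}_\kappa = \beta_1 \mathbf{m}_{\kappa-1} + (1 - \beta_1)\, \mathbf{g}_\kappa \tag{10}$$
and
$$\mathbf{v}_\kappa = \beta_2 \mathbf{v}_{\kappa-1} + (1 - \beta_2)\, \mathbf{g}_\kappa^2 \tag{11}$$
where the symbol $\kappa$ denotes the iteration number, and the decay factors are denoted by $\beta_1$ and $\beta_2$. The gradient $\mathbf{g}_\kappa$ in (10) and (11) is the gradient of the stochastic objective function $f(\boldsymbol{\theta}_\kappa) = \min_{\boldsymbol{\theta}_\kappa} R_J(C_f)$. Note that in the Adam algorithm, element-wise operations are adopted for all the vectors $\mathbf{m}_\kappa$, $\mathbf{v}_\kappa$, $\mathbf{g}_\kappa$, and $\boldsymbol{\theta}_\kappa$. Additionally, to counteract the initialization bias of the moments, i.e., to avoid the moments being biased toward zero, Kingma and Ba [60] suggested that the first- and second-order moments can be rectified as $\hat{\mathbf{m}}_\kappa = \mathbf{m}_\kappa / (1 - \beta_1^\kappa)$ and $\hat{\mathbf{v}}_\kappa = \mathbf{v}_\kappa / (1 - \beta_2^\kappa)$, respectively. Then, for $\alpha_\kappa$ being the learning rate and $\epsilon$ a small constant, the $\ell$th parameter of the vector $\boldsymbol{\theta}_\kappa$ at the $\kappa$th iteration can be updated as [60]
$$\theta^{\ell}_{\kappa} = \theta^{\ell}_{\kappa-1} - \alpha_\kappa \frac{\hat{m}^{\ell}_{\kappa}}{\sqrt{\hat{v}^{\ell}_{\kappa}} + \epsilon} \tag{12}$$
where $\ell = 1, 2, \ldots, L$. By using the Adam optimizer delineated in this section, we perform the parameter optimization of our DCNN-based classifiers (see Section V), where our objective is the minimization of the empirical risk $R_J(C_f)$ as defined in (8).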
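The Adam recursion with bias correction can be written compactly as a single update step. The sketch below is a generic illustration; the default hyperparameters are the values recommended by Kingma and Ba and are assumptions here, since the settings used for the classifiers are not given in this section. The helper `minimize_quadratic` is a hypothetical toy objective.

```python
import math


def adam_step(theta, grad, m, v, kappa,
              alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One element-wise Adam update at iteration kappa (kappa starts at 1)."""
    new_theta, new_m, new_v = [], [], []
    for th, g, m_prev, v_prev in zip(theta, grad, m, v):
        m_k = beta1 * m_prev + (1.0 - beta1) * g        # first-order moment
        v_k = beta2 * v_prev + (1.0 - beta2) * g * g    # second-order moment
        m_hat = m_k / (1.0 - beta1 ** kappa)            # bias correction
        v_hat = v_k / (1.0 - beta2 ** kappa)
        new_theta.append(th - alpha * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_k)
        new_v.append(v_k)
    return new_theta, new_m, new_v


def minimize_quadratic(steps, alpha=0.1):
    """Toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta."""
    theta, m, v = [1.0], [0.0], [0.0]
    for kappa in range(1, steps + 1):
        theta, m, v = adam_step(theta, [2.0 * theta[0]], m, v, kappa, alpha=alpha)
    return theta[0]
```

Because of the bias correction, the very first step has a magnitude of approximately the learning rate regardless of the gradient scale, which is why `minimize_quadratic(1)` moves the parameter from 1.0 to roughly 0.9.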

III. EXPERIMENTAL SETUP AND THE PROPOSED SOLUTION
In the following, we develop a more pragmatic and complex HAR system suitable for detecting human activities with motion in multiple directions. To this end, we utilize the multiperspective 2 × 2 distributed MIMO radar configuration [53] (see Fig. 1) to eventually realize a direction-independent HAR system. The human activities were monitored by using the 2 × 2 MIMO radar configuration shown in Fig. 1. This configuration is also used for comparison with conventional SISO radar-based HAR systems, and to find out whether the multiperspective MIMO radar configuration can mitigate their limitations. We deployed a software-defined radar system known as Ancortek SDR-KIT 2400T2R4, which is an FMCW mm-wave MIMO radar system, and used its transmitter-receiver antennas in a 2 × 2 configuration. The operating parameters of the Ancortek radar system are delineated in Table I.
For the proposed 2 × 2 MIMO radar-based HAR system, we arrange two radar subsystems, denoted by Radar$_1$ and Radar$_2$, where each radar subsystem has a collocated transmitter and a receiver antenna in a monostatic configuration. Radar$_1$ and Radar$_2$ are distributed in an indoor setting such that the 2 × 2 MIMO radar system renders a multiperspective illumination of a target as shown in Fig. 1, thereby having the potential to overcome the limitations that are posed by the monostatic SISO or monostatic MIMO radar systems in the context of HAR. We operate Radar$_1$ and Radar$_2$ in different time slots according to the TDMA scheme, where both radar subsystems have identical but independent radar signal preprocessing chains (see Section II). The radar signal preprocessing chains process the raw in-phase and quadrature (IQ) data recorded by the Ancortek MIMO radar system. For a human activity, the radar signal preprocessing block of Radar$_i$ generates the TV radial velocity distribution $p_{ii}(v, t)$ by using (7) for $i \in \{1, 2\}$.
We consider five different types of human activities, which are as follows: falling on a mattress on the floor, walking, standing up from a chair, sitting down on a chair, and picking up an object from the floor. For these activities, the measured TV radial velocity distributions $p_{11}(v, t)$ and $p_{22}(v, t)$ are shown in Figs. 2 and 3, where Scenarios 1, 2, and 3 denote the directions of human activities according to Fig. 1. In Scenario 1 (Scenario 2), the human motion is parallel to the boresight of Radar$_1$ (Radar$_2$), whereas in Scenario 3, the human movement is roughly at 45° to the boresights of both radar subsystems, as depicted in Fig. 1.
Radar$_1$ and Radar$_2$ complement each other such that when the activity direction changes from the x-axis to the y-axis of Fig. 1, the activity signature slowly vanishes from the radial velocity distribution $p_{11}(v, t)$ of Radar$_1$ and starts appearing in the radial velocity distribution $p_{22}(v, t)$ of Radar$_2$. For "Fall" activities performed in different directions, the measured radial velocity distributions $p_{11}(v, t)$ and $p_{22}(v, t)$ in the three scenarios vary significantly, as shown in Fig. 2. We can see from Figs. 2 and 3 that Radar$_1$ and Radar$_2$ are unable to acquire optimal human activity signatures in Scenarios 2 and 1, respectively. The suboptimal human activity signatures contribute toward the poor classification performance of a SISO radar-based direction-independent HAR system (see Section V-B). Therefore, analogous to a monostatic SISO or monostatic MIMO radar case, a single radial velocity distribution either from Radar$_1$ or Radar$_2$ cannot completely portray a human activity and would not be sufficient for the realization of a direction-independent HAR system. Additionally, Fig. 3 shows how the TV radial velocity distribution $p_{ii}(v, t)$ changes with the type of human activity. This figure demonstrates that the human activity signature or TV radial velocity distribution $p_{ii}(v, t)$ depends on the type as well as the direction of the human activity. To see the radar signatures corresponding to different multidirectional human activities, please refer to Figs. 12-15 in the Appendix.
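The complementary behavior of the two radar subsystems follows from simple geometry. The sketch below projects a velocity vector in the xy-plane onto the two boresight directions, under the simplifying assumptions that the boresight of Radar$_1$ coincides with the x-axis, that of Radar$_2$ with the y-axis, and that the target stays close to both boresights so that the radial velocity is approximately the projection. The function name is hypothetical, and sign conventions (toward or away from the radar) are ignored.

```python
import math


def radial_speeds(speed, direction_deg):
    """Project a planar velocity onto the two assumed boresight axes.

    Returns (radial speed seen by Radar 1, radial speed seen by Radar 2)
    for a motion direction measured from the x-axis in degrees.
    """
    angle = math.radians(direction_deg)
    return speed * math.cos(angle), speed * math.sin(angle)
```

With these assumptions, a motion direction of 0° (Scenario 1) is fully visible to Radar$_1$ and invisible to Radar$_2$, a direction of 90° (Scenario 2) reverses the roles, and a direction of 45° (Scenario 3) gives each subsystem about 71% of the speed.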
As our 2 × 2 distributed MIMO radar-based HAR system generates two distinct activity signatures from two different aspect angles for each human activity, we must fuse the information from the two activity signatures in order to accurately classify the human activity regardless of the direction of motion. In this research, we have implemented a fusion technique at the feature level. For this purpose, the features of each radar subsystem, Radar$_1$ and Radar$_2$, are extracted independently and automatically by several convolutional layers from the radial velocity distributions $p_{11}(v, t)$ and $p_{22}(v, t)$, respectively. The extracted features from the radar subsystems are then merged by the concatenation layer (see Section V-C).
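Conceptually, feature-level fusion by concatenation amounts to flattening the feature maps of both branches and joining them into one vector that is passed on to the dense layers. The following sketch illustrates only this step; the nested-list representation of feature maps and the function names are assumptions made for the example.

```python
def flatten(feature_maps):
    """Flatten a list of 2-D feature maps into a single feature vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def fuse_features(features_radar1, features_radar2):
    """Concatenate the flattened features of the two radar branches."""
    return flatten(features_radar1) + flatten(features_radar2)
```

The fused vector keeps the features of each perspective intact, so the subsequent classifier can learn how strongly to rely on either branch for a given direction of motion.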
In Section V, we show how the classification performance of the monostatic radar-based HAR system deteriorates if the human activities take place in the 2-D xy-plane, which is depicted by the 3 × 3 grid in Fig. 1. We also explain the design of the proposed 2 × 2 distributed MIMO radar-based HAR system and show how it overcomes the above constraints on the direction of human activity motion. Compared to the SISO radar-based direction-independent HAR system, we see that the proposed 2 × 2 distributed MIMO radar-based direction-independent HAR system significantly improves the classification accuracy.

IV. DATA COLLECTION
A comprehensive measurement campaign was carried out in an indoor environment consisting of fixed objects, such as chairs, tables, cabinets, computers, and other electronic items. The five types of activities were performed by six different persons, one of whom was female. The human activities were carried out in several directions, with different speeds, and in different locations. For instance, the falling activities were performed in six different directions as depicted by the scenario markers in Fig. 1. Specifically, the falling activities were executed in the following directions: from $(x_3, y_2)$ to $(x_1, y_2)$, from $(x_1, y_2)$ to $(x_3, y_2)$, from $(x_2, y_3)$ to $(x_2, y_1)$, from $(x_2, y_1)$ to $(x_2, y_3)$, from $(x_3, y_3)$ to $(x_1, y_1)$, and from $(x_1, y_1)$ to $(x_3, y_3)$. The walking activities were performed and recorded in a similar fashion. The other human activities (standing up, sitting down, and picking up an object) were performed accordingly.
In this article, the term HAR$^{(2\text{-D})}$ is coined to represent the dataset recorded by the 2 × 2 MIMO radar system, where the superscript "(2-D)" refers to the human movement in the 2-D horizontal xy-plane in Fig. 1. Therefore, for the direction-independent HAR task, we define FEN$^{(2\text{-D})}$, SISO$^{(2\text{-D})}$, and MIMO$^{(2\text{-D})}$ as a feature extraction network, a SISO radar-based HAR classifier, and a MIMO radar-based HAR classifier, respectively. On the other hand, to denote the human movement along the 1-D x-axis of the 3 × 3 grid in Fig. 1, we use the superscript "(1-D)." Thus, for the conventional 1-D HAR task, where the human movement is restricted to Scenario 1 in Fig. 1, we define HAR$^{(1\text{-D})}$, FEN$^{(1\text{-D})}$, and SISO$^{(1\text{-D})}$ as a dataset recorded by Radar$_1$, a feature extraction network, and a conventional SISO radar-based HAR classifier, respectively.
We need the HAR$^{(2\text{-D})}$ dataset to realize the SISO$^{(2\text{-D})}$ and MIMO$^{(2\text{-D})}$ HAR systems, whereas the HAR$^{(1\text{-D})}$ dataset is required for the SISO$^{(1\text{-D})}$ HAR system. The details of the HAR$^{(2\text{-D})}$ dataset related to the measurement campaign based on the proposed 2 × 2 MIMO radar framework are shown in Table II. As listed in Table II, we recorded a total of 1364 activities. For each activity, we generated the TV radial velocity distributions $p_{11}(v, t)$ and $p_{22}(v, t)$ corresponding to the radar subsystems Radar$_1$ and Radar$_2$, respectively. On the other hand, Table III shows the HAR$^{(1\text{-D})}$ dataset, which is a proper subset of the HAR$^{(2\text{-D})}$ dataset, i.e., HAR$^{(1\text{-D})}$ ⊂ HAR$^{(2\text{-D})}$. Note that the HAR$^{(1\text{-D})}$ dataset only contains those activities of the HAR$^{(2\text{-D})}$ dataset that were performed parallel to the boresight of Radar$_1$. To implement a conventional monostatic radar-based HAR system (see Section V-A), we use only the TV radial velocity distributions $p_{11}(v, t)$ corresponding to the recorded human activities of the HAR$^{(1\text{-D})}$ dataset. Each human activity trial was recorded for 10 s. The persons were told to maintain the initial and the final poses before and after performing the activity. Though each activity trial was recorded for 10 s, the actual duration of the activity was only 2-5 s, depending on the type of the activity and the speed at which the activity was carried out. We applied the active segment detection (ASD) [61] approach to the high-pass filtered in-phase component of the raw activity data to automatically extract an active segment, i.e., the section of the raw activity data corresponding to the actual duration of the activity. The ASD marks the start and end points of the activity by monitoring the variance of the filtered in-phase component of the raw activity data. The identified markers are used to extract active segments from the raw IQ activity data of Radar$_1$ and Radar$_2$. Thereafter, we applied radar signal processing techniques (see Section V-C) to compute the TV radial velocity distributions $p_{11}(v, t)$ and $p_{22}(v, t)$ as given in (7).
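A variance-based segment detector of the kind described above can be sketched as follows. This is a simplified stand-in, not the ASD algorithm of [61]: the sliding-window length and the variance threshold are hypothetical parameters, and the input is assumed to be the already high-pass filtered in-phase component given as a list of samples.

```python
from statistics import pvariance


def active_segment(signal, window=50, threshold=0.01):
    """Return (start, end) sample indices of the active segment, or None.

    A window is considered active if its variance exceeds the threshold;
    the segment spans from the first to the last active window.
    """
    active = [start for start in range(0, len(signal) - window + 1)
              if pvariance(signal[start:start + window]) > threshold]
    if not active:
        return None
    return active[0], active[-1] + window
```

Because a window is flagged as soon as it overlaps the activity, the returned segment is slightly wider than the true activity (by at most one window length on each side), which is usually acceptable when the initial and final poses are held still.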
To demonstrate the utility and effectiveness of our proposed multiperspective distributed MIMO radar approach, we develop three different types of classifiers or HAR systems. First, we develop a SISO$^{(1\text{-D})}$ HAR system based on a monostatic SISO radar configuration (see Section V-A). As conventional monostatic radar-based HAR systems only consider human activities performed along the radar boresight, SISO$^{(1\text{-D})}$ uses the HAR$^{(1\text{-D})}$ dataset for training and testing purposes. Second, to highlight how the classification performance of a HAR system deteriorates by the introduction of different movement directions, we developed a SISO radar-based direction-independent HAR system denoted as SISO$^{(2\text{-D})}$ (see Section V-B). Unlike SISO$^{(1\text{-D})}$, the SISO$^{(2\text{-D})}$ HAR system makes use of the HAR$^{(2\text{-D})}$ dataset for training and testing purposes because SISO$^{(2\text{-D})}$ is designed to classify human activities in multiple directions of motion. Lastly, to significantly improve the classification performance of the SISO$^{(2\text{-D})}$ HAR system, we also developed a 2 × 2 distributed MIMO radar-based direction-independent HAR system denoted as MIMO$^{(2\text{-D})}$ (see Section V-C). Analogous to the SISO$^{(2\text{-D})}$ HAR system, the proposed MIMO$^{(2\text{-D})}$ HAR system uses the HAR$^{(2\text{-D})}$ dataset for training and testing purposes, because MIMO$^{(2\text{-D})}$ also considers the classification of human activities in multiple directions.
In this work, the recorded data from Persons 1 and 2 were divided into training and validation datasets and used for the training phase of the DCNN-based SISO (1-D), SISO (2-D), and MIMO (2-D) classifiers. Of this data, 80% was used to train the classifiers, and 20% was used for validation. The recorded data from the rest of the participants (Persons 3, 4, 5, and 6) were reserved to test the trained classifiers or HAR systems. In Sections V-A to V-C, we elucidate the design and development of the SISO (1-D), SISO (2-D), and MIMO (2-D) HAR systems, respectively, along with their results and discussions.
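The subject-wise partitioning described above can be sketched as follows. The record layout (a dictionary with a `person` key), the random shuffling, and the seed are assumptions; the text does not state how the 80/20 split within Persons 1 and 2 was drawn.

```python
import random

def split_by_person(records, train_persons=(1, 2), val_fraction=0.2, seed=0):
    """Split recordings so that no test person appears in training or validation.

    records: list of dicts with at least a 'person' key (hypothetical layout).
    """
    dev = [r for r in records if r["person"] in train_persons]
    test = [r for r in records if r["person"] not in train_persons]
    rng = random.Random(seed)
    rng.shuffle(dev)                      # assumed: random 80/20 split
    n_val = round(val_fraction * len(dev))
    return dev[n_val:], dev[:n_val], test

# Toy example: 10 recordings per person for Persons 1-6
records = [{"person": p, "trial": k} for p in range(1, 7) for k in range(10)]
train, val, test = split_by_person(records)
```

Because the test set contains only persons never seen during training, the reported accuracies measure generalization to new subjects rather than to new trials of known subjects.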

V. SISO AND DISTRIBUTED MIMO RADAR-BASED HAR SYSTEMS
A. Conventional SISO Radar-Based HAR System
In this section, we describe the design of the SISO (1-D) HAR system, which is analogous to a conventional SISO radar-based HAR system. We show the classification performance of the SISO (1-D) HAR system while restricting the human motion parallel to the boresight of Radar 1. Thus, we consider the HAR (1-D) dataset in Table III for the SISO (1-D) HAR system. Recall that the HAR (1-D) dataset contains only the human activities that were carried out in front of Radar 1 in Scenario 1. For all recorded human activities listed in Table III, we generated the TV radial velocity distributions p 11 (v, t) using the data of Radar 1 and converted the preprocessed data to images of size 224 × 224 × 3. Each image representing a human activity is a color image (see Figs. 2 and 3) with 224 pixels in the horizontal and vertical dimensions, and the number 3 refers to the red, green, and blue (RGB) color channels.
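The conversion of a radial velocity distribution into a 224 × 224 × 3 image can be illustrated with a minimal sketch. The normalization, the nearest-neighbour resampling, and the blue-to-red colour ramp are assumptions; the colormap and resizing method actually used are not stated here, and the random array is only a placeholder for p 11 (v, t).

```python
import numpy as np

def to_rgb_image(p, size=224):
    """Map a 2-D velocity-time distribution p[v, t] to a size x size x 3 uint8 image."""
    p = np.asarray(p, dtype=float)
    # Normalize to [0, 1]; a constant input maps to all zeros
    span = p.max() - p.min()
    g = (p - p.min()) / span if span > 0 else np.zeros_like(p)
    # Nearest-neighbour resampling onto a size x size grid
    rows = np.arange(size) * p.shape[0] // size
    cols = np.arange(size) * p.shape[1] // size
    g = g[np.ix_(rows, cols)]
    # Simple blue-to-red colour ramp (illustrative stand-in for the real colormap)
    rgb = np.stack([g, 1.0 - np.abs(2.0 * g - 1.0), 1.0 - g], axis=-1)
    return np.round(255 * rgb).astype(np.uint8)

p = np.random.default_rng(1).random((128, 300))   # placeholder distribution
img = to_rgb_image(p)
```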
The images of the radial velocity distributions p 11 (v, t) are used as input feature maps for the feature extraction network FEN (1-D) as depicted in Fig. 4. We can see from Fig. 4 that the first, second, and third convolutional layers of FEN (1-D) contain 32, 48, and 64 filter channels, respectively. The dimension of each 2-D learnable filter or kernel, also commonly known as kernel dimension k d, is 6 × 6 pixels. For each convolutional layer of the SISO (1-D) network, we set the stride parameter to 1 so that the kernels are moved or strode by one pixel at a time. To avoid the problem of overfitting, we used L2 regularization [62] to penalize and eventually eliminate the spike-like weight vectors. The problems of slow convergence and vanishing gradients were mitigated by using the ReLU activation function on the convolutional layers [58]. Furthermore, each convolutional layer in Fig. 4 is followed by a max-pool layer and a dropout layer. The max-pool layer is of the order 2 × 2, which downsamples the output of the convolutional layer by a factor of 2. Each max-pool layer is followed by a dropout layer with a dropout rate of 15%. Finally, all the features that are generated by FEN (1-D) are flattened before feeding them to the fully connected layers.
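To make the layer arithmetic of FEN (1-D) concrete, the following sketch traces the feature-map size through the three convolution and max-pool stages. The padding mode is not specified in the text, so both common options ("valid" and "same") are shown as assumptions, and the pooling is assumed to use a stride of 2 with flooring.

```python
def conv_out(n, k, stride=1, padding="valid"):
    """Spatial output size of a convolution along one axis."""
    if padding == "same":
        return -(-n // stride)          # ceil(n / stride)
    return (n - k) // stride + 1

def fen_shapes(n=224, filters=(32, 48, 64), k=6, pool=2, padding="valid"):
    """Trace the feature-map shape through conv -> max-pool stages."""
    shapes = []
    for f in filters:
        n = conv_out(n, k, 1, padding)  # 6 x 6 kernel, stride 1
        n = n // pool                   # 2 x 2 max-pooling (assumed stride 2)
        shapes.append((n, n, f))
    return shapes

valid = fen_shapes(padding="valid")     # [(109, 109, 32), (52, 52, 48), (23, 23, 64)]
same = fen_shapes(padding="same")       # [(112, 112, 32), (56, 56, 48), (28, 28, 64)]
flat_valid = valid[-1][0] * valid[-1][1] * valid[-1][2]
flat_same = same[-1][0] * same[-1][1] * same[-1][2]
```

Under these assumptions, the flattened feature vector fed to the fully connected layers would have 33856 ("valid") or 50176 ("same") elements. Dropout does not change the shape and is therefore omitted.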
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
The DCNN-based SISO (1-D) classifier is depicted in Fig. 5, where FEN (1-D) generates features from the input feature maps or, equivalently, the TV radial velocity distribution p 11 (v, t). Then, the extracted features pass through two fully connected layers of the order 256 × 1 and 128 × 1. As we are classifying five different types of human activities, the second-to-last fully connected layer is followed by an output layer of order 5 × 1 with the softmax activation function that converts the logits computed by the network into probabilities. To train the SISO (1-D) classifier, the HAR (1-D) dataset (see Table III) is divided into training, validation, and testing datasets. The training and validation data account for 65.6% of the total data and belong to Persons 1 and 2, while the test data account for 34.4% of the HAR (1-D) dataset and belong to Persons 3, 4, 5, and 6.
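The role of the softmax output layer, together with the categorical cross-entropy loss used during training, can be illustrated with a short sketch. The logits are made-up numbers, and the class ordering is illustrative: only the positions of "Fall" (first) and "Stand" (third) can be inferred from the confusion-matrix discussion.

```python
import numpy as np

def softmax(logits):
    """Convert logits into class probabilities along the last axis."""
    z = logits - np.max(logits, axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def categorical_cross_entropy(probs, onehot, eps=1e-12):
    """Mean categorical cross-entropy over a batch of one-hot labels."""
    return float(-np.mean(np.sum(onehot * np.log(probs + eps), axis=-1)))

classes = ["Fall", "Walk", "Stand", "Sit", "Pick"]   # illustrative ordering
logits = np.array([[2.0, 0.5, 0.1, -1.0, 0.3]])      # hypothetical network output
probs = softmax(logits)
pred = classes[int(np.argmax(probs))]
loss = categorical_cross_entropy(probs, np.eye(5)[[0]])   # true class: "Fall"
```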
In the training phase of the SISO (1-D) HAR system, we used the Adam optimizer to minimize the empirical risk R J (C f) in (8) corresponding to the categorical cross-entropy loss function J CCE. Thus, the weights and biases of the DCNN-based SISO (1-D) classifier were optimized by using the Adam optimizer and the examples from the HAR (1-D) dataset. The default values of the decay factors or forgetting factors in (10) and (11) are equal to β 1 = 0.9 and β 2 = 0.999, respectively. In order to prevent division by 0 in (12), the value of ϵ was set to 10^−8. A batch size of 32 was adopted in the training phase of the SISO (1-D) classifier. Note that the parameter optimization or training of the three classifiers, i.e., SISO (1-D), SISO (2-D), and MIMO (2-D), was performed in the same way with the same values for the network hyperparameters. For all three classifiers, the training history is summarized by the training loss, training accuracy, validation loss, and validation accuracy curves in Fig. 10. During the training phase, which spans 100 epochs, there is no evidence of overfitting of the SISO (1-D) classifier (see Fig. 10).
Fig. 6. Confusion matrix of the results obtained by the SISO (1-D) HAR system. The first five entries of the last row and last column show the precision and recall, respectively, whereas the last entry highlighted in dark gray shows the overall accuracy.
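For reference, one parameter update of the Adam optimizer with the stated values β 1 = 0.9, β 2 = 0.999, and ϵ = 10^−8 can be sketched as below. This follows the standard Adam formulation; the learning rate and the toy quadratic objective are assumptions, since neither is given in this section.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based iteration counter."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad**2     # second-moment estimate
    m_hat = m / (1 - beta1**t)                # bias correction
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)   # eps prevents division by 0
    return w, m, v

# Toy problem (not the HAR loss): minimize f(w) = ||w||^2, whose gradient is 2w
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.01)   # lr is a hypothetical value
```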
We use a confusion matrix shown in Fig. 6 to summarize and quantitatively assess the overall performance of the trained DCNN-based SISO (1-D) classifier. The human activity classification performance of SISO (1-D) was evaluated using test examples from the HAR (1-D) dataset. On the y-axis of the confusion matrix, we have the true class of an activity, and the x-axis shows the predicted class of an activity. Thus, for the first five rows and columns of the confusion matrix in Fig. 6, the diagonal entries show the number of correctly classified human activities, while the nondiagonal entries show the number of misclassified human activities. For example, the first column of the third row shows that a "Stand" activity has been incorrectly predicted or misclassified as a "Fall" activity. Moreover, the first five entries of the last row and last column of the confusion matrix show the precision and recall [63], respectively. Thus, we can see from Fig. 6 that the walking activity has a 100% recall and a precision of 96.88%. Most importantly, the overall accuracy of the SISO (1-D) classifier is 97.28%, which is indicated by the white color of the sixth entry in the last row and last column of the confusion matrix. It should be noted that using a complex network architecture (FEN (2-D)) for a smaller dataset (HAR (1-D)) can lead to overfitting and reduced generalizability. When we conducted experiments by changing the structure of FEN (1-D) to FEN (2-D) for the SISO (1-D) HAR system, as expected, we observed a small decline in the accuracy of the SISO (1-D) classifier, which dropped from 97.28% to 96.60%.
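The way precision, recall, and overall accuracy follow from a confusion matrix with true classes on the rows and predicted classes on the columns can be sketched as follows. The 3 × 3 matrix is a hypothetical example, not the data of Fig. 6.

```python
import numpy as np

def summarize(cm):
    """Per-class precision and recall plus overall accuracy.

    cm[i, j] = number of examples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                    # correctly classified examples
    precision = tp / cm.sum(axis=0)     # column sums: all predictions of a class
    recall = tp / cm.sum(axis=1)        # row sums: all true examples of a class
    accuracy = tp.sum() / cm.sum()
    return precision, recall, accuracy

# Hypothetical 3-class confusion matrix
cm = [[10, 0, 0],
      [1, 8, 1],
      [0, 2, 8]]
precision, recall, accuracy = summarize(cm)
```

In this toy case, the first class has a recall of 100% but a precision below 100%, because one example of the second class was predicted as the first. This is the same pattern as reported above for the walking activity.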
In this section, we looked into a conventional SISO radar-based HAR system denoted as SISO (1-D) that demonstrated a good classification performance (see Fig. 6). The classification performance of the SISO (1-D) classifier is comparable to state-of-the-art HAR systems. Analogous to the SISO (1-D) classifier or HAR system, most modern HAR systems that are based on either radar or Wi-Fi data are able to classify basic human activities with classification accuracies above 90% [24], [43], [64]. However, in these conventional monostatic radar-based HAR systems, the human subjects' movements are limited to Scenario 1. In Section V-B, we extend the HAR problem by considering human motion in the horizontal xy plane, and investigate how this affects the classification performance of a conventional SISO radar-based direction-independent HAR system.

B. Direction-Independent SISO Radar-Based HAR System
To provide a comprehensive analysis and ensure a fair comparison, we include the SISO (2-D) approach in this section, which is a direction-independent monostatic SISO radar-based HAR system. This inclusion allows us to highlight the limitations of the SISO (2-D) HAR system and emphasize the effectiveness of the proposed MIMO (2-D) HAR system in addressing diverse directions of human activities. By comparing their performance using the HAR (2-D) dataset, we aim to demonstrate the significance of the proposed direction-independent HAR framework. Hence, we use the HAR (2-D) dataset as shown in Table II to realize the SISO (2-D) HAR system. For all the recorded human activities listed in Table II, we generated the TV radial velocity distributions p ii (v, t) by using Radar i data, where i may be chosen as either 1 or 2. For brevity, we report only the results of the SISO (2-D) HAR system trained and tested with the data of Radar 1. The TV radial velocity distributions p 11 (v, t) representing human activity fingerprints were converted into images of the order 224 × 224 × 3 (see Figs. 2 and 3), which were used as input feature maps to the feature extraction network FEN (2-D) as depicted in Fig. 7.
The neural network architecture of the SISO (2-D) classifier is similar to that of the SISO (1-D) classifier except for a few modifications. For instance, the DCNN-based SISO (2-D) classifier uses FEN (2-D) instead of FEN (1-D) to extract features from the input feature maps or the TV radial velocity distribution p ii (v, t) as shown in Fig. 5. Compared with FEN (1-D) in Fig. 4, we see that FEN (2-D) in Fig. 7 has an additional convolutional layer, and each convolutional layer has a larger number of filters, i.e., 40, 60, 80, and 100. Consequently, the SISO (2-D) HAR system has a greater network complexity and capacity compared to the SISO (1-D) HAR system. Note that we needed a more complex DCNN classifier with higher network capacity because: 1) SISO (2-D) uses the larger HAR (2-D) dataset containing 1364 human activity fingerprints instead of 427 and 2) SISO (2-D) aims to classify human activities in different directions, taking into account more diverse, complex, and sometimes suboptimal human activity signatures.
Moreover, the kernel dimension k d of each 2-D learnable filter in FEN (2-D) is 5 × 5 as shown in Fig. 7. The rest of the specifications of the SISO (2-D) and SISO (1-D) classifiers are similar in terms of the max-pool layers, dropout layers, stride, batch size, and activation function. Analogous to SISO (1-D), SISO (2-D) uses L2 regularization to penalize and eliminate the peaky weight vectors to avoid the overfitting problem. Like SISO (1-D), SISO (2-D) uses the Adam optimizer to minimize the empirical risk R J (C f) in (8) corresponding to the categorical cross-entropy loss function J CCE. In order to train the SISO (2-D) classifier, the HAR (2-D) dataset is split into training, validation, and testing datasets. The training and validation data is 65.4% of the total data and belongs to Persons 1 and 2, whereas the testing data is 34.6% of the HAR (2-D) dataset belonging to Persons 3, 4, 5, and 6. As mentioned in Section V-A, the training history is summarized by the training loss, training accuracy, validation loss, and validation accuracy curves shown in Fig. 10 for all three classifiers. Note that for the SISO (2-D) classifier, there is no evidence of overfitting during the training phase that spans 100 epochs, as shown in Fig. 10.
Fig. 7. Feature extraction network FEN (2-D) designed for SISO (2-D) and MIMO (2-D) HAR systems.
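The difference in capacity between the two feature extraction networks can be made tangible by counting the trainable parameters of their convolutional layers. The sketch below assumes standard 2-D convolutions with one bias per filter and a three-channel (RGB) input; fully connected layers are not counted, so the figures are only indicative.

```python
def conv_params(filters, k, in_channels=3):
    """Total weights and biases of a stack of k x k convolutional layers."""
    total = 0
    c_in = in_channels
    for c_out in filters:
        total += (k * k * c_in + 1) * c_out   # kernel weights + one bias per filter
        c_in = c_out
    return total

fen_1d = conv_params((32, 48, 64), k=6)        # FEN (1-D): three layers, 6 x 6 kernels
fen_2d = conv_params((40, 60, 80, 100), k=5)   # FEN (2-D): four layers, 5 x 5 kernels
```

Under these assumptions, the convolutional part of FEN (2-D) has 383280 parameters compared with 169488 for FEN (1-D), i.e., somewhat more than twice as many, despite the smaller kernels.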
The classification performance of the SISO (2-D) direction-independent HAR system was evaluated using the test examples from the HAR (2-D) dataset. Recall that the SISO (2-D) HAR system is realized by using the data of Radar 1. To summarize and quantitatively assess the classification performance of the SISO (2-D) HAR system, we present a confusion matrix in Fig. 8. The true and predicted classes of a human activity are shown on the y-axis and x-axis of the confusion matrix, respectively. The confusion matrix in Fig. 8 shows that the overall classification performance of the SISO (2-D) HAR system has dropped significantly to only 88.98%. On a partially unrelated note and without going into too much detail, we would also like to mention that Radar 2 provides relatively poor data quality due to the cross-channel interference problem [65]. Solving the cross-channel interference problem requires the deployment of longer RF cables (see Table I), which cause a higher attenuation of the received signal. For this reason, a SISO (2-D) direction-independent HAR system realized by using only the data of Radar 2 provided an overall classification accuracy of just 83.05%.
Looking at the nondiagonal entries of the confusion matrix in Fig. 8, we see numerous misclassified human activities, e.g., the "Pick" activity was misclassified 15 times as the "Stand" activity by the SISO (2-D) HAR system. Therefore, the worst precision of the system is 76.34% corresponding to the "Stand" activity, and the worst recall is observed as 80.19% for the "Pick" activity. Interestingly, the precision and recall are 100% for the "Fall" activity, which implies that the SISO (2-D) HAR system learned to classify the human falling activity in all directions. Unfortunately, this is not true for the other four types of human activity, which have diverse and relatively complex radial velocity distributions that vary in different directions (see Figs. 2 and 3).
In this section, a direction-independent SISO radar-based HAR system (SISO (2-D)) was investigated, which showed significant degradation in its classification performance for human motion in different directions. For the simpler case of human motion, or when the human motion was restricted to Scenario 1 in Fig. 1, the overall classification accuracy of the SISO (1-D) HAR system was 97.28%. However, when we complicated the human motion by considering the different directions of motion, the classification accuracy dropped to 88.98% for the SISO (2-D) HAR system. The deterioration of the classification performance manifested by the SISO (2-D) HAR system comes from the physical limitations of monostatic SISO radar systems. These physical limitations of monostatic radar systems can be overcome by the 2 × 2 distributed MIMO radar configuration of Fig. 1 to eventually realize a direction-independent MIMO (2-D) HAR system. In Section V-C, we will see how the MIMO (2-D) HAR system ameliorates the shortcomings of the SISO (1-D) and SISO (2-D) HAR systems altogether.

C. 2 × 2 MIMO Radar-Based Direction-Independent HAR System
We now elucidate the design of our proposed 2 × 2 distributed MIMO radar-based direction-independent HAR system denoted as MIMO (2-D). Considering the different directions of human activities in the horizontal xy plane in Fig. 1, we use the HAR (2-D) dataset (see Table II) to eventually realize the MIMO (2-D) HAR system. In this section, we demonstrate that unlike the SISO (2-D) HAR system, our proposed MIMO (2-D) HAR system is able to recognize the human activities with a very good classification performance for the HAR (2-D) dataset. For all the recorded human activities listed in Table II, we computed the TV radial velocity distributions p 11 (v, t) and p 22 (v, t) by using Radar 1 and Radar 2 data, respectively. The TV radial velocity distributions p 11 (v, t) and p 22 (v, t) were converted separately into images of the order 224 × 224 × 3 (see Figs. 2 and 3), which served as input feature maps to the feature extraction network FEN (2-D) as depicted in Fig. 7.
Although the neural network architectures of the MIMO (2-D) and SISO (2-D) HAR systems are quite different in Fig. 9 and Fig. 5, respectively, the building blocks, hyperparameter values, and training processes of the two networks are very similar. For instance, the MIMO (2-D) and SISO (2-D) HAR systems use the same specifications related to kernel dimension k d, max-pool layers, dropout layers, stride, batch size, activation function, regularizer, and Adam optimizer (refer to Section V-B for more details). Moreover, the same feature extraction network FEN (2-D) in Fig. 7 has been adopted for the MIMO (2-D) and SISO (2-D) HAR systems. However, unlike the SISO (2-D) HAR system, the MIMO (2-D) HAR system uses two identical feature extraction blocks as depicted in Fig. 9 for the TV radial velocity distributions p 11 (v, t) and p 22 (v, t). The two FEN (2-D) blocks of the MIMO (2-D) HAR system extract unique features automatically and independently from the two radial velocity distributions p 11 (v, t) and p 22 (v, t). In Fig. 9, we can see that these features are then merged using a concatenation layer, which is followed by MLP and softmax layers to eventually classify the human activities.
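The feature-level fusion step can be illustrated with a small NumPy sketch in which two feature vectors, standing in for the outputs of the two FEN (2-D) blocks, are concatenated and passed through an MLP with a softmax output. The feature length, the MLP widths (taken here to be 256 and 128, as for the SISO (1-D) classifier), the ReLU activations in the dense layers, and the random weights are all assumptions made for illustration; the section does not specify them for MIMO (2-D).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def init(n_in, n_out):
    """Random (untrained) dense-layer weights and zero biases."""
    return rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in), np.zeros(n_out)

def fuse_and_classify(f1, f2, params):
    """Concatenate per-radar features, then apply an MLP and a softmax layer."""
    x = np.concatenate([f1, f2], axis=-1)      # feature-level fusion
    for w, b in params[:-1]:
        x = np.maximum(x @ w + b, 0.0)         # dense layer with ReLU (assumed)
    w, b = params[-1]
    return softmax(x @ w + b)                  # probabilities over five activities

n_feat = 64                                    # hypothetical flattened feature length
f1 = rng.standard_normal((4, n_feat))          # stand-in for features of p 11 (v, t)
f2 = rng.standard_normal((4, n_feat))          # stand-in for features of p 22 (v, t)
params = [init(2 * n_feat, 256), init(256, 128), init(128, 5)]
probs = fuse_and_classify(f1, f2, params)
```

Fusing at the feature level lets the classifier weigh the two perspectives jointly, so that an activity that is weakly observed by one radar can still be recognized from the features of the other.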
Analogous to the SISO (2-D) HAR system, the MIMO (2-D) HAR system also uses the HAR (2-D) dataset for training and testing purposes. However, for the MIMO (2-D) HAR system, the main difference is that the activity fingerprints from both radar subsystems shown in Fig. 1 are simultaneously utilized to classify the human activities. In other words, for the classification of human activity, two distinct multiperspective radial velocity distributions p 11 (v, t) and p 22 (v, t) produced by Radar 1 and Radar 2, respectively, are processed at once by the MIMO (2-D) HAR system. Therefore, in the MIMO (2-D) HAR system, we utilized 2728 images or equivalently 1364 pairs of images corresponding to 1364 human activities of the HAR (2-D) dataset. The HAR (2-D) dataset was split into training, validation, and testing datasets, where the training and validation data was 65.4% of the total data belonging to Persons 1 and 2, and the testing data was 34.6% of the total data belonging to Persons 3, 4, 5, and 6. Recall that the training history is summarized by the training loss, training accuracy, validation loss, and validation accuracy curves as depicted in Fig. 10 for all three classifiers or HAR systems. Note that this figure does not reveal any signs of overfitting during the training phase of the MIMO (2-D) HAR system.
In Fig. 11, we present a confusion matrix to quantitatively assess the overall classification performance of the MIMO (2-D) direction-independent HAR system. The human activity classification performance of the MIMO (2-D) HAR system was evaluated over the test examples from the HAR (2-D) dataset. In the test examples, the number of falling activities is comparatively low because it is difficult to carry out a real-life "Fall" activity. Nevertheless, the train-test split ratio is roughly 77 : 23 for the "Fall" activity. In the confusion matrix in Fig. 11, the overall classification performance of the MIMO (2-D) direction-independent HAR system comes out to be 98.52%, which is a significant improvement over the classification accuracy of 88.98% achieved by the SISO (2-D) direction-independent HAR system. Looking at the nondiagonal entries of the confusion matrix in Fig. 11, we see only seven misclassified human activities. We can observe that the worst precision of the MIMO (2-D) HAR system is 95.54% corresponding to the "Stand" activity, and the worst recall is observed as 94.34% for the "Pick" activity. Note that the increase in the classification performance is basically due to the multiperspective illumination of the environment by the proposed 2 × 2 distributed MIMO radar-based HAR system.
Fig. 9. Architecture of the proposed MIMO (2-D) HAR system with two independent FEN (2-D) blocks to generate feature vectors that are fused by the concatenation layer for subsequent classification.
Fig. 10. Training history for the SISO (1-D), SISO (2-D), and MIMO (2-D) classifiers.
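As a consistency check, the reported accuracies can be related to the size of the test set with simple arithmetic. The test-set size and the SISO (2-D) error count below are inferred from the rounded percentages given in the text and are not stated explicitly in the source; only the seven MIMO (2-D) misclassifications are reported directly.

```python
def accuracy_pct(n_test, n_errors):
    """Overall accuracy in percent, rounded to two decimals."""
    return round(100.0 * (n_test - n_errors) / n_test, 2)

# Inferred test-set size: 34.6% of the 1364 recordings of the HAR (2-D) dataset
n_test_2d = round(0.346 * 1364)

# MIMO (2-D): seven misclassified activities (stated in the text)
mimo = accuracy_pct(n_test_2d, 7)

# SISO (2-D): error count inferred from the reported accuracy of 88.98%
siso_errors = round(n_test_2d * (1 - 0.8898))
siso = accuracy_pct(n_test_2d, siso_errors)
```

With an inferred test set of 472 examples, seven errors give 98.52%, and 52 errors give 88.98%, which agrees with the reported figures.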
We addressed a HAR task in a complex situation, where we considered the human motion in the horizontal xy plane in Fig. 1. To mitigate the shortcomings of the SISO (2-D) HAR system in relation to the human activity direction, we illuminated the subject from different aspect angles by using the proposed 2 × 2 MIMO radar-based direction-independent HAR system denoted as MIMO (2-D), which demonstrated a remarkably good classification performance as summarized by the confusion matrix in Fig. 11. As evident from the classification performance of the MIMO (2-D) HAR system, the physical limitations of the monostatic radar systems were successfully mitigated by the multiperspective 2 × 2 distributed MIMO radar configuration. Therefore, by addressing and rectifying the fundamental radar problem at the physical layer, we were able to design a radar-based HAR system that was capable of recognizing human activities independent of their directions with a classification accuracy close to 100%.

VI. CONCLUSION
In this article, we analyzed and resolved a crucial physical layer problem of state-of-the-art monostatic SISO, SIMO, and MIMO radar-based HAR systems, which primarily arises due to the target's aspect angle. Thus, a more pragmatic and more complex HAR problem has been elucidated in this research in the context of RF sensing, where we improve the activity recognition task by considering multiple directions of human activities. A novel HAR dataset (HAR (2-D)) was recorded by using the proposed multiperspective 2 × 2 MIMO radar framework. We developed and analyzed three different HAR systems, denoted as SISO (1-D), SISO (2-D), and MIMO (2-D), by using our HAR (2-D) dataset and its sub-dataset HAR (1-D).
Analogous to most modern radar-based HAR systems, the SISO (1-D) HAR system was able to classify human activities with a classification accuracy of 97.28%. However, in this conventional monostatic radar-based HAR approach, the movement of the human subjects was restricted along the radar's boresight axis. By developing and analyzing the monostatic SISO (2-D) HAR system and considering the human activities taking place in the 2-D xy plane, we substantiated a significant deterioration in the classification performance from 97.28% to 88.98%. The deterioration of the classification performance manifested by the SISO (2-D) HAR system came from the inherent physical layer limitations of the monostatic SISO radar systems. To overcome these physical layer issues and drawbacks experienced by today's radar-based HAR systems, we utilized a multiperspective 2 × 2 distributed MIMO radar system to realize a direction-independent HAR system that was capable of recognizing human gross motor activities regardless of the aspect angle or direction of motion. To eradicate the limitations that emerge due to the direction of motion, feature level fusion was adopted in the DCNN-based MIMO (2-D) classifier, which virtually combines the target's information from different aspect angles.
For the HAR (2-D) dataset, it was shown that the proposed multiperspective MIMO (2-D) HAR system significantly outperforms the monostatic SISO (2-D) HAR system. Compared with the SISO (2-D) HAR system, the proposed MIMO (2-D) HAR system significantly improved the classification accuracy from 88.98% to 98.52%. Therefore, the physical layer limitations of the monostatic SISO radar-based HAR systems were successfully mitigated by the proposed MIMO (2-D) HAR system.
The MIMO (2-D) HAR system presented in this article paves a way forward toward actualizing a more realistic and more advanced radar-based HAR system. To further enhance the classification performance, we plan to use the bistatic components of the 2 × 2 MIMO radar system, which are the TV radial velocity distributions p 12 (v, t) and p 21 (v, t). For more aspect angle coverage and a more complex HAR problem, we plan to extend the fundamental distributed 2 × 2 MIMO radar system to a larger MIMO antenna configuration.

Manuscript received 19 June 2023; accepted 28 August 2023. Date of publication 6 September 2023; date of current version 16 October 2023. This work was supported by the Research Council of Norway through the CareWell Project under Grant 300638. The associate editor coordinating the review of this article and approving it for publication was Prof. Pierluigi Salvo Rossi. (Corresponding author: Sahil Waqar.)

Fig. 1. Measurement setup of the proposed 2 × 2 MIMO radar-based HAR system consisting of Radar 1 and Radar 2, where Scenarios 1-3 characterize human activities in different directions.

Fig. 2. Images containing the heatmap of the measured radial velocity distributions p ii (v, t) of the "Fall" activity in three different scenarios, where each image has the radial velocity v on the y-axis ranging [−1.5, 1.5] m/s and time t on the x-axis spanning over 2-4 s.

Fig. 3. Images containing the heatmap of the measured radial velocity distributions p ii (v, t) of different human activities, where each image has the radial velocity v on the y-axis ranging [−1.5, 1.5] m/s and time t on the x-axis spanning over 3-5 s.

Fig. 8. Confusion matrix of the results obtained by the SISO (2-D) HAR system, where SISO (2-D) was trained and tested by using Radar 1 data. The first five entries of the last row and last column show the precision and recall, respectively, whereas the last entry shows the overall accuracy.

Fig. 11. Confusion matrix of the results obtained by the proposed MIMO (2-D) HAR system with an overall accuracy of 98.52%.

Fig. 12. Images containing the heatmap of the measured radial velocity distributions p ii (v, t) of the "Walk" activity in three different scenarios, where each image has the radial velocity v on the y-axis ranging [−1.5, 1.5] m/s and time t on the x-axis spanning over 3-5 s.

Fig. 13. Images containing the heatmap of the measured radial velocity distributions p ii (v, t) of the "Stand" activity in three different scenarios, where each image has the radial velocity v on the y-axis ranging [−1.5, 1.5] m/s and time t on the x-axis spanning over 2-3 s.

Fig. 14. Images containing the heatmap of the measured radial velocity distributions p ii (v, t) of the "Sit" activity in three different scenarios, where each image has the radial velocity v on the y-axis ranging [−1.5, 1.5] m/s and time t on the x-axis spanning over 2-3 s.

Fig. 15. Images containing the heatmap of the measured radial velocity distributions p ii (v, t) of the "Pick" activity in three different scenarios, where each image has the radial velocity v on the y-axis ranging [−1.5, 1.5] m/s and time t on the x-axis spanning over s.

TABLE I
2 × 2 MIMO RADAR SYSTEM PARAMETERS

TABLE III
HAR (1-D) DATA SUBSET RECORDED BY RADAR 1, WHERE THE DIRECTION OF MOTION OF THE HUMAN ACTIVITIES IS RESTRICTED TO MERELY SCENARIO 1