Detection Error Contaminated by Outliers to Classify Density Profiles Dependent on the Relative Speed Between a MIMO Sensor and a Human Hand

The detection of human hand intrusion is crucial for improving the productivity of human–robot collaboration. Recent safety-related sensors have adopted the concept of radar-based human presence detection, and some of them have already been commercialized. However, it has not yet been elucidated whether radars can sufficiently detect human hand intrusion and satisfy the required safety integrity level. In this study, we present outlier and error profiles obtained by detecting human hand intrusion for different motions and speeds of the human hand and radar sensor. Our experiments indicate that slow hand movements require further study from the viewpoint of safety. In addition, we suggest a new noise model that represents random noise and temporary outliers based on the recorded noise profiles of actual human participants. The obtained outlier and error profiles can be utilized as an outlier judgement criterion for radar-based sensing.


I. INTRODUCTION
Over the past few decades, significant progress has been made in the fields of robot automation and human-robot collaboration (HRC) [1]. This can be attributed to the recent trends of modern manufacturing that require agility and flexibility. Moreover, the development of human presence detection technologies allows the sharing of human and robot workspaces. Several studies have demonstrated that HRC can potentially help in maximizing the productivity of the manufacturing process in terms of floor space and time costs [2], [3], [4]. Safety is the main concern when adopting HRC in manufacturing sites, and robots can only be controlled via proper risk assessment [5], [6], [7]. Therefore, risk reduction (to an acceptable level) is required for satisfying the requirements of collaborative-operation modes based on safety standards.
The associate editor coordinating the review of this manuscript and approving it for publication was Wen-Sheng Zhao.
The field of HRC includes many situations in which a non-contact safety approach is required. Speed and separation monitoring (SSM) can be used as a safety method to ensure operator safety during HRC tasks [8], [9]. In SSM, human body parts must be detected using a suitable safety-related sensor to estimate the protective separation distance (PSD) [10]. In the framework of the IEC 61496 series, several types of electro-sensitive protective equipment (ESPE), such as light curtains, laser scanners, and vision sensors, have been discussed [11], [12], [13], [14]. In addition to these conventional safety-related sensors, radar sensors are currently being discussed by the IEC technical committee TC 44 as an addition to the series (IEC 61496-5). In this study, we focus on radars used as safety-related sensors and on the safety issues in human detection that must be urgently addressed.

A. CONTRIBUTIONS
This study aims to specify and suggest a new noise model that can represent the noise profiles observed when detecting a human hand using a radar sensor. This study focused on the potential factors that can reduce the performance of radars while detecting human hand intrusion.
The results indicate that the type and proportion of the obtained outliers differ with respect to the relative speed and motion of the human hand. The obtained outlier and error profiles are modeled and can be used as an outlier assessment criterion for human intrusion sensing using radar. Finally, based on a hand intrusion experiment involving ten participants, we suggest a new noise model that represents random noise and characteristic outliers generated by misdetection.

II. LITERATURE REVIEW
Although radars are more robust than laser scanners to environmental conditions such as moisture, dust, and vibration, typical radars have relatively low resolution and measurement reliability [15]. In addition, currently commercialized safety-related radar sensors are limited to the detection of human presence using a single-input single-output antenna [16]. In contrast, advanced forms of radar applications, such as target classification using thresholding methods, artificial intelligence, or multiple-input multiple-output (MIMO) antenna systems, are continuously being reported at a research level [17], [18], [19], [20], [21]. In particular, various efforts have been reported for human presence sensing to replace the conventional laser scanning approach by introducing the radar as an extrinsic sensor system [22], [23]. Moreover, radar applications such as sensor fusion, which includes radar and laser scanners, or simulation approaches for layout optimization have been proposed [24], [25]. Therefore, the productivity of HRC using radars can be improved if highly accurate human body detection (for example, the detection of a human hand) is performed.
However, it has not yet been elucidated whether the performance of radars is reliable enough to achieve the required safety integrity level or whether they can be applied for detecting human hand intrusion [2], [15], [26]. From the perspective of human detection and tracking, undesired background noise, known as "clutter," is uniformly distributed and can be sufficiently reduced using a constant false alarm rate method [27], [28]. However, in dynamic conditions, a constant false alarm rate method leads to incorrect detection, which must be alleviated to ensure safety [29]. Therefore, some studies have focused on pre-processing techniques that can compensate for misdetections, which lead to dangerous-side failures [30], [31].
Notably, the specific properties of outlier/error profiles generated by the misdetection of human hands using radars have not yet been reported from the viewpoint of safety. Furthermore, the error/outlier profiles of radars may be different from those of commercially available safety-related sensors. Therefore, in this study, the performance and error profiles of a radar sensor were analyzed during the detection of human hand intrusion. In particular, the potential factors that influence radar performance, such as the relative speed and motion of the human hand, and radar placement, were discussed.

III. POINT CLOUD PROCESSING ALGORITHM
In this section, human hand detection techniques are introduced based on point cloud sample processing. For human hand detection, the Euclidean clustering approach is presented for effectively extracting human hand intrusion samples. In addition, the principle of a multiple-input multiple-output frequency-modulated continuous-wave (MIMO-FMCW) radar is introduced.

A. RANGE FOURIER TRANSFORM
The distance and velocity between the radar and the object can be computed using linear chirp frequency transmission. Each transmitted chirp can be described by a carrier frequency ($f_c$), sweep sample time ($t_w$), or sweep frequency ($f_w = 1/t_w$). After a reflected chirp signal is received, the object range can be calculated by comparing it with the transmitted chirp profile. Here, $r$ denotes the range obtained using each chirp signal, and $c$ is the speed of light. Note that $f_b^+$ is the average beat frequency of the increment and decrement chirps, which compensates for the Doppler effect.
Similarly, the velocity can be obtained using the difference in the beat frequency. Here, $v$ denotes the velocity, and $\lambda_c$ is the wavelength of a chirp. Notably, $f_b^-$ denotes the difference in the beat frequency during the frequency modulation, which represents the Doppler frequency.
Finally, angular estimation can be performed using multiple receiver antennas via a simultaneous sampling approach. In other words, the angle of arrival (AoA), $\theta(t)$, can be estimated from $\omega(t)$, the phase difference between two neighboring patch antennas, and $d$, the preliminarily determined distance between each receiver antenna. Based on the obtained frequency samples, a range-Doppler map can be created, which can be converted into a point cloud format to recognize specific objects.
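For reference, the standard textbook relations for a triangular-chirp FMCW radar with a uniform receiver array can be sketched as follows. This is a sketch under stated assumptions rather than the exact expressions used in this study: the sweep bandwidth $B$ and the up-chirp and down-chirp beat frequencies $f_{\mathrm{up}}$ and $f_{\mathrm{down}}$ are symbols introduced here for illustration, $f_b^-$ is taken as half the beat-frequency difference so that it equals the Doppler frequency, and the sign of $v$ depends on the chosen convention.

```latex
\begin{align}
  f_b^{+} &= \tfrac{1}{2}\left(f_{\mathrm{up}} + f_{\mathrm{down}}\right), &
  f_b^{-} &= \tfrac{1}{2}\left(f_{\mathrm{down}} - f_{\mathrm{up}}\right), \\
  r &= \frac{c\, t_w}{2B}\, f_b^{+}, &
  v &= \frac{\lambda_c}{2}\, f_b^{-}, \\
  \theta(t) &= \sin^{-1}\!\left(\frac{\lambda_c\, \omega(t)}{2\pi d}\right).
\end{align}
```

With a receiver spacing of $d = \lambda_c/2$, the last relation reduces to $\theta(t) = \sin^{-1}(\omega(t)/\pi)$, which is unambiguous over $\pm 90^\circ$.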

B. POINT CLOUD AND CLUSTER PROCESSING
The raw measured point cloud obtained by the radar is usually discretely distributed [17]. In other words, it generally does not include sufficient information for human tracking or detection. Therefore, a process to associate the point cloud samples using the initial observation points is essential for the target estimation. Fig. 1 illustrates the preliminary point cloud and cluster processing for the detection of human hand intrusion. Notably, an input point cloud $P$ is defined as a set of $I$ points measured by the radar and is expressed as follows:
$$P = \{\, p_i = (x_i, y_i, z_i) \mid i = 1, \dots, I \,\}.$$
The first step for cluster detection involves the application of a passthrough filter to consider only the points $p_i$ included in the space of interest, that is, points satisfying $x_{min} < x_i < x_{max}$, $y_{min} < y_i < y_{max}$, and $z_{min} < z_i < z_{max}$. In particular, the point cloud $P^*$ belonging to the space of interest can be expressed as a subset of the input point cloud $P$ as follows:
$$P^* = \{\, p_i \in P \mid x_{min} < x_i < x_{max},\ y_{min} < y_i < y_{max},\ z_{min} < z_i < z_{max} \,\}.$$
Passthrough filtering thus ensures that only the object cluster points that belong to the space of interest are considered.
Next, point clusters are extracted using the hierarchical Euclidean clustering method [32]:
$$C = \{\, C_j \mid j = 1, \dots, J \,\},$$
where $J$ denotes the total number of clusters generated by hierarchical Euclidean clustering, and $C_j$ represents a cluster whose points do not overlap with those of any other cluster. Based on the Euclidean distance threshold criterion, $d^*$, every pair of clusters should satisfy the following condition to avoid overlap:
$$\min \lVert p_j - p_k \rVert_2 \ge d^*, \quad j \ne k,$$
where the sets of points $p_j, p_k \subset P$ belong to the point clusters $C_j$ and $C_k$, respectively. Finally, the cluster of interest, $C^*$, that might be a potential human cluster can be defined by the cluster size:
$$C^* = \{\, C_j \in C \mid N_{cmin} \le |C_j| \le N_{cmax} \,\},$$
where $N_{cmin}$ and $N_{cmax}$ represent the minimum and maximum numbers of points associated with the corresponding cluster, respectively.
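The three steps described above (passthrough filtering, Euclidean clustering, and selection by cluster size) can be illustrated with a minimal Python sketch. It is not the implementation used in this study: it uses a brute-force neighbor search instead of a spatial index, and the function names, example points, and bounds are hypothetical values chosen only for illustration.

```python
from collections import deque
from math import dist


def passthrough(points, bounds):
    """Keep only points strictly inside the axis-aligned space of interest."""
    (x0, x1), (y0, y1), (z0, z1) = bounds
    return [p for p in points
            if x0 < p[0] < x1 and y0 < p[1] < y1 and z0 < p[2] < z1]


def euclidean_clusters(points, d_star):
    """Group points that are chained together by distances below d_star."""
    visited = [False] * len(points)
    clusters = []
    for seed in range(len(points)):
        if visited[seed]:
            continue
        visited[seed] = True
        queue, members = deque([seed]), [seed]
        while queue:
            i = queue.popleft()
            for j in range(len(points)):
                # Brute-force neighbor test; a k-d tree would scale better.
                if not visited[j] and dist(points[i], points[j]) < d_star:
                    visited[j] = True
                    queue.append(j)
                    members.append(j)
        clusters.append([points[i] for i in sorted(members)])
    return clusters


def clusters_of_interest(clusters, n_min, n_max):
    """Keep clusters whose size lies within [n_min, n_max]."""
    return [c for c in clusters if n_min <= len(c) <= n_max]


# Hypothetical example data in metres.
cloud = [(0.0, 0.0, 0.5), (0.1, 0.0, 0.5), (0.2, 0.05, 0.5),
         (1.0, 1.0, 0.5), (5.0, 0.0, 0.5)]
bounds = ((-2.0, 2.0), (-2.0, 2.0), (0.0, 2.0))

kept = passthrough(cloud, bounds)
clusters = euclidean_clusters(kept, d_star=0.2)
hand_candidates = clusters_of_interest(clusters, n_min=2, n_max=50)
print(len(kept), [len(c) for c in clusters], len(hand_candidates))
```

In this example, the point at x = 5 m is removed by the passthrough filter, the three nearby points form one cluster, and the isolated point is rejected by the minimum cluster size.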

IV. EXPERIMENTAL DESIGN
The detection of a quasi-stationary object is assumed to be more challenging than that of a fast-moving object because radars are specialized for detecting moving objects. Therefore, we experimentally compared the detection error and outlier profiles based on the speed of the human hand. For the comparison, fast and slow movements were performed by the participants during the human hand intrusion task. Subsequently, detection error and outlier profiles were investigated for different human intrusion motions. For the comparison, the participants reproduced three typical movements (back and forth, lateral swing, and vertical swing). From the viewpoint of detection, the back and forth movement was expected to be the most challenging owing to its relatively small reflection area. Next, the detection error and outlier profiles were investigated when the radar detected the dynamic movements of human hand intrusion. Furthermore, the sensor mount was moved at three speed levels, i.e., 0, 100, and 400 mm/s. Fig. 2 presents the overall experimental layout of the measurement condition. A linear actuator (LEFB25S2S-1000-S2A1, SMC, Japan) was used to repeatedly move the radar sensor back and forth (10 cm) [33]. The duration of the experiment was 15 min, which corresponds to the demand rate suggested in IEC/TS 62998 [34]. For the comparison, a motion capture system (Motion Analysis Co., Santa Rosa, CA, USA) with 12 cameras was used to measure the relative distance and velocity between the sensor antenna and the hand of the participants. Importantly, the values measured by the motion capture system were considered to be the true values in this experiment.

B. RADAR SETUP
To measure human hand intrusion, a MIMO radar (IWR6843ISK, Texas Instruments, USA) with a 60 GHz standard antenna was used (the azimuth and elevation fields of view were ±60° and ±20°, respectively). The transmit power was set to 12 dBm with a maximum bandwidth of 4 GHz (60-64 GHz). Accordingly, the gain of the transmitting or receiving antenna was set to 12 dB. The chirp frequency slope was set to 71.26 MHz/µs with a sampling rate of 5279 ksps, which resulted in 222 samples for each chirp. Therefore, the range resolution using a fast Fourier transform with a size of 256 was 4.34 cm with a measurement rate of 30 Hz. After measuring the surroundings, the obtained point cloud samples were transformed to a global coordinate system following the coordinate frame presented in Fig. 3.
The aforementioned radar was implemented using Robot Operating System (ROS) Melodic running on Linux Ubuntu 18.04 LTS (64-bit) [35]. The three-dimensional point clouds were measured using the TI mmWave ROS package¹ officially provided by the manufacturer with calibration files and serial drivers required for communication between personal computers.² The TI mmWave ROS package was modified from the original Industrial Toolbox (2.3.0). All the point clouds and closest distance points extracted by the Euclidean clustering method³ were recorded using the rosbag package with time-stamped samples.⁴ Figs. 4 and 5 show the visualized point cloud samples and possible human body part points using the ROS rviz package. Fig. 4 reveals that distinguishing human body parts, especially the human hand, is challenging owing to their low reflectivity and the occlusion caused by the complexity of human motions.
First, to filter useful information from the measured samples, passthrough and voxel filters with the same lattice length of 5 cm, which is the theoretical resolution of radars, were used. The results were found to be insensitive to the distance threshold, $d^*$, that is used to cluster potential human body parts using the Euclidean clustering method until $d^*$ was increased to 20 cm, considering a maximum human hand length of approximately 85 cm [33]. Finally, the minimum ($N_{cmin}$) and maximum ($N_{cmax}$) numbers of points were empirically set to 2 and 50, respectively.

FIGURE 2. Hand intrusion experiment: Radar-detected human hand intrusion for different types of intrusion scenarios. In addition, a linear actuator was used to move the radar back and forth to demonstrate dynamic measurement conditions. During the experiment, the participants were asked to reproduce the motion of a hand approaching the target. For the comparison, motion capture cameras were used to accurately track the human hand motion.
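The quoted chirp settings can be cross-checked with the usual FMCW relations. The sketch below assumes that the sampling rate is given in samples per second and that the sweep bandwidth actually sampled equals the slope multiplied by the sampling time; the variable names are introduced here for illustration. Under these assumptions, the 4.34 cm value corresponds to the spacing of the 256-point FFT range bins, whereas the bandwidth-limited resolution c/(2B) is approximately 5 cm, which matches the lattice length used for the filters.

```python
C = 299_792_458.0                 # speed of light in m/s

slope_hz_per_s = 71.26e6 / 1e-6   # 71.26 MHz/us expressed in Hz/s
sample_rate_hz = 5279e3           # assumed to mean 5279 kilo-samples per second
n_samples = 222                   # samples per chirp
n_fft = 256                       # range FFT size

# Bandwidth swept while the 222 samples are being acquired.
sampled_bandwidth_hz = slope_hz_per_s * n_samples / sample_rate_hz

# Bandwidth-limited range resolution, c / (2B).
theoretical_resolution_m = C / (2.0 * sampled_bandwidth_hz)

# Range covered by one bin of the zero-padded 256-point FFT.
range_bin_m = C * sample_rate_hz / (2.0 * slope_hz_per_s * n_fft)

print(f"sampled bandwidth: {sampled_bandwidth_hz / 1e9:.2f} GHz")
print(f"c/(2B) resolution: {theoretical_resolution_m * 100:.2f} cm")
print(f"FFT range bin:     {range_bin_m * 100:.2f} cm")
```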

C. HAND INTRUSION
Ten healthy participants comprising five males and five females were enrolled in this experiment with the permission of the Institutional Review Board of the Department of Engineering of Nagoya University (approval no. 20-21).

1) HAND SPEED
The participants moved their hands at two speeds: fast (2 m/s)⁵ and slow (0.1 m/s).⁶ They were instructed to reach for the target, and a metronome was used to maintain each suggested speed (100 and 20 bpm, respectively).

2) HAND MOTION
During the experiment, the participants demonstrated three typical patterns of intrusions, as shown in Fig. 6.

3) SPEED OF SENSOR MOUNT
While the participants reproduced the intrusion motion based on their respective speeds and motions, the linear actuator carrying the radar moved back and forth at three different speeds ($v_m$ = 0, 100, and 400 mm/s) along the rails.

⁴ https://github.com/ros/ros_comm
⁵ Maximum speed of the human hand suggested in ISO 13855.
⁶ Slow speed is being discussed by TC/44/PT IEC 61496-5.

D. EVALUATION METRIC

1) MEASUREMENT ERROR
To compare the measurement errors, the root mean square error (RMSE) was used:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left[ (x_k - \hat{x}_k)^2 + (y_k - \hat{y}_k)^2 + (z_k - \hat{z}_k)^2 \right]},$$
where $N$ denotes the number of observed vector samples included in each trial; $x_k$, $y_k$, and $z_k$ denote the measured position obtained from the radar for each axis; and $\hat{x}_k$, $\hat{y}_k$, and $\hat{z}_k$ represent the position obtained from the motion capture system for each axis.
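As a minimal illustration of this metric, the following Python sketch computes the three-dimensional RMSE between radar positions and motion capture positions. The function name and the sample values are hypothetical, and the sketch assumes that the two sequences are already time-aligned.

```python
from math import sqrt


def rmse(measured, reference):
    """Root mean square of the 3-D position error over all samples."""
    if not measured or len(measured) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((m - r) ** 2
                  for p, q in zip(measured, reference)
                  for m, r in zip(p, q))
    return sqrt(squared / len(measured))


# Hypothetical positions in metres: radar output and motion capture reference.
radar = [(1.00, 0.00, 0.50), (1.10, 0.02, 0.50)]
mocap = [(1.00, 0.00, 0.50), (1.00, 0.00, 0.50)]
print(f"RMSE = {rmse(radar, mocap):.4f} m")
```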
Further, a Gaussian mixture model was used to distinguish between the groups of errors derived from the outliers [36]. The misdetection of a human hand is classified into two cases. In one case, the human hand is lost or misdetected and the human arm is detected instead. In the other case, the human body trunk is detected instead of the human hand. We refer to these phenomena as "target shifting." Therefore, three Gaussian distributions were fitted to each experimental case. Then, the mean and standard deviation values were compared.
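The idea of fitting three Gaussian components to an error sample can be sketched with a small expectation-maximization (EM) routine. This is a one-dimensional illustration on synthetic data, not the fitting procedure used in this study: the function names, the equal-width initialization, and the synthetic component parameters (loosely standing in for hand, arm, and trunk detections) are assumptions made for the example.

```python
import random
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2.0 * pi))


def fit_gmm_1d(data, k=3, iterations=100, min_sigma=1e-3):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Returns a list of (mean, sigma, weight) tuples sorted by mean.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    # Deterministic start: component centres spread evenly over the data range.
    means = [lo + (j + 0.5) * width for j in range(k)]
    sigmas = [max(width / 2.0, min_sigma)] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sigmas)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and standard deviation.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigmas[j] = max(sqrt(var), min_sigma)
    return sorted(zip(means, sigmas, weights))


# Synthetic error values in metres (illustrative only).
rng = random.Random(0)
errors = ([rng.gauss(0.04, 0.02) for _ in range(800)]
          + [rng.gauss(0.30, 0.05) for _ in range(150)]
          + [rng.gauss(0.60, 0.08) for _ in range(50)])

components = fit_gmm_1d(errors)
for mean, sigma, weight in components:
    print(f"mean={mean:.3f} m  sigma={sigma:.3f} m  proportion={weight:.2f}")
```

Sorting the components by mean mirrors the convention of naming the component with the smallest mean error as the first component.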

2) OUTLIER JUDGEMENT
HRC systems, especially their safety-related parts, should satisfy the required performance level, $PL_r = d$ [37]. In the latest IEC/TS 62998 standard issued in 2018, a coverage-interval concept was introduced, which enables the quantitative evaluation of measurement uncertainty assuming that the measurement error follows a Gaussian distribution [34]. Assuming that the upper limit of the required probability of failure per hour is $PFH_u$, the coverage probability ($C_p$) can be expressed as
$$C_p = 1 - \frac{PFH_u}{r_d},$$
where $r_d$ represents the demand rate of the collaborative system. According to IEC/TS 62998, if the required safety integrity level is $PL_r = d$, the upper limit of PFH is $10^{-6}$, and the suggested demand rate ($r_d$) is approximately 4 h$^{-1}$. Note that the demand rate is the average number of times per hour that the safety-related device is demanded. Therefore, the $C_p$ that satisfies the corresponding safety integrity level is obtained as $1 - 2.5 \times 10^{-7}$. Finally, the coverage interval that includes such a probability is $\pm 5.16\sigma$, which can be derived using the statistical measurement error of the sensor system. In this study, assuming that the standard deviation of the error ($\sigma$) is 4 cm,⁷ the calculated range of ±20.64 cm corresponds to the acceptable error. Moreover, the measurement samples that fall outside this range are treated as outliers.
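The numbers in this paragraph can be reproduced with a short calculation. The sketch below assumes a zero-mean Gaussian error and a two-sided interval whose tail probability is split equally between both sides; the variable names are introduced here for illustration.

```python
from statistics import NormalDist

pfh_upper = 1e-6        # upper limit of PFH for PL_r = d, per hour
demand_rate = 4.0       # suggested demand rate, per hour
sigma_cm = 4.0          # assumed standard deviation of the error

failure_probability = pfh_upper / demand_rate      # 2.5e-7 per demand
coverage_probability = 1.0 - failure_probability

# Two-sided interval: half of the failure probability lies in each tail.
coverage_factor = -NormalDist().inv_cdf(failure_probability / 2.0)

half_width_cm = round(coverage_factor, 2) * sigma_cm

print(f"C_p = 1 - {failure_probability:.1e}")
print(f"coverage factor = {coverage_factor:.2f} sigma")
print(f"acceptable error = +/-{half_width_cm:.2f} cm")
```

The coverage factor evaluates to approximately 5.16, and multiplying the rounded factor by 4 cm gives the ±20.64 cm limit used as the outlier border.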
As shown in Fig. 7, the outliers are classified into three types based on the observations [38]: first, the innovative outlier (IO), which generates irreversible trends after observations; second, the additive outlier (AO), which affects a single observation; and third, the temporary outlier (TO), which generates transitory trends during certain time intervals. In particular, the outliers can further be classified based on their expected risks into a negative outlier (safety side) and a positive outlier (hazardous side).
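One possible way to turn these definitions into a labeling rule is sketched below. It is an illustrative interpretation, not the procedure used in this study: a run of consecutive out-of-range samples of length one is labeled AO, a longer run that returns to the acceptable range is labeled TO, and a longer run that never returns before the end of the record is labeled IO. The error is assumed to be the measured distance minus the reference distance, so that a positive error falls on the hazardous side.

```python
def classify_outliers(errors_cm, limit_cm=20.64):
    """Label each sample as 'inlier' or '<AO|TO|IO>-<pos|neg>'."""
    n = len(errors_cm)
    labels = ["inlier"] * n
    i = 0
    while i < n:
        if abs(errors_cm[i]) <= limit_cm:
            i += 1
            continue
        positive = errors_cm[i] > 0
        # Extend the run while samples stay outside the limit on the same side.
        j = i
        while (j < n and abs(errors_cm[j]) > limit_cm
               and (errors_cm[j] > 0) == positive):
            j += 1
        if j - i == 1:
            kind = "AO"      # affects a single observation
        elif j == n:
            kind = "IO"      # does not return within the record
        else:
            kind = "TO"      # transitory deviation over an interval
        side = "pos" if positive else "neg"
        for k in range(i, j):
            labels[k] = f"{kind}-{side}"
        i = j
    return labels


# Hypothetical error sequence in centimetres.
errors = [1.0, -3.0, 25.0, 2.0, 30.0, 31.0, 29.0, 0.5, -22.0, -25.0, 3.0]
labels = classify_outliers(errors)
print(labels)
```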

V. RESULTS
In this section, the analysis of detection error contaminated by outliers is presented based on the evaluation metric stated in Section IV-D. Table 1 summarizes the error profiles against each relative speed between the radar and human hand. The Gaussian component whose mean RMSE is the smallest is the first component. Similarly, the second and third components were determined on the basis of their mean RMSEs.
Fig. 9 summarizes the results of the outlier proportion based on the hand speed of the participants. The results reveal that, when the participants' hand speed was slow rather than fast, the inlier proportion decreased by 6 percentage points, from 92% to 86%. In particular, TO-pos increased from 2.7% to 12%, whereas TO-neg decreased from 3.6% to 1.6%.
Fig. 10 summarizes the results of the outlier proportion based on the hand motion of the participants. The results reveal that the lateral swing (LS) had the highest inlier proportion among the hand motions (back and forth: 87%, lateral swing: 94%, vertical swing: 87%). Further, LS had the lowest proportion of TOs among the hand motions (back and forth: 9.9%, lateral swing: 4.7%, vertical swing: 8.1%).
Fig. 11 summarizes the results of the outlier proportion based on the speed of the sensor mount during the experiment. The inlier proportions at sensor mount speeds of 0, 100, and 400 mm/s were 87%, 87%, and 91%, respectively; that is, the inlier proportion increased at 400 mm/s.
Fig. 12 illustrates the probability density of the obtained positive temporary outliers according to the relative speed between the human hand and sensor. The samples were measured based on the instant relative speed when the samples were classified as positive temporary outliers. Note that the instant relative speed measured by the motion capture system was used as the sample. Fig. 13 displays the probability that an obtained sample is classified as a positive temporary outlier, considering the total number of obtained samples. Note that the curve fitting was carried out by fitting an exponential function to the histogram bins.

TABLE 1. Error profiles for each case of the human hand motion against each speed of the sensor mount $v_m$. For the comparison, the profiles of the error were categorized using the Gaussian mixture model, and the proportion ($P$), root mean square error, and standard deviation ($\sigma$) of each distribution are reported.

FIGURE 10. Comparison of outlier proportion between each hand motion of the 10 participants. The result of the three motions reveals that the lateral swing motion (LS) had the fewest outliers. Remarkably, the outliers belonging to Temp-p were greater than those belonging to Temp-n.
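An exponential fit to histogram bins can be sketched as follows. The fitting procedure is not specified in the text, so this example uses one simple option, a least-squares fit of a straight line to the logarithm of the bin heights; the function name and the synthetic bin values are assumptions for illustration only.

```python
from math import exp, log


def fit_exponential(bin_centres, bin_heights):
    """Fit y = a * exp(b * x) by linear least squares on log(y).

    Bins with zero height are skipped because their logarithm is undefined.
    """
    pairs = [(x, log(y)) for x, y in zip(bin_centres, bin_heights) if y > 0]
    if len(pairs) < 2:
        raise ValueError("at least two non-empty bins are required")
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_ly = sum(ly for _, ly in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in pairs)
    b = sxy / sxx
    a = exp(mean_ly - b * mean_x)
    return a, b


# Synthetic histogram: bin centres as relative speed in m/s, heights as density.
centres = [0.1, 0.3, 0.5, 0.7]
heights = [2.0 * exp(-3.0 * x) for x in centres]

a, b = fit_exponential(centres, heights)
print(f"density ~ {a:.2f} * exp({b:.2f} * v_r)")
```

A log-linear fit weights small bins more heavily than a direct nonlinear fit would, so the two approaches can give different parameters on noisy data.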

A. RELATIONSHIP BETWEEN HAND SPEED AND OUTLIER
As expected, during the detection of human hand intrusion using radar, the detection of a static or quasi-static intrusion was more challenging than that of a dynamic intrusion (Fig. 9). The error-profile results were categorized using the Gaussian mixture model. Specifically, the third component was assumed to be an outlier distribution owing to its distinctive mean value and large standard deviation of the error (Fig. 8). Notably, the outlier proportion increased when the hand speed was low for all the hand-intrusion movements.
When the speed was low, most of the outliers were identified as TOs. Moreover, the proportion of the AOs was relatively lower than that of the TOs. Hence, misdetection is more probable when the hand intrusion speed is low. Consequently, further iterative measurements or observations must be performed for redetection. Therefore, we can conclude that slow hand intrusion motions require further analyses from the viewpoint of safety.

FIGURE 11. Comparison of outlier proportion between each speed of the sensor mount for the hand intrusion of the 10 participants. As a result, the proportion of the Temp-p was the lowest when the speed of the sensor mount was $v_m$ = 400 mm/s.

B. RELATIONSHIP BETWEEN HAND INTRUSION MOTION AND OUTLIERS
As expected, compared with those of the back-and-forth motion, the results of the lateral swing motion contained fewer outliers owing to its relatively large cross-section area (Fig. 10). This may be because, unlike in the back-and-forth motion, in the lateral swing motion the hand cross-section is valid continuously along the arm. In the case of the vertical swing motion, although the cross-section of the hand is continuously generated, similar to that in the lateral swing motion, the results are rather similar to those of the back-and-forth motion (Fig. 10). Moreover, when comparing the mean value of the second component, the values of the vertical swing and back-and-forth motions were found to be higher than the average value of the first component (Table 1). The challenge in the generation of a cross-section area can be attributed to the dynamically changing vertical plane of the radar. Therefore, owing to the diversity of human hand intrusion, it is preferable to evaluate the intrusion with a three-dimensional movement such as a vertical swing.

C. RELATIONSHIP BETWEEN SENSOR MOUNT SPEED AND OUTLIER RATIO
As expected, the highest inlier proportion was observed at a speed of 400 mm/s. A minimal difference was observed between the inlier proportions at the speeds of 0 and 100 mm/s (Fig. 11). In addition, the outlier proportion decreased when the sensor mount speed increased in the cases of the back-and-forth and vertical swing motions (Table 1). Therefore, the relative speed due to intrusion and the position of the sensor affect the misdetection of radars. However, the ratio of TO-pos at 100 mm/s was higher than that at 0 mm/s. Although these results indicate that small differences in the relative speed do not significantly affect misdetection, a limitation of this study is that the range of the mount speed was small; the approach must be tested under various patterns of operations.

D. NOISE PROFILE
This study confirmed that, contrary to the noise profile suggested by IEC 61496, the noise profile of human hand intrusion obtained from our experiment was in the form of a nonparametric distribution [39]. In terms of the detection or removal of outliers, a criterion based on a distribution different from the actual error profile may result in the overestimation or underestimation of outliers. Moreover, despite the numerous efforts invested in the detection and removal of outliers, the existing algorithms for outlier simulation focus on AOs. However, the actual results showed that TOs were dominant during radar detection [40], [41], [42].
Therefore, including TOs in the outlier assessment criterion and using nonparametric methods in outlier-generating algorithms can be highly effective. Accordingly, we propose a novel noise profile that satisfies the aforementioned criterion, in which $\mu_{1-3}$ denote the position vectors of the primary (hand), secondary (arm), and tertiary (trunk) targets. The constant $\lambda$ is the additive contamination ratio, and $\alpha$ and $\beta$ denote scale factors. The constant $\rho$ represents the temporary contamination ratio that contributes to the generation of TOs. Notably, the second case that demonstrates target shifting is initiated only when is positive to generate danger-side outliers. Moreover, the valid ranges of the parameters for generating adequate noise profiles are $1 < \alpha$, $1 < \beta$, $0 < \lambda < 1$, and $0 < \rho < 1$. However, choosing precise parameter values that reproduce a realistic outlier ratio is challenging. Nevertheless, the results from the comprehensive relative-speed analysis indicate that the proportion of TO-pos can be accounted for by the instant relative speed between the human hand and sensor, indicating that $\rho$ is a function of the relative speed $v_r$. Therefore, to validate robust techniques when using radar as a human hand intrusion sensor, relative speed should be considered as a critical index for the positive temporary outlier.

VII. CONCLUSION
This study presented outlier profiles obtained during the detection of human hand intrusion using a MIMO radar. Three potential factors that degraded the radar performance, namely hand intrusion speed, hand movement, and sensor mount speed, were investigated. Ten participants were enrolled in our experiments. The experimental results indicated that stationary and quasi-stationary movements between the human hand and radar, rather than dynamic movements, need to be further analyzed because they frequently resulted in misdetection. Furthermore, temporary outliers were found to be more dominant than additive outliers. Thus, we proposed a novel noise generation algorithm that includes a TO-generating sequence.