Privacy-Aware Gait Identification With Ultralow-Dimensional Data Using a Distance Sensor

As one of the most natural user behaviors, walking has attracted wide attention for personal identification systems because of its unique biometric features. Popular visual solutions are usually affected by environmental conditions, and the redundant user information they capture (e.g., body type and appearance) makes it harder for users to maintain privacy and security. This article proposes a distance sensor-based gait identification system that uses only 1-D data with a simple system structure. Specifically, a time-of-flight (ToF) sensor was placed in front of a walking person, and a time series of distances was acquired. We extracted gait features from the data by calculating the velocity and acceleration curves and identified individuals using a random forest (RF) classifier. We evaluated our system on ten users using leave-one-out cross validation, and the average identification accuracy was 91.05%. This study shows that gait recognition is possible using only 1-D time-series data from a noncontact sensor. The approach can serve as contactless identification while reducing the computational resources required, which suits low-cost and low-power-consumption edge computing.


I. INTRODUCTION
IDENTITY recognition has been a widespread concern in recent decades. As application needs have developed, detection technologies with different characteristics, including biometric, wearable, and visual technologies, have come to serve various application scenarios. One important application scenario for authentication is public facilities, as some must identify the users who need access to ensure security. A number of research and technology efforts have been made to help solve the identification problem for outdoor facilities [1], [2], [3], [4]. In general, the first method is to employ explicit information that identifies the user, such as the individual's name and address. The unique information attributed to the user is used for identification. However, this approach usually imposes a relatively inefficient burden, such as handwritten input for registration and the long process required for electronic processing. Other methods, such as electronic ID cards, can provide identification quickly and improve efficiency. However, such property-based methods are still subject to the risks of card loss, fraud, and theft. In addition, with the development of sensing technology and information processing, systems are gradually becoming more intelligent, using the user's biometric information for identification. Since such features are collected from the individual user, the problem of loss is solved, and identification can be realized naturally and more efficiently, reducing the burden on users. Among these approaches, noncontact measurement technology is gradually developing to fit users' natural behavior and improve convenience. Gait identification is one of the biometric approaches. It is personal identification based on individual traits, such as stride length, arm swing, posture, and left-right asymmetry, that appear in the pattern of limb movement when a person walks. It has been studied actively in recent years [5].
A popular method to measure gait uses image sensors, such as red, green, and blue (RGB) cameras and depth cameras, with image processing and pose estimation adopted to extract features [6], [7], [8], [36]. The image sensor-based method employs nonwearable devices and requires less attention or cooperation from users than conventional biometric authentication techniques. Therefore, in terms of ease of use, it is a suitable method for identifying users in public facilities. However, secondary information, such as face image, race, and body silhouette, can still be extracted from the captured images. A potential technical challenge is that because biometric information is immutable personal information, it needs to be carefully managed, and the cost of security measures is high. There is also the issue that, for a human-centered technology, users may be reluctant to provide their biometric information for authentication due to privacy concerns [9]. Therefore, the management cost of personal information and users' sense of resistance remain problems. Other noncontact and nonvisual methods have emerged to provide a more pervasive and suitable solution, such as those using Wi-Fi signals [11], [12], radar signals [13], and acoustic signals [14], among others. From the signal reflected by the user's walking behavior, unique characteristics can be extracted as features for identifying an individual. For this reason, Wi-Fi-based channel state information (CSI) and Doppler-shift-based gait detection can be used for individual gait recognition. In combination with machine learning techniques, excellent performance has been demonstrated. However, such systems usually require a stable external environment and a high signal capture frequency to ensure high-accuracy gait information acquisition.
Although this benefits the system's robustness, it also imposes requirements on the system design, power consumption, device location, cost, and so on. Beyond accuracy improvement, making a gait identification system usable in more situations becomes a factor to consider. Therefore, it is important to explore gait identification systems for more edge cases, such as low-power-consumption and low-cost systems.
Practically, the application of gait identification is broad, the system characteristics required for different applications are not uniform, and the size of the user pool needed varies. Therefore, in this article, we explored the use of ultralow-dimensional data to design contactless privacy-preserving gait identification systems. To the best of our knowledge from the literature, this is the first proposal to use a time-of-flight (ToF) distance sensor placed in front of the walking user to identify the user from 1-D time-series data (Fig. 1). With the design of this system, the possibility of gait identification with 1-D data is demonstrated. Since the system involves only a single ToF sensor and 1-D data acquisition, the whole system maintains low-cost (less than U.S. $10) and low-power-consumption (less than 1 mW) characteristics. Compared with other mainstream noncontact gait-based identification systems, the introduced design provides a simple solution for situations where the user pool is small and the identification performance requirements are not extremely stringent.

II. BACKGROUND AND RELATED WORK
RGB cameras and depth cameras are mainly used for gait recognition by image sensors. There are two main methods: one is to extract the features of gait by image processing, and the other is to estimate human posture from camera images and obtain gait features from temporal changes in the coordinates of each body part. El-Alfy et al. [15] proposed a personal authentication method that captured the geometric properties of silhouette boundaries in an image by evaluating the contour curvature using Gauss maps. Zulcaffle et al. [16] presented a method that used images acquired by a 3-D ToF camera. They extracted the silhouette of a person from the depth image and used multiple classifiers to identify the person, switching the algorithm depending on the presence of a package and the walking intensity. Combining the camera with the external environment, Zheng et al. [6] used a camera and a pressure sensor installed on the floor. Cumulative pressure and walking images were used as inputs for the system. They first calculated the canonical correlation between the input pressure image and the database to select the most appropriate camera images from the dataset, and then performed image matching on the camera input for personal authentication. Sabir et al. [7] employed human 3-D posture information acquired by Kinect. They focused on several joints in the posture information and used the statistics of their distances and angles from the ground over one walking cycle as feature values. They utilized the k-nearest neighbor (KNN) method and a linear classifier for the estimation. Yang et al. [8] also used human posture information acquired by Kinect. They calculated the statistics of the relative distances between symmetrical joints in the human body as features, and the KNN method was deployed with the Manhattan distance for the estimation. In these image sensor-based personal authentication methods, the biometric information acquired is highly subjective, and the RGB image or depth image can be used to read the user's appearance and silhouette in addition to the gait characteristics. Thus, privacy concerns have become the most significant barrier to adopting such technologies.
To solve this privacy problem in video surveillance, Koshimizu et al. [9] introduced an abstracting method for people in the image. They listed the following abstraction levels: erasure (transparent), dotting (existence information only), boxing, silhouetting, edging, blurring, head-only boxing, head-only silhouetting, head-only edging, head blurring, and annotation (showing personal name).
In addition, to protect personal privacy, some wearable devices and other daily accessory-based equipment exist. For example, Fujii et al. [10] proposed a method that used time series acquired by multiple accelerometers attached to slippers. They used fast Fourier transform to extract features and a support vector machine (SVM) to identify the frequency features. Kurahashi et al. [17] estimated the identity of individuals from the angular velocity information obtained by the gyroscope sensor attached to the shaft of toilet paper in the bathroom, where it is challenging to install sensors, such as cameras and microphones, due to privacy concerns.
Wireless sensing provides a suitable solution to privacy protection issues in personal authentication. A wireless signal is altered by a person's behavior, and such techniques capture the varied reflected signal and link it to personal identity. The Wi-Fi signal, which is common in daily life, has been proposed for recognizing the user's gait and for authentication [11], [12], [18], [21]. The CSI of the Wi-Fi signal is measured and analyzed for unique features. Wang et al. [11] utilized the spectrograms of CSI measurements induced by a walking pattern. Korany et al. [20] measured the Wi-Fi magnitude of a small number of transceivers to identify multiple users. Through a multidimensional framework, the signal from each individual can be separated. Xin et al. [21] introduced an indoor user identification method that applied principal component analysis, discrete wavelet transform, and dynamic time warping (DTW) to the CSI waveform. These different technical approaches all focus on applying the reflected indoor Wi-Fi signal to identify the user. However, Wi-Fi signals still have a low sampling frequency and are susceptible to interference from other electromagnetic signals, making them unstable. High-frequency radar-based detection can also sense the user's gait information. Saho et al. [13] introduced a Doppler radar-based user identification scheme. It employed the micro-Doppler signatures of specific motions, that is, sit-to-stand and stand-to-sit. The produced Doppler spectrograms were input into a convolutional neural network for identification. Besides the conventional spectrograms captured [19], Shah et al. [18] proposed fusing Wi-Fi and radar imaging to recognize freezing-of-gait episodes in patients with Parkinson's disease. CSI and micro-Doppler signatures were employed. In addition to the random forest (RF)-based method, Xu et al. [14] designed AcousticID, which used the acoustic signal to capture body movement. As the propagation speed of sound is relatively low, it allows body movement to be measured more accurately.
Although the application of wireless signals reduces intrusion to users, it still imposes more preconditions on the placement of equipment, signal interference, environmental noise, and the system's layout. Therefore, it is essential to continue exploring gait recognition with highly abstract data, which can be important for widening the application scenarios. In this article, gait recognition is accomplished using 1-D signals related to the user's walking behavior. We identified the gait by capturing only the distance time series. It contains only the same information as a dotted person, which is considered the most abstract information [9], [35].

III. PROPOSED METHOD
In this section, the proposed approach is introduced. Our system uses distance sensors to measure the velocity-related feature intervals of human gait from the front and to identify individuals. Fig. 2 shows the flow of the proposed method in the form of a diagram. The detailed implementation is described below.

A. Hardware Sensing System
The sensing system employed a Nucleo F446RE as the controller and a VL53L1X ToF laser module as the distance sensor (Fig. 3). This device is connected to the computer via a universal serial bus (USB) serial interface and continually sends the measured distance information. The VL53L1X, as a ToF sensor, can measure distances of up to 4 m [22]. Fig. 4 shows an example of the time series acquired by the ToF distance sensor.

B. Data Processing
A low-pass filter was first applied to the captured time-series data to eliminate noise. The data acquired by the ToF sensor can be considered a form of human gait record. Thus, from the recorded distance variation, the gait phase can be divided into four stages: stop, accelerate, maintain a constant speed, and decelerate (Fig. 4). Because each person has different walking characteristics, the process of approaching the target by walking is generally reflected in the variation of velocity. Since the data dimension used in this system is extremely low, we mainly use the changes in the distance sequence during the user's acceleration and deceleration while walking to capture the speed change as much as possible. Fig. 5 shows the distance variations for walking.
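The article does not state which low-pass filter was used, so the following is only a minimal sketch of the smoothing step, assuming a centered moving average. The function name `moving_average`, the window size, and the sample readings are all hypothetical and chosen for illustration.

```python
def moving_average(samples, window=5):
    """Smooth a distance series with a centered moving average.

    This is one simple low-pass filter; the filter used in the article
    is not specified, so this choice is an assumption.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        # Shrink the window at the edges so the output keeps the input length.
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical distance readings in millimetres with one noisy spike.
raw = [3000, 2990, 3400, 2960, 2940, 2900, 2850]
print(moving_average(raw, window=3))
```

A wider window suppresses more noise but also blurs the acceleration and deceleration phases that the later feature extraction relies on, so the window size would need tuning against real recordings.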

C. Calculating the Velocity and Acceleration
To extract the velocity variations of each user, we calculated the time series of discrete differences of the distance data by the relationship in (1), where x is the time series of distance and x′ is that of velocity. Similarly, as the main difference among users' gait behavior is related to acceleration and deceleration, we further calculated the discrete differences of the derived velocity time series, i.e., the acceleration time series. Fig. 6 presents an example of the velocity and acceleration time series calculated from the original distance data.
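As a hedged illustration of the differencing step, the sketch below uses a forward difference divided by the sample period. Whether the article's relationship (1) divides by the sample period or uses the raw difference is not recoverable here, so that scaling is an assumption, as are the function name `discrete_difference` and the sample values.

```python
def discrete_difference(series, dt):
    """Forward difference (s[i+1] - s[i]) / dt.

    The result is one sample shorter than the input. Dividing by dt is
    an assumption; a raw difference would only change the scale.
    """
    return [(b - a) / dt for a, b in zip(series, series[1:])]


SAMPLE_PERIOD_S = 1 / 20  # 20 Hz sampling rate reported in the experiment

# Hypothetical filtered distances in metres as a person approaches the sensor.
distance = [3.00, 2.98, 2.93, 2.86, 2.78]
velocity = discrete_difference(distance, SAMPLE_PERIOD_S)
acceleration = discrete_difference(velocity, SAMPLE_PERIOD_S)

print(velocity)      # negative values: the distance shrinks while approaching
print(acceleration)
```

Because differencing amplifies high-frequency noise, applying it after the low-pass filter rather than before keeps the acceleration series usable.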

D. Classification
Because the collected time-series data carry limited gait information and have low dimensionality with simple signal variation, traditional frequency-domain features and deep learning are unlikely to show an advantage. Thus, we divided each of the obtained distance, velocity, and acceleration time series into several intervals and calculated each interval's statistical features, including the mean, variance, maximum, and minimum values, to form the handcrafted feature set (Fig. 7). From the algorithm's perspective, the aim is to capture more information about changes in the person's walking status based on different kinematic characteristics (i.e., the velocity and acceleration), which mainly reflect different people's gait patterns when approaching a target. To process these handcrafted features, the RF classifier was adopted. With this method, the microscopic variations of the gait pattern can be explored thoroughly, because the distance, velocity, and acceleration changes are all considered. The process of the classification algorithm is shown in Algorithm 1.
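The interval-based feature extraction described above can be sketched as follows. This is an illustrative reading, not the authors' Algorithm 1: the equal-length splitting rule, the use of population variance, the feature ordering, and all function names are assumptions.

```python
from statistics import mean, pvariance


def split_into_intervals(series, n_intervals):
    """Split a series into n contiguous, nearly equal-length intervals."""
    n = len(series)
    if n < n_intervals:
        raise ValueError("series is shorter than the number of intervals")
    # Integer arithmetic guarantees every interval is non-empty.
    bounds = [(k * n) // n_intervals for k in range(n_intervals + 1)]
    return [series[bounds[k]:bounds[k + 1]] for k in range(n_intervals)]


def interval_features(series, n_intervals=7):
    """Mean, variance, maximum, and minimum of each interval, concatenated."""
    features = []
    for chunk in split_into_intervals(series, n_intervals):
        features.extend([mean(chunk), pvariance(chunk), max(chunk), min(chunk)])
    return features


def trial_features(distance, velocity, acceleration, n_intervals=7):
    """Feature vector for one walking trial from all three time series."""
    return (interval_features(distance, n_intervals)
            + interval_features(velocity, n_intervals)
            + interval_features(acceleration, n_intervals))
```

Under this layout, seven intervals with four statistics on three series give 7 x 4 x 3 = 84 features per trial, which would then be passed to a classifier such as a random forest.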

A. Participants
A total of ten participants (five male and five female) were recruited to evaluate the accuracy of the proposed system for personal identification. The demographic information of the participants is presented in Table I.

B. Environment
The participants stood upright at the position 3 m away from the sensing device and walked toward the sensor when the experimenter gave a signal. The system was mounted at 1.1 m from the floor and was in front of the participant. The participants were requested to walk naturally as usual.

C. Data Collection
The data were collected ten times per person. The sampling rate was 20 Hz. The sensing system started measuring when the distance between the user and the sensor fell below 3 m. Thus, each participant had data from ten trials, and each recording was generally 4-6 s long. The data from the sensing device were saved as a comma-separated values (CSV) file by executing a data recording script on the laptop.

D. Identification Result With Different Intervals and Classification Methods
The identification accuracy was evaluated by leave-one (trial)-out cross validation of each participant's dataset. In our method, we divided the time-series data into several intervals and calculated their statistical features as the feature set. Thus, the interval length/number affects the number of features and could influence the identification performance. We therefore evaluated the effect of the interval length/number on the identification result. In general, one walking time series consists of 90-100 frames, so the number of intervals was varied from 3 to 8 to ensure that each interval contained enough effective information. Fig. 8 presents the identification accuracy against the number of intervals. The performance of the system varies between 85% and 91%. With six or seven intervals, the system achieves its best performance, as each interval then covers around 0.7 s of data, which gives good insight into the microscopic information of the gait and enables the system to reach around 91% accuracy. Therefore, we divided the captured time series into seven intervals for handcrafted feature extraction.
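To make the leave-one-trial-out protocol concrete, here is a minimal sketch. A 1-nearest-neighbor rule stands in for the classifier purely to keep the example dependency-free; it is not the RF classifier used in the article, and the toy feature vectors and function names are hypothetical.

```python
def nearest_neighbor_predict(train, query):
    """Stand-in classifier: label of the closest training vector.

    Uses squared Euclidean distance. This replaces the article's random
    forest only for illustration.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda item: sq_dist(item[0], query))[1]


def leave_one_out_accuracy(samples, predict=nearest_neighbor_predict):
    """Hold out each (features, label) sample once and test on it."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every trial except the held-out one
        if predict(train, features) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical 2-D feature vectors for two users, three trials each.
toy = [([0.0, 0.1], "A"), ([0.1, 0.0], "A"), ([0.05, 0.05], "A"),
       ([1.0, 1.1], "B"), ([1.1, 1.0], "B"), ([0.95, 1.05], "B")]
print(leave_one_out_accuracy(toy))
```

Any classifier with the same `predict(train, query)` shape can be swapped in, so the same loop could be reused to compare interval counts or classification methods.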
Since the captured distance time series is only 1-D, we calculated the discrete velocity and acceleration time series to increase the input information. Table II shows the accuracy obtained with different time series used for identification. The result shows that a single 1-D time series alone struggles to provide enough information, and that using all three time series (distance, velocity, and acceleration) improves the system performance, as enough kinematic characteristics can be captured.
Regarding the classification methods, we also evaluated several benchmark machine learning approaches for identifying the time series, including DTW + KNN, DTW + SVM, SVM (radial basis function (RBF) kernel and C = 1000), decision tree (DT), and RF with 30 estimators. DTW is suitable for measuring the shapes in a time series and is less affected by temporal expansion and contraction if the waveforms are similar [29], [30]. Notably, the kernel method with a global alignment kernel (GAK) was adopted in SVM [31]. As DTW compares only a pair of time series, we tested each single type of time-series data (distance/velocity/acceleration) and used the best one, i.e., the velocity data, as the input to train the classifier for comparison. For the DT, SVM, and RF methods, the input was the set of features extracted from the intervals. All the machine learning algorithms were implemented in Python with the TsLearn toolkit [32]. Table III presents the results of the different classification methods. The results show that RF outperforms the other classification methods with the highest accuracy of 91.05%. Fig. 9 presents the visualized results of the extracted features used for the RF classifier.
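For readers unfamiliar with DTW, the sketch below shows the classic dynamic-programming formulation with the absolute difference as the local cost. It is a generic textbook version written for illustration and is not necessarily the variant or normalization used by the toolkit in the article; the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning the first
    i samples of a with the first j samples of b.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The same shape walked at two different speeds aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

This tolerance to stretching is what makes DTW attractive for walking trials of different durations, at the price of a quadratic cost in the sequence length.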

E. Identification Result With Different User Pools
Since the data dimension used is extremely low, it is very challenging for the system to be applied to large user pools. However, as mentioned earlier, user identification in public facilities covers a wide range of application scenarios. Therefore, the system's performance was experimentally investigated for small-scale user pools. Such small-scale user pools usually suit scenarios such as homes and offices. We evaluated different user pool sizes from three to ten people. For pools of fewer than ten users, the tested users were drawn from the recruited participants by enumerating combinations, and the identification results over the combinations were averaged to obtain the accuracy. Fig. 10 shows the accuracy of the proposed system on different user pools from four to ten people.
As the figure shows, the identification performance gradually decreases as the number of users increases. For smaller user pools (e.g., four to six users), the system distinguishes the users' gait characteristics better and maintains an identification accuracy of 95%-97%. This also demonstrates the advantage of the method in situations where the number of users is small (e.g., family units). For a larger number of users (e.g., eight to ten users), the system still produces identification results of around 91%. Therefore, the system remains feasible for multiuser scenarios where the identification requirement is not very strict.
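The pool-subsampling procedure described above can be sketched with the standard library. The `evaluate` callback is a hypothetical placeholder for running the full cross validation on one subset of users; only the enumeration and averaging logic is shown.

```python
from itertools import combinations
from math import comb


def average_accuracy_over_pools(user_ids, pool_size, evaluate):
    """Average evaluate(pool) over every pool of pool_size users.

    evaluate is a hypothetical callback that returns the identification
    accuracy for one tuple of user IDs.
    """
    pools = list(combinations(user_ids, pool_size))
    return sum(evaluate(pool) for pool in pools) / len(pools)


# Number of distinct pools that can be drawn from ten participants.
for size in range(4, 11):
    print(size, comb(10, size))
```

The number of subsets peaks at a pool size of five, so averaging over all combinations is cheap at this scale but would grow quickly with a larger participant set.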

F. Identification Result With Different Trial Numbers
An identification system must also take its implementation conditions into account. To reduce the complexity and time it takes for users to build datasets, we investigated different numbers of training samples to validate the system's performance. Since ten walking trials were collected for each participant, we varied the total number of trials from four to ten and evaluated the system's performance for different user numbers. The trials used were randomly selected from the initial dataset, and the evaluation method was still leave-one-out cross validation. Fig. 11 presents the results of using different numbers of trials for identification training and testing with different user numbers. When more users must be identified, a small number of trials cannot deliver the required recognition performance, and the system's performance improves as more samples are used for training. Nevertheless, for fewer users, a few trials are enough to train the classifier and identify the user with an acceptable result. For example, for only four users, a six-trial dataset with leave-one-out cross validation shows 96% accuracy. Since the time-series data used in this system are 1-D, their distribution is simple, enabling the system to train the classifier with fewer samples. At the user terminal, it is convenient to collect the users' dataset in advance. Especially for smaller user groups, users usually need to record only five to six walks (less than 1 min) to complete the training of the system and obtain 91% usage performance.

A. Performance and Characteristics
The CMC curve and confusion matrix in Figs. 12 and 13 show that the accuracy is significantly improved between ranks 2 and 3. Even if identification is wrong, the correct user is generally listed as the second/third possibility. Therefore, a simple improvement in the dataset's quality may solve the problem. For example, we can increase the number of datasets per user or the sensor's sampling rate. Table IV shows a comparison among the related nonvisual and noncontact gait-based identification systems. Mainstream systems still focus on radar and Wi-Fi-based signal utilization. By detecting the variations of the reflected wireless signal, machine learning methods were employed to complete the feature representation and the identification. The performance was commonly reported between 80% and 95%. Also, for wearable-based gait recognition, such systems utilized multiple inertial measurement unit (IMU) sensors or pressure sensors to capture the kinematic characteristics of human gait for identification [5], [28]. Such systems have already presented

Compared with other solutions for nonvisual gait-based identification, the proposed method makes a unique contribution through its novel detection method and its coverage of more edge cases. Radar and Wi-Fi-based methods have attracted close attention as the most popular schemes, because the distribution of reflected wireless signals is wide. Information from the various human body parts can be used, which is better for building a classification system. However, the abundant information also contains redundant features that need to be distinguished. Producing the abundant reflected wireless signal generally requires relatively high power consumption at the signal transmitter (for example, a Wi-Fi transmitter typically consumes more than 5 W of power [33]), which challenges more portable and flexible deployment when the energy supply is considered.
Though advanced radar sensors have gradually become tiny and low power, their high cost still limits more pervasive applications (e.g., the IVS-979 radar sensor in [24] costs more than U.S. $300). In contrast, this article explored ultralow-dimensional information to build the identification system and demonstrated its feasibility to an extent. The ToF sensor used in this article typically draws current at the µA level [22], consumes around 1 mW of power, and costs less than U.S. $10. It can be combined with any low-power-consumption microcontroller and integrated into an Internet of Things (IoT) scheme. It can be distributed more broadly than other wireless signal solutions, especially in more edge cases, such as budget-limited and power-constrained cases.
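As background for the CMC curve mentioned at the start of this subsection, the sketch below shows how a cumulative match characteristic can be computed from ranked candidate lists. The function name `cmc_curve` and the toy rankings are hypothetical and unrelated to the article's data.

```python
def cmc_curve(ranked_predictions, true_labels, max_rank):
    """Cumulative match characteristic.

    ranked_predictions[i] lists candidate labels for sample i, best first.
    Entry r-1 of the result is the fraction of samples whose true label
    appears within the top r candidates.
    """
    hits = [0] * max_rank
    for ranking, truth in zip(ranked_predictions, true_labels):
        top = ranking[:max_rank]
        if truth in top:
            first = top.index(truth)
            # A match at position `first` counts for that rank and all later ones.
            for r in range(first, max_rank):
                hits[r] += 1
    return [h / len(true_labels) for h in hits]


# Toy example with three candidate users and four test samples.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["C", "A", "B"], ["A", "C", "B"]]
truth = ["A", "A", "B", "C"]
print(cmc_curve(rankings, truth, max_rank=3))
```

With a probabilistic classifier, the ranked list for each sample could be obtained by sorting the users by predicted class probability; the rank-1 value of the curve is then the ordinary identification accuracy.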

B. Sensor's Detection Range
The proposed system uses a ToF distance sensor to detect changes in the human-system distance caused by gait. The field of view (FoV) of the sensor used in the study is 27°. Typically, the measurable region of the sensor is cone-shaped, so the longer the distance, the wider the covered area. However, humans sometimes make other body movements while walking that affect the measurements. For example, if the FoV is too large, the arm swing enters it, and the measurement data may be contaminated by distances measured to the arm. If the FoV is too small, users need to walk straighter and more accurately when their gait is measured. Although arm swing is a part of gait, it is better to separate it when measuring the distance from the center of the body. Therefore, this trade-off needs to be considered when adjusting the FoV and distance range of the sensor. Besides, in the experiment, participants did not change their clothing, so we did not discuss the effect of clothing on recognition accuracy. Since the distance sensor does not recognize the subject's silhouette, it should be less affected by clothing changes.
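The FoV trade-off can be made concrete with simple geometry. The sketch below assumes the 27° figure is the full apex angle of a symmetric cone, which the article does not state explicitly; the function name `fov_width` is hypothetical.

```python
import math


def fov_width(distance_m, fov_deg=27.0):
    """Width of the region covered by a symmetric cone at a given distance.

    Assumes fov_deg is the full apex angle, so the half-angle is fov_deg / 2.
    """
    return 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)


for d in (0.5, 1.0, 2.0, 3.0):
    print(f"{d:.1f} m -> {fov_width(d):.2f} m wide")
```

Under this assumption the covered width is about 1.44 m at the 3 m starting position, wide enough to include swinging arms, and narrows to under half a metre within 1 m of the sensor.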

C. Sensor Placement and Application Scenarios
In this study, we placed a ToF sensor in front of a walking person. In a practical application, the mounting location is fixed where the user walks directly toward a wall or a door. For example, the sensor can be placed on a door surface at the end of a path along which the user can walk a few meters. The scenarios considered in this study were primarily apartments or environments with long corridors. The system is also suitable for identification within relatively small user groups, such as in the home, office, and laboratory. Due to its low power consumption, the system could be powered by general portable power sources and flexibly deployed.

VI. LIMITATION AND FUTURE WORK
Though this article has shown the feasibility of using 1-D distance data from a ToF sensor to recognize human identity from walking behavior, it still has several limitations. First, because only simple data were captured, the system focused only on the velocity change. Nonhuman movement could therefore lead to false positives. The next step could be to improve the system's robustness to fit more cluttered environments. As the system uses a ToF sensor, it is normally sensitive to ambient light to an extent. This characteristic could also be exploited to sense how the lighting (such as indoor light) changes with the object's movement and to gain more insight into the detected objects.
In addition, the system currently tracks only one person's walking behavior. However, in broader cases, users may walk together, and distance data alone make it difficult to identify multiple people [37]. In future work, coordinated designs using multiple distance sensors can be pursued to adapt to a richer set of application scenarios. For example, human walking can be measured from the side, which increases the flexibility of mounting locations.
Regarding the identification algorithm, we have so far tested several methods for handling time-series-based identification. However, these methods are all based on handcrafted features. We did not implement any deep learning method (such as a 1-D convolutional neural network (CNN) [34]). The main reason is that the strengths of the proposed system are its lightweight design, small computational requirements, and low cost. We expected the user's burden to be manageable, which means that only a few data trials are needed from the user to form the user's dataset before deployment. In such a situation, only five to nine trials are required, and it is hard for a deep learning method to learn in-depth features from such a simple time series. However, we believe that more complex situations could be considered and may call for a more advanced algorithm to process the data. The next step could also be to develop a high-performance algorithm for complicated situations and the coordination of multiple sensors.

VII. CONCLUSION
In this study, we proposed a privacy-aware system to identify individuals with ultralow-dimensional data. The system is assumed to be placed in front of the user. It uses a ToF sensor to detect the variation in distance between the user and the system during the walking process. By extracting features from human gait patterns, we evaluated the system by leave-one-out cross validation, and the average identification accuracy was 91.05% for ten users. It provides a lightweight, low-cost, and easily distributed user identification system for external facilities. Although identification systems using nonvisual and noncontact signals have gradually increased, the single-dimensional distance data used in this article offer advantages in the characteristics, power consumption, and cost of the system, even if the recognition performance cannot directly beat other systems of the same type. This system is extremely efficient in both budget and power consumption and can be used as a boundary condition system (with low data dimensions) to provide a coupling design with other systems to achieve better usage.