A Gesture Air-Writing Tracking Method that Uses 24 GHz SIMO Radar SoC

Gesture air-writing is an advanced touchless interaction method that can replace traditional keyboards, touch screens and other input devices. Due to its low power consumption, noncontact detection and independence from light conditions, millimeter-wave radar has become a valuable gesture air-writing solution. In this paper, we propose a prototype of a gesture air-writing tracking system based on a 24-GHz frequency-modulated continuous-wave (FMCW) radar system-on-chip (SoC). The transmitted chirp signal of this radar chip covers up to a 4-GHz bandwidth, which provides sufficient range resolution to track hand gestures. With single-input, multiple-output (SIMO) antennas, the air-writing symbols can be reconstructed in an observation plane. A system design and a signal processing algorithm for gesture air-writing applications are proposed for the prototype system. To test the performance of the proposed method, experiments are carried out on five simple gestures and on the tracking of air-written numbers and letters. The experimental results verify the efficiency and accuracy of the proposed method.


I. INTRODUCTION
Gesture recognition, as a mainstream future method of human-computer interaction, has been widely studied and applied in recent years. Among the relevant studies, gesture recognition based on visual sensors [1]-[5] is a popular method that achieves a high recognition rate. However, vision sensors often have the following problems: susceptibility to light, hidden privacy risks, and inability to penetrate media such as plastic and foam. Other sensors, such as accelerometers and gyroscopes [6]-[8], must be worn by the user, which makes them impractical for everyday gesture recognition.
Compared with the traditional sensors mentioned above, millimeter-wave radar [9]-[19] offers low power consumption, noncontact detection and independence from light conditions, which has led to a wide range of applications in gesture recognition. With millimeter-wave radar, the range, velocity, angle and micro-Doppler information of gestures can be obtained in real time. One example is Google's Soli project [13], which is based on a 60-GHz radar carrier frequency signal. The Soli sensor is very small and can recognize delicate gestures accurately. In addition, radar-based dynamic gesture recognition can be easily combined with deep learning techniques [14]-[19].

The associate editor coordinating the review of this manuscript and approving it for publication was Guolong Cui.
Although various human gestures are involved in gesture recognition, air-writing is one of the most challenging and useful applications. It is believed that writing in the air instead of typing will become a more popular interaction method in the future. Air-writing refers to writing characters or words in free space by hand or finger movements [20], [21]. Users write virtual characters by gestures in free space, and these characters are then tracked and recognized by smart sensors. Therefore, the position of the gesture in free space must be tracked in real time. Two main problems are of concern in air-writing: how to effectively obtain the trajectory of gesture movements in free space, and how to recognize the intended characters from the acquired information.
In references [22]-[28], several different air-writing methods are introduced, including camera-based [22], [23], RFID-based [24]-[26], WiFi-based [27] and motion-sensor-based [28] methods. To compare these implementations, their measuring principles and limitations are listed in Table 1. Compared with the above methods, millimeter-wave radar is not affected by light, can measure the trajectory of moving targets, and does not require the user to wear any equipment. Therefore, millimeter-wave radar is used for air-writing in this paper. In [29], [30], a radar-based triangulation algorithm is used to track and generate air-writing characters in three-dimensional free space. However, that approach needs three radar sensors, each with a single transmitter and a single receiver. In air-writing, two dimensions are the minimum needed to draw letters. Therefore, to simplify the problem, we reduce the 3D space to a 2D plane and track the letters on that plane. In addition, a SIMO mm-wave radar system with one transmitting channel and two receiving channels is introduced to achieve air-writing with lower system complexity. Compared with references [29], [30], the transmission channels are reduced from 3 to 1 and the reception channels from 3 to 2, so the total number of channels is halved from 6 to 3. In this way, hardware resources are saved, and the computational complexity is reduced.
Based on this approach, the following three conditions were set for the reconstruction of gesture air-writing trajectories. First, the reconstruction of gesture characters should take place in a planar area. Here, we set up a rigorous experimental environment in which the gesture air-writing is limited to a small area of 20 cm in length and 15 cm in width to verify the effectiveness of the algorithm. The area is limited because our target application is short-distance gesture sensing for smart devices. The schematic diagram of the gesture air-writing scenario is shown in Fig. 1. Second, the radar system should be able to identify the trajectory of moving targets in the plane with as little hardware as possible. In this paper, a gesture air-writing system based on a 24-GHz wideband FMCW radar chip is introduced. The swept frequency range of the radar chip covers 4 GHz. The radar chip has one transmit channel and two receive channels. Therefore, the Euclidean distance and planar angle can be measured. Third, the algorithm should be stable and efficient. By configuring the radar chip parameters and applying a series of algorithmic operations, an instantaneous range-Doppler heat map can be obtained, and the two receiving channels also provide the gesture angle information. Finally, by extracting features from the range-Doppler heat map and smoothing them with filters, this approach can reconstruct the letter symbols drawn in the air.
The structure of this paper is as follows. In section II, the system architecture, the signal model of the FMCW radar, the parameter design of the radar chip and the radar signal flow will be introduced. Section III introduces the gesture feature extraction, three smoothing algorithms and the character generation method. In section IV, experiments are conducted to verify the effectiveness of the radar systems and algorithms. Section V shows the results.

A. SYSTEM ARCHITECTURE
In contrast to traditional handwriting, when drawing characters in the air, people write in an imaginary area and try to complete each character with one stroke. In this way, the written content can serve as a means of human-computer interaction that is independent of traditional media. An air-writing system tracks the movement and trajectory of the hand and reconstructs the characters written in the air. Millimeter-wave radar, which is capable of measuring the range, velocity, and angle of the target, has great potential for this task. In this paper, our goal is to build a small millimeter-wave radar system to track and reconstruct characters written in the air.
To solve the problem, the radar system must meet the following conditions. First, gesture air-writing is a 2D trajectory on a plane, and thus, the radar system should be able to detect the range, velocity and angle of arrival. Second, the radar hardware should be as simple as possible to save resources and costs. Finally, the system algorithm should run in real time.

VOLUME 8, 2020

To meet the first and second requirements, a radar chip that operates from 23.5 GHz to 27.5 GHz with a configurable chirp duration and that has one transmitter and two receivers is used for gesture air-writing in this paper. The internal block diagram of the chip is shown in Fig. 2. The chip is mainly composed of one transmitter, two receivers, a phase-locked loop (PLL), a pattern generator, analog-to-digital converters (ADCs), and so on. By configuring the internal chip registers, the chip transmits the FMCW signal, and the echo signal is then received by the two receiving antennas. In each receiving channel, the echo returned from the target is mixed with a replica of the transmitted signal, and the resulting beat signal is low-pass filtered and then sampled by the ADC. Therefore, the radar chip meets the needs of range and angle measurement. To meet the third requirement, a hardware system prototype based on the millimeter-wave radar chip is developed. As shown in Fig. 3, the hardware prototype of the entire radar system is composed mainly of one transmitting antenna, two receiving antennas, a millimeter-wave radar chip, a digital interface, an FPGA, a USB 3.0 port and a host computer. The antenna gain is 6 dBi.
In the prototype of the radar system, first, the host computer configures and sends the register values to the millimeter-wave chip. Second, after receiving the register values, the millimeter-wave radar chip transmits the corresponding FMCW signal waveform, and the echo signals are received by the two receiving antennas. Finally, after the internal processing of the chip, the baseband signals obtained by the two receiving channels are transmitted to the FPGA through a digital interface, and the FPGA then sends the data to the computer through the USB 3.0 port. Based on this self-developed radar system, our algorithm and gesture air-writing can be implemented in real time.

B. SIGNAL MODEL OF FMCW RADAR
To understand the principle of measuring the range, velocity and angle in our radar chip, the basic signal model of FMCW radar will be introduced.
By configuring the registers of the millimeter-wave radar chip, the parameters of the transmitted FMCW signal can be set, such as the bandwidth B, the chirp duration T, and the chirp repetition period T_c. The model of the chirp signal in our radar system is shown in Fig. 4. The transmitted chirp signal is given by (1), where f_0 is the center frequency of the chirp signal, T is the duration of the chirp signal, B is the bandwidth, and t is the fast time. In Fig. 4, τ_0 is the delay between the transmitted and received signals of the target. Suppose that the range from a target to the radar is R and that its velocity is v; then the delay is τ_0 = 2(R + vt)/c, where c is the speed of light.

From [31], the range-Doppler signal model of the FMCW radar is the zero-padded two-dimensional FFT S_2D of the beat signal, where N is the number of chirps in a frame, T_A is the ADC sampling interval, M is the number of points sampled in the fast time domain, and N_Z and M_Z are the dimensions of S_2D after zero-padding. The peak of the two-dimensional FFT lies at the frequencies set by the range and velocity of the target. Therefore, the velocity and range information of the moving target can be obtained using this 2D-FFT model.

In our radar system, one transmitting antenna and two receiving antennas are used. The geometric model of the range detection for a single target is shown in Fig. 5.

Fig. 5. The received signal model of two antennas in our radar system. Here, d is the distance between the two antennas, and θ is the angle of arrival.

After the 2D-FFT of the chirp signals, the phase difference between the two received channels is caused only by the antenna spacing and the angle of arrival. The phase difference can be written as

$$\Delta\varphi = \frac{2\pi d \sin\theta}{\lambda}$$

where d is the distance between the two antennas, θ is the angle of arrival, and λ is the wavelength at the center frequency.
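As an illustration of the angle measurement, the following minimal sketch inverts the two-antenna phase relation. It assumes the usual form Δφ = 2πd sin θ/λ, the 6-mm antenna spacing of the prototype and a 25.5-GHz center frequency (the middle of the 23.5 to 27.5 GHz sweep). The function name `angle_of_arrival` and the rounded speed of light are illustrative choices and are not taken from the prototype software.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded, assumption)


def angle_of_arrival(delta_phi, d, wavelength):
    """Invert delta_phi = 2*pi*d*sin(theta)/lambda and return theta in radians."""
    sin_theta = delta_phi * wavelength / (2.0 * math.pi * d)
    # Clamp against small numerical overshoot before taking the arcsine.
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.asin(sin_theta)


wavelength = C / 25.5e9  # assumed center-frequency wavelength, about 11.8 mm
d = 6e-3                 # antenna spacing from the text

# Round trip: a target at 20 degrees produces a phase difference that maps back to 20 degrees.
true_theta = math.radians(20.0)
delta_phi = 2.0 * math.pi * d * math.sin(true_theta) / wavelength
recovered_deg = math.degrees(angle_of_arrival(delta_phi, d, wavelength))
```

The mapping is unambiguous only while the measured phase difference stays within ±π, which is why the antenna spacing limits the usable angular range.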

C. RADAR PARAMETER DESIGN
In our radar system, the chirp can sweep from 23.5 GHz to 27.5 GHz, which covers a 4-GHz bandwidth. The A/D sampling rate of the baseband signal is 2 MSa/s, each sample is 16 bits wide, and the slope of the frequency modulation is 6.25 MHz/µs. The spacing between the two receiving antennas is 6 mm. With these basic parameters, the structure of the radar chip and the signal model in hand, the radar parameters for gesture air-writing can be designed. The basic parameters of the radar system are defined as follows.

The range resolution is determined by the bandwidth:

$$R_{res} = \frac{c}{2B}$$

where c is the speed of light, B is the bandwidth of the chirp signal, and R_res is the range resolution. Regardless of the transmit power, the maximum detection range of the radar is

$$R_{max} = \frac{c \cdot \mathrm{AD\_Rate}}{2 \cdot \mathrm{SlopeRate}}$$

where AD_Rate is the A/D sampling rate in the radar chip, SlopeRate is the slope of the frequency modulation, and R_max is the maximum detection range of the radar system without considering the transmission power. The maximum measurable velocity is determined by the interval between two chirp signals:

$$v_{max} = \frac{\lambda}{4 T_c}$$

where λ is the wavelength at the center frequency, T_c is the interval between two chirp signals, and v_max is the maximum measurable velocity of the radar system. The velocity resolution of the radar is inversely proportional to the frame time and is given by

$$v_{res} = \frac{\lambda}{2 N T_c}$$

where N is the number of FFT points in the slow time domain, NT_c is the frame time, and v_res is the velocity resolution of the radar system. As shown in Fig. 5(b), the antenna spacing is d; together with the wavelength λ, it determines the maximum unambiguous angle θ_max. The basic parameters of the radar system for gesture recognition are designed according to these relations.
To increase the range of the swept frequency as much as possible, the duration of the chirp signal is set to 640 µs. In this configuration, the chirp signal covers a 4-GHz swept bandwidth. In addition, at a 2 MSa/s A/D sampling rate, 1280 data points are obtained from the baseband signal, and each data point is 16 bits wide. To simplify the operation, only the first 1024 of these 1280 data points are used to represent the baseband signal. Therefore, the effective sweep bandwidth used in the chirp signal is 3.2 GHz, and the range resolution is 4.69 cm.
The maximum velocity of the dynamic gestures considered here is no more than 3 m/s. Thus, the slow time interval is set to 1 ms, and the maximum measurable velocity is 2.94 m/s. The frame time is set to 32 ms to obtain an instantaneous range-Doppler map; thus, the velocity resolution of the radar is 0.18 m/s, and 31.25 range-Doppler maps are obtained per second.
In the system design, the antenna spacing is 6 mm, and thus, the maximum measurement angle is ±67°. The calculated radar parameters are listed in Table 2. Under these radar system parameters, the system has a good range and velocity resolution and can capture the instantaneous motion of hand gestures.
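The figures quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below assumes the standard FMCW relations (range resolution c/2B, maximum range c·AD_Rate/(2·SlopeRate), maximum velocity λ/(4T_c), velocity resolution λ/(2NT_c)), a rounded speed of light of 3×10^8 m/s and a wavelength taken at the 25.5-GHz sweep center. All variable names are illustrative.

```python
C = 3.0e8  # speed of light in m/s (rounded, assumption)

f_start, f_stop = 23.5e9, 27.5e9  # sweep limits in Hz
chirp_duration = 640e-6           # chirp duration in s
ad_rate = 2e6                     # A/D sampling rate in Sa/s
n_used = 1024                     # samples kept out of the 1280 per chirp
t_c = 1e-3                        # chirp repetition interval in s
n_chirps = 32                     # chirps per frame

slope = (f_stop - f_start) / chirp_duration  # 6.25 MHz/us, expressed in Hz/s
b_eff = slope * n_used / ad_rate             # effective bandwidth, 3.2 GHz
wavelength = C / ((f_start + f_stop) / 2.0)  # wavelength at the sweep center

r_res = C / (2.0 * b_eff)                    # range resolution in m
r_max = C * ad_rate / (2.0 * slope)          # maximum range in m
v_max = wavelength / (4.0 * t_c)             # maximum velocity in m/s
v_res = wavelength / (2.0 * n_chirps * t_c)  # velocity resolution in m/s
frame_rate = 1.0 / (n_chirps * t_c)          # range-Doppler maps per second

print(f"R_res = {r_res * 100:.2f} cm, R_max = {r_max:.0f} m")
print(f"v_max = {v_max:.2f} m/s, v_res = {v_res:.2f} m/s, {frame_rate:.2f} maps/s")
```

Under these assumptions the script reproduces the 4.69 cm, 48 m, 2.94 m/s, 0.18 m/s and 31.25 maps/s values given in the text.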

D. SIGNAL FLOW
In our configuration, the radar chip achieves excellent range and velocity resolution and captures the instantaneous motion characteristics of the gestures. However, without considering the transmission power, the maximum detection range in this chip configuration is 48 m, and the baseband digital signal is composed of 1024 data points. For close-range gestures, this wastes resources and computation. Therefore, from the perspective of signal processing, the signal flow should be optimized.
In our gesture air-writing, the effective range is set to within 1 m. Therefore, to reduce the hardware cost, the 1024-point input baseband digital signal is filtered and downsampled to 64 points, which corresponds to a downsampling rate of 16. Downsampling does not change the range resolution, which remains 4.69 cm; it only reduces the maximum measurement range, which is still sufficient for close-range gesture recognition. After downsampling, the maximum detection range is 3 m without considering the transmit power.
In this paper, a cascaded integrator-comb (CIC) filter [32] and a CIC compensation filter are used to decimate the baseband digital signals and save computing resources. Compared with an FIR filter, the CIC filter does not require multipliers, and it can therefore save a substantial amount of computing resources.
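The multiplier-free structure of the CIC decimator can be sketched as follows, using the five stages and the downsampling rate of 16 adopted in this design. This is a minimal NumPy illustration and not the FPGA implementation: the function name `cic_decimate`, the differential delay of one, and the choice of decimation phase are assumptions, and the 40th-order compensation FIR is omitted.

```python
import numpy as np


def cic_decimate(x, rate=16, stages=5):
    """Decimate integer samples with a CIC filter (integrators, downsample, combs)."""
    y = np.asarray(x, dtype=np.int64)
    # Integrator section at the input rate: each stage is a running sum.
    for _ in range(stages):
        y = np.cumsum(y)
    # Keep one sample out of every `rate` samples.
    y = y[rate - 1::rate]
    # Comb section at the output rate: each stage is a first difference.
    for _ in range(stages):
        y = np.diff(y, prepend=0)
    return y


# A constant input of 1024 samples yields 64 output samples; after the
# start-up transient the output equals the DC gain rate**stages = 16**5.
out = cic_decimate(np.ones(1024, dtype=np.int64))
print(len(out), out[-1])
```

Only additions and subtractions appear in the filter itself. The large DC gain would normally be removed by a bit shift, and the passband droop is what the compensation filter corrects.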
In our design, the integrator section and the comb section each have 5 stages, and the downsampling rate is 16. The structure of the CIC filter is shown in Fig. 6(a), and the corresponding filter magnitude response is shown in Fig. 6(b). The CIC filter frequency response does not have a wide, flat passband. To overcome the magnitude droop, an FIR filter whose magnitude response is the inverse of that of the CIC filter is applied as a frequency response correction. The order of this FIR filter is 40. The magnitude response of the CIC compensation filter and the total response are shown in Fig. 6(c). As seen from the figure, the passband of the total response becomes flat, and the stopband of the total response is below −40 dB. After CIC filtering and CIC compensation filtering, each chirp signal consists of 64 data points.

In our radar system, zero-padding is used for the range FFT to refine the spectrum. First, a Hamming window is applied to each 64-point baseband digital signal, and then a 256-point range FFT is performed on the zero-padded data. Zero-padding does not change the range resolution of the radar, but it samples the spectrum on a finer grid and makes the range estimate of a single target more accurate. Thus, the range bin spacing for a single target is refined from 4.69 cm to 1.17 cm. The parameters in this section are also listed in Table 2.

In actual measurements, because of the echo signals of static targets and the strong signal crosstalk between the antennas, a clutter removal module is needed after the range FFT. In our paper, the recursive filter in Fig. 7 is used, with the coefficient α set to 0.9. Every 32 chirp signals in our radar system form a frame, and the system then performs a zero-padded 64-point Doppler FFT in the slow time domain. In this way, real-time range-Doppler results are acquired, and the velocity bin spacing for a single target is refined from 0.18 m/s to 0.09 m/s. Similar to the results of Google Soli in reference [13], this signal processing flow yields a dynamic range-Doppler map that changes with time and provides the range, velocity, time-frequency and other information on the gestures. It is these dynamic characteristics that enable the radar chip to recognize dynamic gestures. The fast and slow time processing results, together with their history, are shown in Fig. 8.

Based on the above derivation, the basic signal processing flow can be summarized. A schematic diagram of the signal processing is shown in Fig. 9. First, the raw ADC data of the two receiving channels from the radar chip are filtered by the CIC filter and the CIC compensation filter, so that the 1024-point data are downsampled to 64 points. Downsampling reduces the number of data operations without changing the radar range resolution.
Second, a zero-padded 256-point range FFT in the fast time domain and the clutter removal module are applied in each channel, which isolates the moving target.
Third, every 32 chirp signals are combined into a frame. To refine the spectrum, a zero-padded 64-point Doppler FFT is performed in the slow time domain.
Finally, the range-Doppler maps from the two channels are conjugate multiplied, and the gesture features are extracted from the RD maps.
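The four steps above can be sketched for one frame as follows, starting from the 64-point decimated chirps. This is an illustrative NumPy sketch under stated assumptions. The clutter filter form (a recursively averaged background, initialized with the first chirp and subtracted along slow time) is assumed, because the filter equation of Fig. 7 is not reproduced in this text; only α = 0.9 is taken from it. The synthetic target, the clutter term, the 0.5-rad inter-channel phase and the order of the conjugate product are all hypothetical.

```python
import numpy as np

ALPHA = 0.9  # clutter filter coefficient quoted in the text


def remove_clutter(range_profiles, alpha=ALPHA):
    """Subtract a recursively averaged background along slow time (assumed form)."""
    background = range_profiles[0].copy()  # assumption: start from the first chirp
    out = np.empty_like(range_profiles)
    for n, profile in enumerate(range_profiles):
        background = alpha * background + (1.0 - alpha) * profile
        out[n] = profile - background
    return out


def range_doppler_map(frame, n_range=256, n_doppler=64):
    """frame: complex array of shape (32 chirps, 64 decimated samples)."""
    windowed = frame * np.hamming(frame.shape[1])
    profiles = np.fft.fft(windowed, n=n_range, axis=1)  # zero-padded range FFT
    moving = remove_clutter(profiles)
    rd = np.fft.fft(moving, n=n_doppler, axis=0)        # zero-padded Doppler FFT
    return np.fft.fftshift(rd, axes=0)                  # zero velocity at row 32


# Synthetic frame: a moving target at range bin 40 and Doppler bin 16,
# plus a five times stronger static reflector at range bin 10.
m = np.arange(64)
n = np.arange(32)[:, None]
target = np.exp(2j * np.pi * (40 * m / 256 + 16 * n / 64))
clutter = 5.0 * np.exp(2j * np.pi * 10 * m / 256) * np.ones((32, 1))
frame_rx1 = target + clutter
frame_rx2 = frame_rx1 * np.exp(1j * 0.5)  # hypothetical inter-channel phase

rd1 = range_doppler_map(frame_rx1)
rd2 = range_doppler_map(frame_rx2)
cross = rd2 * np.conj(rd1)  # conjugate product of the two channels
peak = np.unravel_index(np.argmax(np.abs(cross)), cross.shape)
peak_phase = np.angle(cross[peak])
```

In this sketch the static reflector is cancelled by the background subtraction, the peak of the conjugate product falls on the moving target, and its phase returns the inter-channel phase difference.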

III. TARGET DETECTION AND TRACKING
As a device for moving target detection, radar can detect the range, angle, and velocity of the target. Therefore, using the information detected by the radar, the trajectory of the target can be tracked and generated. A schematic diagram of the radar moving target detection is shown in Fig. 10. As the figure shows, the coordinate system of the radar measurements differs from the coordinate system in which the trajectory is tracked, so a coordinate conversion is required in the actual calculations. In section II, real-time range-Doppler maps are obtained after a series of signal processing steps. To reduce the computational overhead, only some features are extracted from the range-Doppler results in this section. The radar system has a maximum measurement angle of ±67° and only two receiving antennas. Under these two restrictions, the measured angle wraps around at larger angles, and it is also easily disturbed by noise. To solve these two problems, the extracted features must be filtered and smoothed before the gestures or handwritten letters in the air are generated. Next, the feature extraction method and the smoothing algorithms in our radar system are introduced.
In addition, the constant false alarm rate (CFAR) algorithm is used to determine the start and end times of the effective motion. The shape of the characters drawn in the air can be reconstructed from the smoothed features. This has two advantages: first, the characters drawn in the air can be observed intuitively; second, it is convenient for future recognition work.

A. INFORMATION DERIVATION
After a series of signal processing steps in the fast time and slow time domain, the real-time range-Doppler map can be obtained. To generate the air-writing symbols, some features are extracted from the range-Doppler map.
To reduce the number of calculations in the system, only the values at the strongest reflection point in the RD map are selected during feature extraction. Since the effective range of the gesture is within 100 cm, the velocity V_i and range R_i that correspond to the peak point can be found in the range-Doppler map.
The phase value φ_i and the amplitude value Amp_i of the peak point are extracted as well. Because only the feature values of the strongest scattering point in the RD map are extracted, they are easily disturbed by noise. Therefore, a sliding average window is used to smooth the features. To determine whether a hand movement is effective, the CFAR algorithm is used in this paper. The amplitude value Amp_i of the peak point in the RD map is used to detect the effective action area: a moving average of the amplitude is computed with a smoothing factor µ, and when the log amplitude Amp_Log of the gesture action is greater than the threshold value, the action is determined to be effective.
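A minimal sketch of the two smoothing operations described above is given below. The exact equations are not reproduced in this text, so the forms used here are assumptions: a causal sliding average for the features, and an exponential moving average of the peak amplitude, converted to decibels and compared with a fixed threshold, for the activity detection. The names `sliding_average` and `detect_activity`, the value µ = 0.5 and the 20-dB threshold are hypothetical.

```python
import numpy as np


def sliding_average(values, window=10):
    """Causal sliding average; the first samples average over what is available."""
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    for i in range(len(values)):
        out[i] = values[max(0, i - window + 1): i + 1].mean()
    return out


def detect_activity(amplitudes, mu=0.5, threshold_db=20.0):
    """Flag frames whose smoothed log amplitude exceeds the threshold (assumed form)."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    averaged = np.empty_like(amplitudes)
    state = amplitudes[0]
    for i, amp in enumerate(amplitudes):
        # Exponential moving average with smoothing factor mu.
        state = (1.0 - mu) * state + mu * amp
        averaged[i] = state
    amp_log = 20.0 * np.log10(averaged)
    return amp_log > threshold_db


# Hypothetical peak amplitudes: idle, a gesture lasting 30 frames, idle again.
amps = np.concatenate([np.ones(20), 100.0 * np.ones(30), np.ones(20)])
active = detect_activity(amps)
print(np.flatnonzero(active)[[0, -1]])  # first and last active frame
```

Because the amplitude is averaged before thresholding, the detected interval ends a few frames after the gesture stops; a smaller µ lengthens this tail but suppresses isolated noise spikes.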

B. RAUCH-TUNG-STRIEBEL SMOOTHER METHOD
The Rauch-Tung-Striebel smoother method [33] is used in this paper to perform denoising on the input feature values. A schematic diagram for the data processing is shown in Fig. 11. First, the extracted range and phase features are smoothed through a sliding average window, and then, the feature values are passed through a Kalman filter. In addition, the extracted amplitude features are used to define the effective action area through the CFAR algorithm. After determining the effective motion area, the RTS smoothing algorithm is used to denoise, and finally, in the character generation module, the range and phase features are transformed into the X/Y coordinate axis to obtain the character shape.
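In the Kalman filter [33], the dynamics and measurement models are

$$x_k = A_{k-1} x_{k-1} + q_{k-1}, \qquad y_k = H_k x_k + r_k$$

where x_k is the state, y_k is the measurement, q_{k−1} ∼ N(0, Q_{k−1}) is the process noise, r_k ∼ N(0, R_k) is the measurement noise, and the prior distribution is Gaussian, x_0 ∼ N(m_0, P_0). The matrix A_{k−1} is the transition matrix of the dynamic model, and H_k is the measurement model matrix.
Following [33], the prediction step is

$$m_k^- = A_{k-1} m_{k-1}, \qquad P_k^- = A_{k-1} P_{k-1} A_{k-1}^T + Q_{k-1}$$

and the update step is

$$S_k = H_k P_k^- H_k^T + R_k, \qquad K_k = P_k^- H_k^T S_k^{-1}$$

$$m_k = m_k^- + K_k \left( y_k - H_k m_k^- \right), \qquad P_k = P_k^- - K_k S_k K_k^T$$

The recursion is started from the prior mean m_0 and covariance P_0.
From the Kalman filter, the RTS smoothing algorithm is given as

$$m_{k+1}^- = A_k m_k, \qquad P_{k+1}^- = A_k P_k A_k^T + Q_k, \qquad G_k = P_k A_k^T \left[ P_{k+1}^- \right]^{-1}$$

$$m_k^s = m_k + G_k \left( m_{k+1}^s - m_{k+1}^- \right), \qquad P_k^s = P_k + G_k \left( P_{k+1}^s - P_{k+1}^- \right) G_k^T$$

where m_k and P_k are the mean and covariance computed by the Kalman filter. The recursion is started from the last time step, where the smoothed mean and covariance equal the filtered ones.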
To smooth the gesture feature values, our model uses A = [1, 1; 0, 1] and H = [1, 0]. After the range and phase features are smoothed by the RTS algorithm, they are transformed to the XY coordinate system. The calculation formulas are given in (22), where R̂(k) and φ̂(k) are the smoothed range value and phase value at frame k, λ is the wavelength, and d is the antenna spacing.
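The smoothing and coordinate conversion can be sketched as follows, using A = [1, 1; 0, 1] and H = [1, 0] from the text and the textbook Kalman and RTS recursions. The noise levels `q` and `r`, the prior, and the function names are assumptions. The conversion in `to_xy` is also an assumed form (angle from θ = arcsin(λφ/(2πd)), then X = R sin θ and Y = R cos θ), because the formulas of (22) are not reproduced here.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])  # value-plus-slope model from the text
H = np.array([[1.0, 0.0]])              # only the feature value is measured


def rts_smooth(measurements, q=1e-3, r=1e-1):
    """Kalman filter followed by the RTS backward pass; q and r are assumed."""
    Q, R = q * np.eye(2), np.array([[r]])
    n = len(measurements)
    m = np.array([measurements[0], 0.0])  # assumed prior mean
    P = np.eye(2)                         # assumed prior covariance
    m_pred, P_pred = np.zeros((n, 2)), np.zeros((n, 2, 2))
    m_filt, P_filt = np.zeros((n, 2)), np.zeros((n, 2, 2))
    for k, y in enumerate(measurements):
        # Prediction step.
        m_pred[k] = A @ m
        P_pred[k] = A @ P @ A.T + Q
        # Update step.
        S = H @ P_pred[k] @ H.T + R
        K = P_pred[k] @ H.T @ np.linalg.inv(S)
        m = m_pred[k] + K @ (y - H @ m_pred[k])
        P = P_pred[k] - K @ S @ K.T
        m_filt[k], P_filt[k] = m, P
    # Backward pass, started from the last filtered estimate.
    m_s = m_filt.copy()
    for k in range(n - 2, -1, -1):
        G = P_filt[k] @ A.T @ np.linalg.inv(P_pred[k + 1])
        m_s[k] = m_filt[k] + G @ (m_s[k + 1] - m_pred[k + 1])
    return m_s[:, 0]


def to_xy(range_m, phase_rad, wavelength, d):
    """Convert smoothed range and phase to plane coordinates (assumed form)."""
    sin_theta = np.clip(phase_rad * wavelength / (2.0 * np.pi * d), -1.0, 1.0)
    return range_m * sin_theta, range_m * np.sqrt(1.0 - sin_theta ** 2)


# Hypothetical noisy range track in metres, smoothed and checked against the truth.
truth = 0.5 + 0.002 * np.arange(100)
noisy = truth + 0.02 * np.random.default_rng(0).standard_normal(100)
smoothed = rts_smooth(noisy)
```

Unlike the forward-only Kalman filter, the backward pass uses later frames as well, so the whole stroke must be available before the smoothed trajectory can be produced.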

C. ALPHA-BETA FILTER METHOD
The alpha-beta filter is often used in radar systems as another effective smoothing method. Therefore, the filtering performance of the RTS smoothing algorithm and that of the alpha-beta filter are compared. The diagram of the data processing with the alpha-beta filter is shown in Fig. 12. The extracted range and phase features are again passed through a sliding average window, and then the feature values are passed through an alpha-beta filter. In addition, the extracted amplitude features are used to define the effective action area through the CFAR algorithm. After the effective motion area is determined, the character symbol is generated. In the alpha-beta filter, the prediction equations are as follows:

$$\hat{X}(k|k-1) = \hat{X}(k-1) + T\,\hat{V}(k-1), \qquad \hat{V}(k|k-1) = \hat{V}(k-1)$$

where T is the measurement update interval, X̂(k|k−1) and X̂(k−1) are the predicted signal value at frame k and the smoothed signal value at frame k−1, respectively, and V̂(k|k−1) and V̂(k−1) are the predicted feature slope at frame k and the smoothed feature slope at frame k−1, respectively. The update equations are as follows:

$$\hat{X}(k) = \hat{X}(k|k-1) + \alpha \left[ Z(k) - \hat{X}(k|k-1) \right], \qquad \hat{V}(k) = \hat{V}(k|k-1) + \frac{\beta}{T} \left[ Z(k) - \hat{X}(k|k-1) \right]$$

where Z(k) is the measured value of the feature at frame k. The values of the correction gains α and β are chosen empirically, and in our case, they are set to α = 0.3 and β = 0.12.
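For comparison with the RTS smoother, the alpha-beta recursion is short enough to write out in full. The sketch below uses the standard predict and update form with the gains α = 0.3 and β = 0.12 quoted above. The initialization on the first measurement with zero slope and the unit update interval (one frame) are assumptions, and the function name is illustrative.

```python
def alpha_beta_filter(measurements, alpha=0.3, beta=0.12, T=1.0):
    """Smooth a feature sequence with a standard alpha-beta filter."""
    x_hat = measurements[0]  # assumption: start on the first measurement
    v_hat = 0.0              # assumption: zero initial slope
    smoothed = [x_hat]
    for z in measurements[1:]:
        # Prediction: extrapolate the value and keep the slope.
        x_pred = x_hat + T * v_hat
        v_pred = v_hat
        # Update: correct both with the measurement residual.
        residual = z - x_pred
        x_hat = x_pred + alpha * residual
        v_hat = v_pred + (beta / T) * residual
        smoothed.append(x_hat)
    return smoothed


# A steadily increasing feature is tracked without steady-state lag.
ramp = [float(k) for k in range(200)]
tracked = alpha_beta_filter(ramp)
print(round(tracked[-1], 6))
```

The filter needs only two state variables and fixed gains, which makes it cheaper than the Kalman-based smoother, at the cost of gains that do not adapt to the noise level.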

IV. EXPERIMENT SETUP
With the radar system, the signal processing flow and the data processing methods in place, experiments are set up in this section to verify the system performance and to detect gesture air-writing symbols in a planar area. The experimental procedures and test methods are introduced next.

A. SIMPLE GESTURES EXPERIMENT
The basic features of the range, velocity and phase values can be measured by our radar system. To verify the basic functions of the system, five simple gestures are tested and classified; the full hand is used to perform the gestures in this experiment. As shown in Fig. 13, five simple hand gestures are defined in the radar irradiation area. The first gesture is a clockwise rotation, the second is a counterclockwise rotation, the third is the palm moving forward and backward, the fourth is the hand sliding from left to right, and the fifth is the hand sliding from right to left. After the range, velocity, and phase features are extracted from the range-Doppler map of each frame, the features are smoothed by a sliding average window of length 10. Then, the smoothed results are stored in a 100 × 3 array as one sample. For each gesture, 300 samples are collected: 240 samples for the training set and 60 samples for the testing set. Finally, we use a long short-term memory (LSTM) network [34] to distinguish the five gestures. The network structure and parameters are shown in Table 3. This simple gesture recognition task verifies the basic functions of the system. We also compare the proposed LSTM network with two machine learning algorithms, the support vector machine (SVM) [35] and 5-nearest neighbor (5NN) [36]. We measure the performance of the proposed system using 5-fold cross-validation on the data set collected from the 5 simple gestures. The data set is divided into 5 subsets, four of which are used for training and the remaining one for testing.
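The 5-fold protocol can be illustrated with a small index-splitting sketch. It assumes 1500 samples in total (5 gestures × 300 samples) and a random, non-stratified shuffle; the function name `five_fold_indices` and the seed are hypothetical, and the classifier itself is not shown.

```python
import random


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, test


# 5 gestures x 300 samples: each split trains on 1200 samples and tests on 300.
splits = list(five_fold_indices(1500))
print([(len(train), len(test)) for train, test in splits])
```

Averaging the accuracy over the five splits gives the mean recognition rate, and the spread across splits gives the recognition variance.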

B. SMOOTHING ALGORITHMS COMPARISON
In references [29] and [30], the alpha-beta filter and the Kalman filter, respectively, are used to smooth the signal. In this paper, three different smoothing algorithms are considered: the alpha-beta filter, the Kalman filter and the RTS smoothing algorithm, which is based on the Kalman filter. To verify the effect of these three smoothing algorithms on the generated characters, the following experiment is set up.
The experimental scenario is shown in Fig. 14. A rectangular plastic plate with a length of 20 cm and a width of 15 cm is placed above the radar system, so that the radar can detect the range, velocity and phase features in the plane of the plate. A hand with an extended index finger is moved along the edge of the plastic plate, and the radar measures and tracks the shape of the edge. We generate the edge trajectory of the plate with the three smoothing algorithms and then compare the resulting trajectory shapes. The flow charts of the algorithms are shown in Fig. 11 and Fig. 12. To evaluate the smoothing algorithms independently, the sliding average window size is set to 1.

C. GESTURES AIR-WRITING EXPERIMENT
After finishing the plastic edge detection, 10 number symbols from 0 to 9 are drawn in the area of the plastic plate. The trajectories of these number symbols are tracked using the radar system. The experimental test scenario is shown in Fig. 14.
Since the length of the plastic is only 20 cm and the width is only 15 cm, drawing number symbols on the plastic plane is a very demanding experimental condition for radar systems.
Finally, the plastic plate is removed, and 9 alphabetic characters ('B', 'C', 'D', 'N', 'G', 'M', 'X', 'E' and 'F') are tracked and reconstructed in the 1-m planar area that the radar can illuminate. The corresponding results are also recorded by the radar system.
In total, 19 characters are generated using the radar system in this experiment. The purposes of the experiment are, first, to test the ability of the radar system to detect character trajectories in a planar area, and second, to verify the reliability of the algorithm.

V. RESULTS
In this section, the results of the three experiments in section IV are shown. The first part presents the results of simple gesture recognition, the second part compares the three smoothing algorithms, and the third part presents the results of gesture air-writing.

A. SIMPLE GESTURES RESULTS
The phase, range and velocity feature values of the five simple gestures are shown from Fig. 15 (a) to Fig. 15 (e). It can be seen from these results that the patterns are different when the features of each gesture are combined together. The five gestures are classified by the LSTM network, which is defined in Table 3. The confusion matrix results are shown in Fig. 16. In the confusion matrix, label 0 is the clockwise gesture, label 1 is the anticlockwise gesture, label 2 is the front-back gesture, label 3 is the left-right gesture, and label 4 is the right-left gesture. The train validation accuracy is 97.6%.
From the features of the different gestures, the motion trajectory of each gesture can be generated by using (22). The trajectories of the five gestures are shown in Fig. 17. The trajectories of the clockwise and anticlockwise gestures are both circles, and the trajectories of the left-right and right-left gestures are both straight lines.
In Table 4, the results of the three classification algorithms are shown. Compared with the SVM and 5NN algorithms, the LSTM network has a higher average recognition rate and a smaller recognition variance. Our gestures are therefore better suited to recognition by the LSTM network.
The above results show the following: First, combined with deep learning, our radar system has the ability to recognize simple gestures, and it can be used for human-computer interaction. Second, through the extraction of the phase feature values and the range feature values from our radar system, the movement trajectory of the gesture can be generated, and thus, the radar system has the potential of drawing characters in the air.

B. ALGORITHM COMPARISON RESULTS
A comparison of air-writing methods is given in Table 5. In our paper, a radar system with only one transmitting antenna and two receiving antennas is used, so the number of RF channels in our radar system is 3. In contrast, the triangulation method used in references [25] and [26] needs three radar systems, each with one transmitting antenna and one receiving antenna, so the number of RF channels is 6. Our method therefore halves the number of RF channels and saves hardware resources. In addition, the range resolution of our radar system is 4.69 cm, so it achieves centimeter-level resolution comparable to that of the other radar systems.
From the perspective of signal processing, the triangulation method only calculates the range of the hand movement and then derives the position in space from these ranges. The method in our paper calculates the character position from the range and the angle. A problem encountered in studying these algorithms is that noise strongly interferes with the signal, so effective noise removal becomes a key step in the signal processing. In reference [25], the alpha-beta filter is used as a denoising method to smooth the range. In reference [26], the Kalman filter is used for denoising. Compared with the methods in references [25] and [26], our method measures the angle with only two receiving antennas, so it is more susceptible to noise interference.
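The quoted range resolution can be related to the swept bandwidth through the standard FMCW relation, range resolution = c / (2B). The short sketch below evaluates this relation in both directions. The function names are hypothetical, and the bandwidth recovered from the 4.69 cm figure is our back-calculation rather than a value stated in this section.

```python
C = 299_792_458.0  # speed of light in m/s


def range_resolution_m(bandwidth_hz):
    """FMCW range resolution for a given swept bandwidth: c / (2B)."""
    return C / (2.0 * bandwidth_hz)


def bandwidth_for_resolution_hz(resolution_m):
    """Swept bandwidth that yields a given range resolution: c / (2 * dR)."""
    return C / (2.0 * resolution_m)
```

With this relation, a resolution of 4.69 cm corresponds to a swept bandwidth of roughly 3.2 GHz, while the full 4-GHz sweep mentioned in the abstract would give about 3.75 cm.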
To compare the performance of the three smoothing algorithms for gesture air-writing with our radar system, the plastic edge detection experiment is used. The results are shown in Fig. 18 to Fig. 20: the phase feature values in Fig. 18, the range feature values in Fig. 19, and the plastic edge trajectory generated by using (22) in Fig. 20. In each figure, the black line represents the result obtained directly from the raw feature values, the red line represents the result obtained with the alpha-beta filter, the magenta line represents the result obtained with the Kalman filter, and the blue line represents the result obtained with the RTS smoothing algorithm.
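For readers unfamiliar with the simplest of the three methods, a minimal alpha-beta filter for a one-dimensional feature sequence (for example, the range values) can be sketched as follows. The gains alpha and beta, the unit frame interval and the function name are illustrative assumptions and are not the settings used in reference [25] or in this paper.

```python
import numpy as np


def alpha_beta_filter(z, dt=1.0, alpha=0.5, beta=0.1):
    """Smooth a non-empty 1-D measurement sequence with an alpha-beta filter.

    The state is (position, velocity); each step predicts with constant
    velocity and then corrects both states with fixed gains.
    """
    z = np.asarray(z, dtype=float)
    x_est = np.empty_like(z)

    x, v = z[0], 0.0  # start at the first measurement with zero velocity
    x_est[0] = x
    for k in range(1, len(z)):
        x_pred = x + dt * v            # constant-velocity prediction
        resid = z[k] - x_pred          # innovation
        x = x_pred + alpha * resid     # position correction
        v = v + (beta / dt) * resid    # velocity correction
        x_est[k] = x
    return x_est
```

Because the gains are fixed, the filter is causal and cheap to run, but it cannot use later samples to correct earlier estimates, which is the main difference from a fixed-interval smoother.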
As seen from Fig. 18 and Fig. 19, the raw phase and range feature values (black lines) are very susceptible to noise interference, for two reasons. First, only the range value and phase value of the strongest scattering point in the range-Doppler map are selected as the feature values. Second, because the hand is a nonrigid body, the radar can easily lock onto the wrong scattering point while capturing the hand movement.
The plastic edge trajectories in Fig. 20 show the following. First, the trajectory calculated directly from the original features (black line) hardly shows any shape. Second, after smoothing with the alpha-beta filter and the Kalman filter, a trajectory can be observed (red and magenta lines), but the fluctuation is still very large for both filters. Finally, after the RTS smoothing algorithm is used to denoise the features, the resulting trajectory (blue line) is the smoothest, and the rectangular shape of the plastic can be recovered from it.
Therefore, among the three smoothing algorithms, the RTS smoothing algorithm has the best smoothing performance, outperforming the alpha-beta filter and the Kalman filter. In this paper, the RTS smoothing algorithm is selected to generate the character trajectories. The character generation method is shown in Fig. 11.
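The first reason can be made concrete with a small sketch of strongest-scatterer selection. The code below assumes one complex range-Doppler map per receive channel, laid out as range bins by Doppler bins, and takes the inter-channel phase as the angle of the conjugate product at the peak cell. The array layout, the combined-power peak search and the function name are assumptions for illustration, not the paper's implementation.

```python
import numpy as np


def strongest_scatterer(rd_rx0, rd_rx1, range_res_m):
    """Pick the peak cell of two complex range-Doppler maps.

    Returns (range_bin, doppler_bin, range_m, phase_diff_rad).
    """
    rd_rx0 = np.asarray(rd_rx0)
    rd_rx1 = np.asarray(rd_rx1)

    # Combine the power of both channels and locate the single strongest cell.
    power = np.abs(rd_rx0) ** 2 + np.abs(rd_rx1) ** 2
    r_bin, d_bin = np.unravel_index(np.argmax(power), power.shape)

    range_m = r_bin * range_res_m
    # Phase of channel 1 relative to channel 0 at the peak cell.
    dphi = np.angle(rd_rx1[r_bin, d_bin] * np.conj(rd_rx0[r_bin, d_bin]))
    return int(r_bin), int(d_bin), float(range_m), float(dphi)
```

Since only one cell is kept per frame, any clutter or noise cell that momentarily exceeds the hand return moves both the range and the phase estimate, which is consistent with the jumps seen in the raw features.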
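The difference between the Kalman filter and the RTS smoother can be illustrated with a minimal sketch that runs a forward Kalman pass followed by the backward Rauch-Tung-Striebel pass on a one-dimensional feature sequence. The constant-velocity state model, the noise levels q and r, the initial covariance and the function name are all assumptions chosen for illustration and do not reproduce the parameters of this paper.

```python
import numpy as np


def kalman_rts(z, dt=1.0, q=1e-3, r=1e-2):
    """Forward Kalman filter plus backward RTS pass (constant-velocity model).

    Returns filtered states, filtered covariances, smoothed states and
    smoothed covariances; the state is (position, velocity).
    """
    z = np.asarray(z, dtype=float)
    n = len(z)
    F = np.array([[1.0, dt], [0.0, 1.0]])            # state transition
    H = np.array([[1.0, 0.0]])                       # only position is measured
    Q = q * np.array([[dt**4 / 4, dt**3 / 2],
                      [dt**3 / 2, dt**2]])           # process noise
    R = np.array([[r]])                              # measurement noise

    x_p = np.zeros((n, 2)); P_p = np.zeros((n, 2, 2))  # predicted
    x_f = np.zeros((n, 2)); P_f = np.zeros((n, 2, 2))  # filtered

    x = np.array([z[0], 0.0])
    P = np.eye(2)
    for k in range(n):
        if k > 0:                                    # predict (prior used at k = 0)
            x = F @ x
            P = F @ P @ F.T + Q
        x_p[k], P_p[k] = x, P
        S = H @ P @ H.T + R                          # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
        x = x + K @ (z[k] - H @ x)
        P = (np.eye(2) - K @ H) @ P
        x_f[k], P_f[k] = x, P

    # Backward pass: each estimate is corrected using all later samples.
    x_s = x_f.copy(); P_s = P_f.copy()
    for k in range(n - 2, -1, -1):
        G = P_f[k] @ F.T @ np.linalg.inv(P_p[k + 1])  # smoother gain
        x_s[k] = x_f[k] + G @ (x_s[k + 1] - x_p[k + 1])
        P_s[k] = P_f[k] + G @ (P_s[k + 1] - P_p[k + 1]) @ G.T
    return x_f, P_f, x_s, P_s
```

The backward pass needs the whole recorded sequence, so the smoother is a batch method. This fits trajectory reconstruction after a character has been written, whereas the alpha-beta and Kalman filters only use past samples.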

C. LETTER SYMBOLS TRACKING RESULTS
Fig. 21(a) and Fig. 21(b) show the phase feature values and range feature values of the number symbols drawn within the area of the plastic plate. These feature values show that different number symbols have different phase and range patterns.
Fig. 22 shows the tracked edge of the plastic plate and the number symbols from 0 to 9 drawn within the area of the plastic plate. In the figure, the black line is the result of tracking the plastic edge, and the red lines are the results of tracking the number symbols on the plastic. The results show that the 10 tracked numeric symbols all lie within the plastic border area.
Fig. 23 shows the results for the 9 alphabetic characters. The symbols 'B', 'C', 'D', 'N', 'G', 'M', 'X', 'E' and 'F' can be seen very clearly in the figure.
From Fig. 22 and Fig. 23, it can be seen that 19 characters can be tracked and reconstructed through our SIMO radar system.

VI. CONCLUSION
In this paper, a SIMO radar system was used to solve the problem of drawing characters in a planar area. Through signal processing, feature extraction and a smoothing algorithm, the following work has been accomplished.
First, based on the SIMO millimeter-wave radar system, the signal processing framework was constructed. The dynamic range-Doppler heat map, range, velocity and time-frequency information of the gestures can be obtained.
Second, to verify the effectiveness of the feature extraction, five simple gestures were defined in the system: the clockwise gesture, anticlockwise gesture, front-back gesture, left-right gesture and right-left gesture. A simple LSTM network was used to distinguish the five gestures, and the validation accuracy obtained during training was 97.6%. In addition, the trajectories of the five gestures can also be generated.
Third, the RTS smoothing algorithm, the alpha-beta filter and the Kalman filter were compared using the plastic edge trajectory experiment. The results show that the RTS smoothing algorithm has better denoising performance and yields a smoother plastic edge trajectory. Therefore, the RTS smoothing algorithm was used for character reconstruction in this paper.
Finally, 10 numerical symbols in the plastic plane and 9 alphabetic symbols written in two-dimensional space were tracked and reconstructed. Therefore, through our system and algorithm design, we can track and reconstruct the trajectory and the shape of the characters while recognizing simple gestures.
Although 19 characters were tracked and reconstructed in this paper, the generated characters were not recognized. Therefore, a recognition algorithm for the generated characters will be studied in future work.