Device-Free WLAN Based Indoor Localization Scheme With Spatially Concatenated CSI and Distributed Antennas

Abstract-Various machine learning (ML) based localization schemes using channel state information (CSI) in wireless local area networks (WLANs) have been investigated recently. Adopting a proper feature selection technique is important to achieve further improvement in detection accuracy. As described herein, we propose a device-free indoor localization scheme using a lightweight ML model with compressed spatially concatenated CSI in WLAN systems with distributed antennas. In this scheme, feedback beamforming weights (BFWs) are collected at a CSI capture terminal. Then, current and past BFWs are concatenated as accurate feature data to characterize the object behavior. Additionally, we propose the use of a frequency-domain sampling scheme for a low-complexity real-time target position detection with a small number of datasets. Using ray-trace based simulation analysis and experimentally obtained results from an indoor environment, we demonstrate that the proposed scheme using the concatenated CSI is effective not only for achieving more accurate real-time detection, but also for reducing the necessary complexity for both off-line training and on-line classification compared with other reference schemes.
Index Terms-Beam-forming weight, localization, spatially concatenated channel state information, wireless local area network.

I. INTRODUCTION
Wireless sensing technologies integrated with wireless communications are key technologies for the development of 6G and beyond [1], [2], [3]. More specifically, future wireless networks are expected to provide not only data transmission but also additional functions that support new application services such as sensing by radio signals. Recently, indoor object detection and localization using radio signals of existing communication infrastructure such as wireless local area [...]

Osamu Muta is with the Center for Japan-Egypt Cooperation in Science and Technology, Kyushu University, Fukuoka 819-0395, Japan (e-mail: muta@ait.kyushu-u.ac.jp).
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

To this end, an effective device-free CSI acquisition scheme for WLAN-based sensing has been proposed, wherein feedback frames conveying beam-forming weights (BFWs) are collected at an off-the-shelf WLAN device, after which they are used to train a machine learning (ML) model and to detect object positions and their behaviors [21], [22]. Although this method is effective for collecting CSI without explicit measurements, the available CSI and its achieved accuracy are limited when the WLAN is equipped with only a single antenna. Although the localization accuracy can be improved by increasing the number of antennas in WLAN systems [24], [26], the achieved performance might deteriorate when a lightweight algorithm with a small dataset is used. In addition, because radio propagation characteristics are sensitive to antenna locations and their surrounding environments, investigating an effective feature selection technique for ML-based object detection is important. However, as described above, the effects of surrounding radio propagation environments on the achievable performance have not been analyzed sufficiently. To enable real-time sensing at a mobile WLAN device with limited hardware and energy resources, the required complexity and dataset must be minimized. Therefore, further investigation and analyses are necessary to develop a lightweight algorithm with a small dataset. This paper makes three original contributions, as presented below.
- First, we aim to design a real-time device-free indoor ML-based localization scheme with compressed spatially concatenated CSI in WLANs with distributed antennas. In this scheme, CSI feedback frames are collected continuously at an off-the-shelf WLAN device, similarly to the process described by Murakami et al. [21]. Then, current and past BFWs are concatenated as more accurate feature information. Subsequently, they are used as training data for an ML model. Unlike the conventional scheme [21], more accurate and low-complexity detection is possible by learning concatenated CSI that includes channel information in a spatial domain.
- Second, we propose the application of simple frequency-domain sampling-based compression to the concatenated BFWs to reduce the inherent complexity. In this method, the collected BFWs are sampled at every several subcarriers in the frequency domain so that the inherent computational complexity can be reduced while improving the object detection accuracy to a considerable degree. Additionally, we theoretically discuss an optimum compression ratio that minimizes the data size (i.e., the required complexity) under a given object detection performance, based on the fact that frequency-domain sampling of BFWs is equivalent to delay-time domain windowing of its inverse Fourier transform.
- Third, after implementing the proposed design in an IEEE802.11ac-based WLAN, we conduct experimental evaluations and ray-trace based simulation to demonstrate the effectiveness of the proposed approach in an indoor environment under a multi-user (MU-)MIMO scenario. By collecting feedback frames and building a database of the concatenated CSI, we clarify that the proposed approach achieves much faster execution while detecting the object position more precisely than the conventional scheme.

Notation: Vectors and matrices are expressed respectively as lowercase and uppercase letters in bold typeface. Superscript $H$ denotes the Hermitian transpose of a matrix. $\mathbb{R}^{a \times b}$ and $\mathbb{C}^{a \times b}$ respectively denote the real and complex matrix fields of dimension $a \times b$. $\mathbb{Z}$ represents the set of integers. $*$ stands for the convolution integral. The notation and variables used for this study are listed in Table I.

II. RELATED WORK
Various WLAN-based sensing techniques have been proposed in the literature [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [24], [25], [26]. One typical approach is to use measured RSS data to characterize target objects. In one earlier study [13], an improved RSS fingerprinting algorithm for indoor positioning is presented, for which a clustering algorithm is adopted to delete noisy samples. The authors of [14] proposed a data-rate based fingerprint method that achieves comparable localization performance without measuring RSS, where the transmission power and resultant data rate are used in the fingerprint database. In one earlier reported method [15], large amounts of RSS data measured using distributed massive MIMO systems are exploited for estimating the location of a user, where RSS data clustering is applied to reduce complexity. As these existing works show, RSS data and other related information are available in most WLAN devices and are readily applicable for sensing purposes. However, exploiting RSS data alone is insufficient for accurate sensing because those results indicate only the received power averaged over the signal bandwidth. An alternative approach to achieving additional performance improvement is to exploit CSI, which reflects radio propagation characteristics. The authors of one earlier study [16] presented an overall system design for location-oriented activity identification, where CSI in OFDM systems obtained through existing WLAN devices is exploited for a walking user's recognition of various activities. One earlier report [17] presents a design of a smoking detection system in which a specific motion such as smoking activity is detected by extracting meaningful CSI variation information from WLAN signals. The works in [27], [28], [29] proposed hybrid schemes that solve the localization problem through hybrid RSSI and time of arrival (TOA) measurements (or time difference of arrival).
Unlike these approaches, which assume that the target has wireless devices, the proposed method in this paper relies on a device-free concept that can detect an object with no wireless device. Another report [18] presents a through-the-wall human detection system design using commodity WLAN devices, in which principal component analysis based filtering is applied to extract meaningful features from CSI, where correlated subcarriers are selected to extract features for robust human detection. Nevertheless, the main emphasis of subcarrier selection in that work is detection of the existence of human beings. Therefore, an effective subcarrier selection scheme is still necessary for achieving lightweight device-free localization. In [30], a subcarrier power allocation scheme for OFDM-based localization is developed in which subcarrier power allocation at the anchor nodes is optimized to reduce positioning errors. In [31], orthogonal multicarrier-based proving signal designs for base station (BS)-based or mobile station (MS)-based localization are presented, where non-overlapping subcarrier allocation is necessary for MS-based localization to differentiate signals from different BSs. In [32], an RSSI-based positioning scheme is proposed in which the RSSI of a selected reference subcarrier is exploited for trilateration-based positioning. Unlike the works described above, this paper considers simple but effective frequency-domain sampling (i.e., subcarrier selection) that is applicable to the compressed CSI specified in IEEE802.11, i.e., beam-forming weights. Reportedly, more reliable results can be achieved by adopting a majority-vote-based detection scheme using the MIMO technique. In another report [19], a fingerprint quality classification method is presented for improving CSI-based positioning accuracy, whereas a convolutional neural network based approach has been applied for CSI-based positioning in another earlier report [20].
One remaining challenge of CSI-based approaches is acquisition of a sufficient number of CSI samples. In some earlier reports [21], [22], [23], [24], [25], [26], effective IEEE802.11ac-based CSI acquisition schemes are presented in which feedback CSI frames from all nearby devices are collected and analyzed for sensing purposes. The experimentally obtained results are also discussed in an earlier report [26]. Consequently, the sensing area can be expanded easily without installing additional measuring stations (STAs). In one earlier study [23], a key feature extraction scheme using PCA is adopted for acquired CSI samples for WLAN-based human detection with a deep neural network and numerous CSI samples. Although this method is used to reduce the dimensions of huge CSI samples, it remains unclear whether this method is effective for extracting key features from a few CSI samples. Unlike deep-learning-based methods trained using massive datasets, the method described in this paper is a lightweight approach with a small data set. Spectrogram-based detection and estimation techniques have been investigated in the literature [33], [34], where multidimensional analysis of time-frequency signals is considered. In [34], spectrogram-based analysis is reported as effective for R-R Interval estimation in vital sign monitoring in the health care field. According to these existing works, the detection performance is expected to be enhanced by analyzing channel characteristics multidimensionally.
Unlike the works described above, the main emphasis of this paper is the design of an effective spatially concatenated CSI-based localization scheme that works with a small dataset. To this end, we propose a device-free WLAN-based localization scheme with spatially concatenated CSI and frequency-domain sampling. We also discuss the achieved detection performance through simulation and results obtained through real-time experimentation in an indoor MU-MIMO scenario.

III. SYSTEM DESCRIPTIONS
Fig. 1 depicts a WLAN-based block diagram with an AP, a single-antenna user device (STA: station), and a CSI capturing terminal that extracts feedback BFWs from the captured feedback frames and analyzes them, where IEEE802.11ac-based OFDM transceivers [36] are used. Here, $M$, $N$, and $S$ respectively denote the number of transmit antennas at the AP, the number of users, and the total number of streams, where $S \le \min(M, N)$ and $\min(a, b)$ returns the smaller of $a$ and $b$. This section defines signal representations at the AP and user devices. Details of the CSI capturing terminal are presented along with the proposed design in the next section.

The OFDM signal with $K$ subcarriers is transmitted through $M$ antennas at the AP. The channel matrix at the $k$-th subcarrier, $\mathbf{H}_k \in \mathbb{C}^{N \times M}$, has $h_{k,n,m}$ as its $(n, m)$-th entry, where $K$ represents the number of subcarriers per OFDM symbol and $h_{k,n,m}$ is the channel coefficient between the $m$-th transmit antenna and the $n$-th user device at the $k$-th subcarrier. Consequently, the concatenated channel matrix over subcarriers in the frequency domain can be expressed as

$$\mathbf{H} = [\mathbf{H}_1, \mathbf{H}_2, \ldots, \mathbf{H}_K].$$

On the STA side, after passing through the MIMO channel, the received OFDM signal is demodulated with FFT for channel estimation and data detection. The estimated channel matrix at the $k$-th subcarrier, $\hat{\mathbf{H}}_k \in \mathbb{C}^{N \times M}$, is decomposed by singular value decomposition (SVD) into

$$\hat{\mathbf{H}}_k = \mathbf{U}_k \mathbf{\Sigma}_k \mathbf{V}_k^{H},$$

where $\mathbf{U}_k$, $\mathbf{V}_k$, and $\mathbf{\Sigma}_k$ respectively represent the left-singular matrix, the right-singular matrix, and the diagonal matrix whose diagonal elements are the singular values of the channel. If $M > N$, then the right-singular matrix is reduced [...]

In the IEEE802.11ac standard, a compressed version of the right-singular matrix $\mathbf{V}_k$ is fed back to the AP side as BFWs. To this end, $\mathbf{V}_k$ is converted to an angle information sequence ($\phi^k_{j,i}$ and $\psi^k_{j,i}$, defined in Appendix A) by application of Givens rotation to $\mathbf{V}_k$ as a linear transformation that creates a zero entry in a matrix [36], where $i$ and $j$ respectively represent the transmitting and receiving antenna indices. Furthermore, $\mathbf{D}^k_i$ and $\mathbf{G}^k_{ij}$ respectively denote the $M \times M$ diagonal matrix and the Givens rotation matrix [36]. The quantized and compressed CSI, $\hat{\phi}^k_{j,i}$ and $\hat{\psi}^k_{j,i}$, are fed back from the STA to the AP to perform downlink beam-forming. The analyses presented herein assume that $\hat{\phi}^k_{j,i}$ and $\hat{\psi}^k_{j,i}$ are quantized respectively by 6 and 4 bits. This paper examines a device-free localization problem, i.e., the target person has no wireless devices. We consider a multi-user MIMO-based system serving multiple STAs.
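The SVD step described in this section can be illustrated with a minimal NumPy sketch. It uses random complex matrices in place of measured channel estimates, and the function name `right_singular_bfw` and the array layout `(K, N, M)` are hypothetical choices made only for illustration. The Givens-rotation angle conversion and the quantization specified in IEEE802.11ac are deliberately omitted.

```python
import numpy as np

def right_singular_bfw(h_est: np.ndarray) -> np.ndarray:
    """Return per-subcarrier right-singular vectors of the estimated channel.

    h_est: shape (K, N, M) = (subcarriers, user devices, AP antennas).
    Output: shape (K, M, min(N, M)); column s is the s-th right-singular vector.
    """
    # NumPy factorizes H = U @ diag(s) @ Vh, so V is the conjugate transpose of Vh.
    _, _, vh = np.linalg.svd(h_est, full_matrices=False)
    return np.conj(np.swapaxes(vh, -1, -2))

# Illustrative sizes (assumption): 52 subcarriers, 1 user device, 4 AP antennas.
rng = np.random.default_rng(0)
K, N, M = 52, 1, 4
h = rng.standard_normal((K, N, M)) + 1j * rng.standard_normal((K, N, M))
v = right_singular_bfw(h)
```

With a single-antenna user device, each returned vector has unit norm, so only the channel-dependent direction is retained and the channel strength is not.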

IV. PROPOSED APPROACH FOR EFFECTIVE LOCALIZATION
This section explains the concepts of the proposed localization approach using the concatenated CSI. The CSI capturing terminal in Fig. 1 captures feedback frames from user devices and detects the BFW matrix $\hat{\mathbf{V}}$. Let $\hat{\mathbf{V}}_p$ signify the $p$-th concatenated BFW matrix. Fig. 2(a) presents the concept of the proposed approach using spatially concatenated CSI, where current and past BFW matrices are concatenated to obtain more accurate feature data. Here, "spatially concatenated CSI" denotes a group of multiple CSI samples obtained when the target person is located at different positions. As this figure shows, when the target object moves from point (A) to point (B), the channel state between the transmitter and receiver changes in time. Because this paper considers a device-free scheme that requires no additional functions of user devices for sensing purposes, the CSI capturing terminal analyzes channel fluctuation in a spatial domain by capturing these BFWs sequentially. The concatenated CSI over the space and frequency domains is expressed as

$$\hat{\mathbf{V}}^{\mathrm{c}}_p = [\hat{\mathbf{V}}_p, \hat{\mathbf{V}}_{p-1}, \ldots, \hat{\mathbf{V}}_{p-U+1}],$$

where $U$ is defined as the concatenated CSI length, i.e., the number of BFWs in each concatenated CSI, and where $p > U$ is assumed.
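The concatenation of the current and past $U - 1$ BFWs can be sketched as a sliding window over the sequence of captured feedback frames. This is a minimal illustration, assuming each captured BFW has already been flattened into a real-valued feature vector; the function name `concatenate_bfw`, the zero-based capture index, and the newest-first ordering are assumptions rather than details taken from the paper.

```python
import numpy as np

def concatenate_bfw(bfw_seq: np.ndarray, U: int) -> np.ndarray:
    """Concatenate each captured BFW with its U - 1 predecessors.

    bfw_seq: shape (P, F), P captures in time order, F features per capture.
    Output: shape (P - U + 1, U * F); the row for capture p holds
    [v_p, v_{p-1}, ..., v_{p-U+1}] (newest first, an assumed ordering).
    """
    P = bfw_seq.shape[0]
    if not 1 <= U <= P:
        raise ValueError("U must satisfy 1 <= U <= number of captures")
    rows = [
        np.concatenate([bfw_seq[p - u] for u in range(U)])
        for p in range(U - 1, P)
    ]
    return np.stack(rows)

# Toy sequence: 4 captures with 3 features each.
seq = np.arange(12.0).reshape(4, 3)
features_u2 = concatenate_bfw(seq, U=2)
```

Setting `U=1` returns the input unchanged, which corresponds to using each BFW on its own without concatenation.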
To reduce the required complexity, we propose application of a simple frequency-domain sampling scheme to the concatenated BFWs, which compresses the total data size. The concept of the frequency-domain sampling based compression is presented in Fig. 2(b), where the BFWs over the frequency domain, $\mathbf{V}_k$, are selected (sampled) at every $D_s$-th subcarrier before application to an ML block. Thereby, the data can be compressed to $1/D_s$ of the original size. For purposes of explanation in this paper, we define the compression ratio by frequency-domain sampling as $R_c = 1/D_s$. Application of the frequency-domain sampling can reduce the total data size of the concatenated CSI while avoiding loss of the essential channel information. After concatenating the current and past $U - 1$ BFWs as single feature data, they are used for both off-line training and on-line detection.

Fig. 3 presents the basic principle of frequency-domain sampling and its equivalence relation to delay-time domain windowing. As shown in this figure, we assume that the frequency response $H(\omega)$ is sampled at intervals of $\Delta\Omega$. Therefore, the discrete-frequency-domain response is given as

$$H(\omega)\,\delta_D(\omega), \qquad \delta_D(\omega) = \sum_{n \in \mathbb{Z}} \delta(\omega - n\Delta\Omega).$$

Assuming that the Fourier transform pair $H(\omega) \leftrightarrow h(t)$ is given, the inverse Fourier transform of $H(\omega)\delta_D(\omega)$ is given as the following.

$$\mathcal{F}^{-1}[H(\omega)\delta_D(\omega)] = h(t) * \frac{1}{\Delta\Omega}\sum_{n \in \mathbb{Z}} \delta(t - n\Delta T) = \frac{1}{\Delta\Omega}\sum_{n \in \mathbb{Z}} h(t - n\Delta T) \qquad (3)$$

Therein, $\Delta T = 2\pi/\Delta\Omega = 1/(D_s \Delta f)$. Also, $D_s \Delta f$ and $\mathcal{F}^{-1}[\cdot]$ respectively denote the sampling interval in the frequency domain and the inverse Fourier transform operation. As presented in (3) and Fig. 3, the inverse Fourier transform of $H(\omega)\delta_D(\omega)$ is a periodic function with period $\Delta T$. Therefore, the CSI size can be reduced to $R_c = 1/D_s$ without loss of feature information if the length of the impulse response $h(t)$ is less than $\Delta T$, i.e., the frequency-domain sampling rate should meet the following condition:

$$\frac{1}{D_s \Delta f} > \tau, \qquad (4)$$

where $\tau$ corresponds to the length of the impulse response $h(t)$. As discussed later in Sect. V, the required execution time increases with the sampling rate. Therefore, the sampling rate can be optimized by selecting $\tau$ as a proper value to minimize the required complexity while avoiding the loss of key feature information in frequency-selective fading channel environments. In addition, because the CSI compression by frequency-domain sampling is equivalent to delay-time domain windowing of the impulse response $h(t)$, the achievable minimum compression ratio in the frequency domain is the same as $1/D_s$. It is readily apparent that the same compression effect is obtainable in either a time domain or a delay-time domain.

As an example of the space-frequency channel matrix and its right-singular matrix (i.e., BFW), three-dimensional plots of $|h_{k,n,m}|$ and $|v_{k,n,m}|$ with respect to the subcarrier index $k$ and the antenna index $m$ for different user indices $n$ are shown respectively in Fig. 4. These plots show that the right-singular matrices (i.e., BFWs) fluctuate depending on the current channel condition ($\mathbf{H}$). Therefore, they are useful as feature information to characterize object behaviors. The $\ell_2$-norm of the BFWs over users is normalized (i.e., $\|\mathbf{V}_{k,m}\| = 1$), unlike the $\ell_2$-norm of the channel matrix over users, $\|\mathbf{H}_{k,m}\|$, where $\|\mathbf{V}_{k,m}\| = \sqrt{\sum_{n=1}^{N} |v_{k,n,m}|^2}$.
In other words, although antenna-wise and subcarrier-wise channel strength information in $\mathbf{H}$ is lost in $\mathbf{V}$, it is noteworthy that each beam-forming vector is a channel-dependent weighting factor that serves as indirect space/frequency-domain channel state information.
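The equivalence between frequency-domain sampling and periodic replication in the delay-time domain can be checked numerically with a discrete analogue. The sketch below is only an illustration under stated assumptions: a hypothetical channel with 10 delay taps on a 52-subcarrier grid, random tap values, and helper names (`delay_response_after_sampling`, `max_lossless_ds`) that do not come from the paper. Keeping every $D_s$-th subcarrier folds the impulse response with period $K/D_s$ taps, so the response survives intact only while that period is at least the number of taps.

```python
import numpy as np

def delay_response_after_sampling(H: np.ndarray, Ds: int) -> np.ndarray:
    """Inverse DFT of the frequency response kept at every Ds-th subcarrier."""
    return np.fft.ifft(H[::Ds])

def max_lossless_ds(K: int, n_taps: int) -> int:
    """Largest divisor Ds of K whose replication period K // Ds holds n_taps taps.

    Assumes 1 <= n_taps <= K.
    """
    return max(d for d in range(1, K + 1) if K % d == 0 and K // d >= n_taps)

K, n_taps = 52, 10  # hypothetical delay spread of 10 taps
rng = np.random.default_rng(1)
h = np.zeros(K, dtype=complex)
h[:n_taps] = rng.standard_normal(n_taps) + 1j * rng.standard_normal(n_taps)
H = np.fft.fft(h)

lossless_by_ds = {}
for Ds in (2, 4, 13):
    period = K // Ds
    h_sampled = delay_response_after_sampling(H, Ds)
    # Sampling in frequency equals summing delay-shifted copies of h.
    h_folded = h.reshape(Ds, period).sum(axis=0)
    assert np.allclose(h_sampled, h_folded)
    lossless_by_ds[Ds] = period >= n_taps and np.allclose(
        h_sampled[:n_taps], h[:n_taps]
    )
```

In this toy setting the folded copies overlap once the period drops below 10 taps, which is the discrete counterpart of violating the condition that the impulse response be shorter than $\Delta T$.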

A. Simulation Setup and Scenario
We conducted a computer simulation to clarify the effectiveness of the proposed schemes. The simulation block diagram is the same as that presented in Fig. 1. Simulation parameters are presented in Table II. To calculate various channel impulse responses (CIRs) in indoor radio propagation environments, a three-dimensional ray launching algorithm is used [37]. The maximum numbers of reflections and diffractions are set respectively to three and one. The other environment settings and physical property values of objects are described in Fig. 5. For this simulation, we assume device-free localization, i.e., the target object has no wireless device for sensing. In this context, CSI between the AP and the user device is measured when a target object is placed at one of the grids in Fig. 5, where the grid size is set as $\Delta = 0.05$ m, so that 400 different CIR measurements (OFDM symbols) per area are obtained. Stated more specifically, an indoor field is divided into $R$ areas, and labels $r = 0, \ldots, R$ are used. A target object is located in one of the areas. Label numbers $r = 1, \ldots, R$ correspond to the area numbers in Fig. 5. The label number $r = 0$ represents the case in which no target object exists in any area. We consider a multi-class classification problem to detect the label of the area in which a single target exists. Random forest is used as the supervised machine learning model, and four-split cross-validation is used as its evaluation scheme. We consider an IEEE802.11ac-based WLAN system [36] in which feedback frames containing CSI ($\phi^k_{j,i}$ and $\psi^k_{j,i}$, defined in Appendix A) are sent from the user device to the AP (see footnote 3).
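The four-split cross-validation mentioned above can be sketched as an index partition. This is a generic illustration that assumes shuffled, unstratified folds and uses a hypothetical helper `k_fold_indices`; the source does not state how the splits were drawn, and the classifier itself is left out.

```python
import numpy as np

def k_fold_indices(n_samples: int, n_splits: int = 4, seed: int = 0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_splits)
    for i in range(n_splits):
        train_idx = np.concatenate(
            [folds[j] for j in range(n_splits) if j != i]
        )
        yield train_idx, folds[i]

# 400 samples, matching the number of CIR measurements per area in the text.
splits = list(k_fold_indices(400))
```

Each sample appears in exactly one test fold, so every measurement is used for evaluation once and for training three times.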
Footnote 3: Feedback CSI includes 6 angle values per subcarrier when $(M, N) = (4, 1)$ ($\phi^k_{1,1}$, $\phi^k_{2,1}$, $\phi^k_{3,1}$, $\psi^k_{2,1}$, $\psi^k_{3,1}$, and $\psi^k_{4,1}$), and includes 12 angle values when [...]

As a performance metric, we define the probability of position detection as the conditional probability that the detected label number $s_o$ matches the actual one $s_a$, which is

$$\Pr(s_o = r \mid s_a = r), \qquad (5)$$

where $r \in \{0, 1, \ldots, R\}$. For comparison, we consider another subcarrier selection scheme, designated as "localized sampling (LS)". The sampling in a localized manner is presented in Fig. 6, where the signal after sampling includes only a portion of the overall frequency response (i.e., $L$ adjacent subcarriers), unlike the proposed approach. Here, $L = \lfloor K/D_s \rfloor$, where $\lfloor x \rfloor$ is the floor function that outputs the maximum integer less than or equal to $x$.

[...] 1) is assumed. Here, $U = 1$ corresponds to the conventional scheme in [21]. These results indicate that the proposed scheme ($U = 2$) achieves higher average detection performance in most areas than the conventional scheme ($U = 1$), even when frequency-domain sampling ($R_c = 1/13$) is applied. This finding implies that frequency-domain sampling is effective at reducing the required data size (i.e., computational complexity) while achieving detection performance comparable to that of the case of $R_c = 1$. Fig. 8 shows the average detection probability of the proposed scheme as a function of the compression ratio $R_c$ for different values of the concatenated CSI length $U$, where the number of antennas at the AP is $M = 4$. For comparison, the average detection probability of a case with localized sampling is also shown. The label "Interleaved" denotes a case with the frequency-domain sampling used in the proposed scheme. It is apparent from this figure that higher detection performance is obtained as $U$ increases.
Additionally, we can confirm that "interleaved" sampling achieves higher detection performance in the lower compression ratio region because the overall spectrum information is not contained in the CSI after localized sampling, unlike the interleaved sampling in the proposed scheme. Fig. 9 shows the average detection probability of the proposed scheme for different numbers of users $N$ and concatenated CSI lengths $U$ in the MU-MIMO scenario, where the compression ratio is set as $R_c = 1$, $1/13$, and $1/26$, and where $N = 1, 4$ and $U = 1, 2$ are assumed. It is apparent from this figure that the proposed scheme with $U = 2$ achieves better detection performance than the conventional scheme ($U = 1$), even when the compression ratio is $R_c = 1/26$. Similar results are also obtained in cases for which $U = 4$. In particular, the proposed scheme achieves detection performance comparable to that obtained in the case without sampling ($R_c = 1$), even when a compression ratio of $R_c = 1/26$ is adopted.
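The difference between interleaved and localized sampling can be made concrete by listing which subcarriers each scheme keeps. The following sketch assumes a concatenated CSI array of shape `(U, K, A)`, where `A` is the number of angle values per subcarrier; the helper names and the array layout are hypothetical, and the sizes below (52 subcarriers, 6 angle values, $D_s = 13$) are chosen only for illustration.

```python
import numpy as np

def interleaved_indices(K: int, Ds: int) -> np.ndarray:
    """Keep every Ds-th subcarrier, spanning the whole band."""
    return np.arange(0, K, Ds)

def localized_indices(K: int, Ds: int, start: int = 0) -> np.ndarray:
    """Keep L = floor(K / Ds) adjacent subcarriers beginning at `start`."""
    L = K // Ds
    return np.arange(start, start + L)

def compress_features(csi: np.ndarray, idx: np.ndarray) -> np.ndarray:
    """Select subcarriers idx from csi of shape (U, K, A) and flatten."""
    return csi[:, idx, :].reshape(-1)

U, K, A, Ds = 2, 52, 6, 13
csi = np.arange(U * K * A, dtype=float).reshape(U, K, A)
x_interleaved = compress_features(csi, interleaved_indices(K, Ds))
x_localized = compress_features(csi, localized_indices(K, Ds))
```

Both schemes keep the same number of features, so any performance gap between them comes from which part of the spectrum is retained rather than from the data size.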

C. Experiment Setup and Scenario
For an experimentation-based demonstration of the performance achieved using the proposed scheme, we implemented our designed algorithm in an IEEE802.11ac-based system as shown in Fig. 1 and conducted experiments in an indoor environment. Fig. 10 shows the experiment setup and an indoor environment with an AP, a CSI capturing terminal, and user devices (STA), where the IEEE802.11ac-based AP and STA are placed on opposite sides of the room. As the figure shows, the AP and the STA are equipped with four antennas and a single antenna, respectively. Some system parameters used for experimentation are presented in Table III. The AP is equipped with a linearly placed antenna array, where the distance between adjacent antenna elements can be extended using coaxial extension cables. The user device (Galaxy S7 edge; Samsung Electronics) is fixed on a tripod as shown in Fig. 10. It regularly sends feedback frames to the AP. At the CSI capturing terminal, the required functions are implemented on a stick-type PC (Compute Stick STK2M364CC; Intel Corp.) to capture the feedback frames and to build the measured CSI database. Random forest is used as a supervised machine learning algorithm. For experiments, off-line training is conducted to construct the model beforehand, whereas on-line object detection is conducted in real time. In off-line training, the person moves within area $i$ while the object detection terminal captures (overhears) feedback frames from STAs to the AP and extracts the CSI. After preprocessing and frequency-domain sampling are applied to the extracted CSI, the samples are labeled as "$i$". The person moves from area 1 to area 32 so that the CSI datasets for labels 1 to 32 are constructed. The machine learning model is then trained using the constructed dataset. This device-free detection scheme obviates the need for user devices to have any dedicated function for sensing purposes. In addition, the target object does not need to carry any wireless device.

Fig. 11. Example of measured beam-forming weights included in a concatenated CSI.
Similarly to the simulation scenario, we consider a multi-class classification problem for detecting a single object position (i.e., the corresponding label number), where the indoor area is divided into $R = 32$ regions and labels $r = 0, \ldots, R$ are used. A human is placed in one of the areas as a target object. It is noteworthy that $r = 0$ corresponds to a case in which the object does not exist. To evaluate the achieved detection accuracy, we define the average position detection probability of the target object as in (5). For comparison, we also evaluate the detection performance in a case with the localized sampling scheme. As illustrated in Sect. IV, the frequency-domain sampling rate is set to $1/(D_s \Delta f)$, where $\Delta f = 3.8 \times 10^5$ Hz.

[...] (4, 2). One can find that the detection performance is improved by analyzing the CSI samples from multiple user devices. Similarly to the simulation results presented in Fig. 9, one can observe that the proposed scheme with the concatenated CSI and frequency-domain sampling achieves better detection performance than the case without concatenating CSI ($U = 1$), even when $R_c$ takes a small value. Fig. 13 shows the area-wise detection probability of the proposed scheme in the cases of $U = 1$ and $4$ for three experiments (experiment 1, experiment 2, and experiment 3), where $R = 32$ and $(M, N) = (4, 1)$ and $(4, 2)$. The two datasets from experiment 1 and experiment 2 were measured in the same room on different days. Consequently, the experiment setups in both experiments are the same, but the CSI datasets differ. All experiments were conducted for the same scenario. Although the overall performance of experiment 1 was found to be better than that of experiment 2, the results clarify that the proposed scheme with $U = 4$ achieves much better performance than the conventional one ($U = 1$) [21].
The results also indicate that the overall detection performance can be improved by increasing the number of user devices $N$ in MU-MIMO systems because the object behavior is characterized more accurately by collecting various CSI samples from different user devices. Hereinafter, the results of experiment 2 are used for additional discussion. Fig. 14 presents confusion matrices for classification results in cases with the proposed scheme ($U = 4$) and the conventional scheme ($U = 1$). In Figs. 14(b) and 14(c), a compression ratio of $R_c = 1/13$ is adopted for the proposed ($U = 4$) and the conventional case ($U = 1$). For these figures, the probability of detection is drawn as a heatmap, where the horizontal and vertical axes show $s_o$ and $s_a$ in (5). The results indicate that more accurate detection is possible using the concatenated CSI ($U = 4$), even when the CSI data are compressed as $R_c = 1/13$. Fig. 15 shows the average detection probabilities of the proposed scheme as a function of the compression ratio $R_c = 1/D_s$ for different values of $U$, where $U = 1, 2, 3$, and $4$ are used. For simplicity of discussion, $D_s$ is selected among the divisors of 52 carriers (i.e., 1, 2, 4, 13, 26, and 52). Solid and dotted lines respectively represent cases of interleaved sampling and localized sampling.
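The per-area detection probability in (5) and the confusion matrices discussed above can be computed from pairs of actual and detected labels. The sketch below is a minimal illustration with made-up toy labels; the helper name `detection_probability` is hypothetical, and rows are taken as the actual label $s_a$ and columns as the detected label $s_o$.

```python
import numpy as np

def detection_probability(s_a, s_o, num_labels: int):
    """Return the confusion matrix and Pr(s_o = r | s_a = r) for each label r.

    Rows index the actual label s_a and columns the detected label s_o.
    Labels that never occur as s_a get NaN instead of a probability.
    """
    s_a = np.asarray(s_a)
    s_o = np.asarray(s_o)
    conf = np.zeros((num_labels, num_labels), dtype=int)
    np.add.at(conf, (s_a, s_o), 1)  # count each (actual, detected) pair
    totals = conf.sum(axis=1)
    prob = np.divide(
        np.diag(conf), totals,
        out=np.full(num_labels, np.nan), where=totals > 0,
    )
    return conf, prob

# Toy labels (not measured data): three areas, two trials each.
actual = [0, 0, 1, 1, 2, 2]
detected = [0, 1, 1, 1, 2, 0]
conf_matrix, p_detect = detection_probability(actual, detected, num_labels=3)
average_detection = float(np.nanmean(p_detect))
```

Averaging the per-label values gives the average detection probability used when comparing different values of $U$ and $R_c$.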

D. Experiment Results
The results indicate that the proposed scheme with the concatenated CSI and frequency-domain sampling achieves better detection performance than the case with localized sampling. They also indicate that the data size can be compressed by choosing a proper compression ratio for a given $U$ and $N$. The results also clarify that almost the same detection probability is achieved when $R_c$ is 0.25 or higher (i.e., $D_s \le 4$). This result indicates that $R_c = 0.25$ ($D_s = 4$) is the near-optimum value that approximately meets the condition in (4). Fig. 16 shows the average detection probabilities of the proposed schemes for different antenna spacings at the AP, where $N = 1$ and $2$. The orange and blue bars respectively correspond to the cases of $U = 1$ and $4$. Results indicate that the achieved detection performance tends to improve as the antenna spacing at the AP side increases. It is also apparent that the achieved performance is improved by collecting CSI from multiple user devices irrespective of the antenna spacing $d$. To analyze the results further, we define the correlation between CSI samples for off-line training and those for online detection as $\rho^{(i,k)}_{j,l}$, where $\mathbf{x}^{(i)}_j = [x_{j1}, \ldots, x_{jQ}] \in \mathbb{R}^{1 \times Q}$ denotes the $j$-th training CSI labeled to the $i$-th area. Here, $Q$ denotes the number of features per CSI sample. $\mathbf{y}^{(k)}_l = [y_{l1}, \ldots, y_{lQ}] \in \mathbb{R}^{1 \times Q}$ denotes the $l$-th CSI for online detection when the target exists in the $k$-th area. Here, $\rho^{(i,i)}_{j,l}$ represents the correlation between CSI samples collected when the target is located in the $i$-th area for offline training and online detection. Fig. 17 shows the correlation between the CSI samples for offline training and those for online detection for different antenna spacings $d$, where $U = 4$ is used. Here, "$i = k$" (red box) and "$i \neq k$" (blue box) respectively denote the correlation between the same label numbers (i.e., correct label) and that between different label numbers (i.e., incorrect label). Results presented in Figs. 16 and 17 indicate that, as the antenna spacing becomes wider, $\rho^{(i,i)}_{j,l}$ ($i = k$) takes higher values than $\rho^{(i,k)}_{j,l}$ with $i \neq k$. Consequently, the detection performance improves as the antenna spacing widens.

In the evaluations described above, CSI samples for both off-line training and online detection are collected from the same person. To clarify the detection performance when the trained model is applied to different persons, i.e., when CSI for online detection is collected from different persons, the average detection probabilities of the proposed schemes for different targets (persons A and B) are shown in Fig. 18. Here, the model is trained using CSI collected for person A (170 cm height, 50 kg weight) and applied to on-line detection of person B (174 cm height, 58 kg weight). From these results, it is readily apparent that the proposed method achieves similar detection performance for both persons A and B. This result implies that online detection performs well if CSI is collected for similar targets, even when the model is trained using CSI for different persons. Fig. 19 presents the average detection performance as a function of the number of training CSI samples in cases of $U = 4$. Here, each CSI sample includes feature information corresponding to beam-forming matrices for all subcarriers. Results show that the proposed method achieves similar average detection performance, even when the number of CSI samples per label is reduced to around 20. The results reveal that the proposed method works well with a small dataset.
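The comparison between same-label and different-label correlations can be sketched as follows. The exact correlation formula is not recoverable from the text, so this example assumes the Pearson correlation coefficient between feature vectors; that choice, the helper names, and the toy vectors are all assumptions made for illustration.

```python
import numpy as np

def pearson_rows(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Pearson correlation between every row of x (J, Q) and every row of y (L, Q).

    Assumes no row is constant, so every centered row has a nonzero norm.
    """
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    xn = xc / np.linalg.norm(xc, axis=1, keepdims=True)
    yn = yc / np.linalg.norm(yc, axis=1, keepdims=True)
    return xn @ yn.T

def split_by_label(rho, train_labels, online_labels):
    """Split correlations into same-label (i = k) and different-label (i != k)."""
    same = np.asarray(train_labels)[:, None] == np.asarray(online_labels)[None, :]
    return rho[same], rho[~same]

# Toy feature vectors (not measured CSI), Q = 3 features each.
x_train = np.array([[1.0, 2.0, 3.0], [1.0, 0.0, 1.0]])
y_online = np.array([[2.0, 4.0, 6.0], [3.0, 2.0, 1.0]])
rho = pearson_rows(x_train, y_online)
rho_same, rho_diff = split_by_label(rho, [1, 2], [1, 3])
```

A wider gap between the two groups means that samples from the correct area resemble the training data more than samples from other areas do, which is the property the antenna-spacing discussion relies on.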
To clarify the implications of channel aging, we evaluate the average detection performance as a function of elapsed time after CSI acquisition and training, as depicted in Fig. 20. Results show that the detection performance degrades gradually because of channel aging: the environmental conditions might change over time even if the state inside the room remains unchanged. However, it is also apparent that the proposed method using the concatenated CSI (U = 4) achieves higher detection performance than for the case of U = 1.
In a report of an earlier study [23], a CSI-based device-free human detection method using principal component analysis (PCA) was examined. For that method, the CSI samples are compressed with PCA and are used for training a deep-learning model. To clarify its effectiveness further, we applied this method to the same machine learning models with a small dataset and compared the results with those of the proposed method. Fig. 21 shows the average detection performance of the frequency-domain sampling-based method and the PCA-based method as a function of the compression ratio for various machine learning models. Here, we consider six typical ML models: random forest (RF), decision tree (DT) [39], logistic regression (LR) [40], support vector machine (SVM), K-nearest neighbors (KN) [42], and Gaussian naive Bayes (GNB) [43]. Results show that the proposed method with frequency-domain sampling achieves higher detection performance than the PCA-based method. This is because, in the case of a small dataset, the frequency-domain sampling method reduces the CSI size while better preserving key feature information.
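As a minimal sketch of the frequency-domain sampling idea, the code below keeps every $D_s$-th subcarrier of a CSI sample and reports the resulting compression ratio. It assumes that $R_c = 1/D_s$, which agrees with the pairing $R_c = 0.25$, $D_s = 4$ quoted in the text. The function names, the list-of-lists layout (one feature list per subcarrier), and the contiguous-block reading of "localized sampling" are hypothetical.

```python
def frequency_domain_sample(csi, d_s):
    """Keep every d_s-th subcarrier (uniform sampling across the band).

    csi: list of per-subcarrier feature lists (hypothetical layout).
    d_s: sampling interval in subcarriers, d_s >= 1.
    """
    if d_s < 1:
        raise ValueError("d_s must be a positive integer")
    return csi[::d_s]


def localized_sample(csi, d_s):
    """Keep a contiguous block with the same number of subcarriers.

    Assumption: 'localized sampling' is read here as taking adjacent
    subcarriers from one end of the band.
    """
    n_keep = len(frequency_domain_sample(csi, d_s))
    return csi[:n_keep]


def flatten(sampled):
    """Concatenate the retained per-subcarrier features into one vector."""
    return [value for subcarrier in sampled for value in subcarrier]


def compression_ratio(csi, d_s):
    """Fraction of subcarriers retained after frequency-domain sampling."""
    return len(frequency_domain_sample(csi, d_s)) / len(csi)
```

Unlike PCA, this kind of sampling needs no fitting step on the training data, which is one plausible reason it behaves well when only a few CSI samples per label are available.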
To assess the relation between the detection accuracy and the necessary complexity, we evaluate the respective execution times. Here, in addition to the six ML models considered in Fig. 21, we include linear SVM (LSVM) [41], giving seven typical ML models: RF, DT, LR, SVM, LSVM, KN, and GNB. Execution times were measured using the same workstation (64 GB memory, Core i7-10750H CPU; Intel Corp.). The tabulated results clarify that a shorter execution time can be achieved by adopting frequency-domain sampling ($R_c < 1$).
It is also noteworthy that the proposed scheme (U = 2) achieves a shorter execution time with higher detection performance than the conventional scheme with U = 1. This finding implies that concatenating multiple CSI samples is effective not only for precise detection, but also for accelerating the ML model training when the total amount of BFWs is the same. The relation between the detection performance and execution time for on-line target detection is presented in Fig. 22, where (M, N) is set to (4, 1) and (4, 2). Here, the execution time is defined as the total calculation time to obtain 7128 ML results. The separate yellow area in the figure is an enlarged view of the yellow-shaded part of the figure. Similarly to the off-line training case, the results show that a shorter execution time can be achieved by adopting frequency-domain sampling with a compression ratio of $R_c$. It is noteworthy that the proposed scheme using concatenated BFWs (U = 4) achieves not only better detection performance, but also a lower execution time than the conventional scheme (U = 1) in all cases. This is because the available number of features (i.e., concatenated BFWs) for U = 4 is one-fourth of that for U = 1, which leads to a shorter execution time. However, as the number of user devices (STAs) N increases, the execution time increases because the total amount of CSI samples is increased by a factor of N.
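The one-fourth argument above can be made concrete with a small sketch. It assumes that concatenation groups U consecutive BFW vectors into one longer sample without overlap, so that a fixed total amount of BFWs yields 1/U as many concatenated samples. This grouping rule and the function name `concatenate_bfws` are assumptions for illustration; the scheme's actual concatenation procedure may differ.

```python
def concatenate_bfws(bfws, u):
    """Group u consecutive BFW vectors into one concatenated sample.

    bfws: list of BFW feature vectors in acquisition order (hypothetical layout).
    u:    number of BFW vectors concatenated per sample, u >= 1.

    Assumption: groups do not overlap, and trailing vectors that do not
    fill a complete group are dropped.
    """
    if u < 1:
        raise ValueError("u must be a positive integer")
    n_groups = len(bfws) // u
    return [
        [value for vector in bfws[g * u:(g + 1) * u] for value in vector]
        for g in range(n_groups)
    ]
```

Under this assumption, eight BFW vectors give eight samples for U = 1 but only two samples for U = 4, each four times as long. The classifier is then invoked one-fourth as many times, which is consistent with the shorter execution time reported for U = 4.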

VI. CONCLUSION
As described herein, after developing a lightweight localization scheme that works with a small dataset, we used an indoor experiment and ray-tracing based simulation to demonstrate that it operates in real time. The proposed approach uses concatenated BFWs as spatial feature information both for training an ML model and for detecting the existence of a target object and its position. Through both simulation and experimentally obtained results, we have shown that the proposed approach is effective at enhancing the object position detection probability while reducing the complexity incurred when a supervised ML model is adopted. Practical experiments also indicate that localization performance can be improved by applying the proposed schemes to an MU-MIMO system with a distributed antenna array and multiple users. The concept is applicable to any wireless communication system, including single-carrier systems, if the acquisition of CSI between transceivers is possible. Application of the developed algorithms with a more powerful ML model under more varied scenarios, such as outdoor environments, is left as a subject for our future work.
The angles $\phi^k_{j,i}$ and $\psi^k_{j,i}$ are fed back to the AP, where $-\pi \le \phi^k_{j,i} \le \pi$ and $0 \le \psi^k_{j,i} \le \pi/2$. Fig. 23(a) presents an example of the angle information $\phi^k_{j,i}$. It takes discontinuous values (a significant change) at the boundary of $\pi$ irrespective of the object status, which might degrade the object detection accuracy. To mitigate this negative effect, this paper adopts pre-processing that transforms the angle information to the trigonometric functions $\sin(\phi^k_{j,i})$ and $\cos(\phi^k_{j,i})$, as shown in Fig. 23(b). This difficulty can also be resolved by decompressing $\phi^k_{j,i}$ and $\psi^k_{j,i}$ to $V_k$. Because transformation to trigonometric functions is simpler than decompression to $V_k$, this paper adopts the pre-processing explained above.
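The effect of this pre-processing can be seen in a short sketch. Two angles on opposite sides of the $\pm\pi$ boundary are numerically far apart as raw values, yet nearly identical after the sine and cosine transform. The function names `encode_phi` and `feature_distance` are hypothetical, and only $\phi$ is transformed here because the text describes the transform for $\phi$ alone.

```python
import math


def encode_phi(phi):
    """Map an angle phi in [-pi, pi] to the pair (sin(phi), cos(phi)).

    The pair varies continuously across the +/-pi boundary, unlike phi itself.
    """
    return (math.sin(phi), math.cos(phi))


def feature_distance(a, b):
    """Euclidean distance between two (sin, cos) feature pairs."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Two angles that describe almost the same phase but sit on opposite
# sides of the wrap-around boundary.
eps = 1e-3
phi_high = math.pi - eps
phi_low = -math.pi + eps

raw_gap = abs(phi_high - phi_low)  # close to 2*pi
encoded_gap = feature_distance(encode_phi(phi_high), encode_phi(phi_low))  # close to 0
```

The cost of this choice is that each angle becomes two features instead of one, in exchange for removing the artificial jump that a classifier might otherwise mistake for a change in the environment.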