Indoor 3-Dimensional Visible Light Positioning Based on Smartphone Camera: Error Metric and LED Layout Optimization

We consider 3-dimensional (3D) visible light positioning (VLP) based on a smartphone camera in an indoor scenario. Based on the positioning model in the quantized pixel domain, we characterize the 3D normalized positioning error metric (NPEM) through the partial derivative of the positioning function, and evaluate the NPEM for horizontal and non-horizontal receiver camera positions. Moreover, for a horizontal receiver terminal, we explore the relationship between the NPEM and the light-emitting diode (LED) cell layout, approximate the relationship between the NPEM and the number of LEDs captured by the camera, and evaluate the approximation accuracy against the simulated positioning error. Based on the approximation results, we optimize the LED transmitter cell layout to minimize the NPEM, assuming structured square cell layouts with certain distance parameters.


I. INTRODUCTION
INDOOR positioning systems (IPSs) have attracted extensive attention in a wide range of applications, e.g., indoor positioning in shopping malls and museums. To date, many IPSs have been proposed using various technologies such as WiFi [1], [2], Bluetooth [3], infrared [4], fingerprinting [5], [6], radio frequency identification (RFID) [7], and ultra-wideband (UWB) [8], [9]. WiFi/Bluetooth-based IPSs typically suffer from low positioning accuracy as a result of multi-path effects. Fingerprinting/RFID-based IPSs realize positioning by comparing the received signal with the information stored in a database, which requires a large database and may lead to low positioning accuracy under small changes of the electromagnetic propagation environment.

The above issues can be addressed by visible light positioning (VLP) [10]. In VLP, a photodetector (PD) is adopted as the receiver [11], [12], [13], [14], which detects and analyzes optical characteristics, e.g., time of arrival (TOA)/time difference of arrival (TDOA) [15], received signal strength (RSS) [16], [17], [18], and phase of arrival (POA)/phase difference of arrival (PDOA) [19]. Among them, the RSS-based VLP system generally assumes that the transmission power is perfectly known and remains constant with time, while POA/PDOA-based and TOA/TDOA-based VLP systems need extremely accurate time/phase information [19]. Besides, the light power detected by a PD is sensitive to the incident direction of the light beam, which severely limits user mobility [20].

An alternative solution for VLP is to adopt an image sensor (IS), e.g., a camera, as the receiver [21], [22], [23], [24], [25], [26]. Ref. [21] demonstrated a VLP system employing both a PD and a camera, which can achieve centimeter-level accuracy in simulations. Ref. [24] employed a camera with a fish-eye lens to reduce the positioning error by capturing more LEDs. Ref. [26] proposed a VLP system where differential 3-dimensional (3D) space coordinates are sent from each LED transmitter to an IS receiver using a camera. In recent years, various high-precision methods and positioning systems using a smartphone camera as the receiver have been proposed [27], [28], [29], [30], [31]. A detailed overview of VLP approaches and their accuracy is provided in [32] and [33].

In this work, we do not commit to a specific positioning approach, but focus on the fundamental factors related to the positioning error. We consider a camera-based 3D VLP system in an indoor scenario, and utilize the transformations between different coordinate systems to estimate the user position at any rotation angle [34], [35]. Rather than the positioning accuracy of a particular positioning algorithm, we focus on the 3D normalized positioning error metric (NPEM). Specifically, we characterize the NPEM through the partial derivative of the positioning function, and investigate the NPEM under horizontal and non-horizontal terminal positions, which have not been explored before. Moreover, we approximate the relationship between the NPEM and the number of captured LEDs under parallel transmitter and receiver planes. The accuracy of the approximate relationship is demonstrated via simulation results. Finally, we propose a general form of the LED cell layout optimization problem and optimize the LED cell layout parameters under a square layout assumption to minimize the NPEM, which may provide a reference metric for possible dense indoor LED layouts.

The remainder of this paper is organized as follows. Section II introduces the translation and rotation model in camera imaging and 3D VLP. Section III analyzes the NPEM and the corresponding simulation results.
Section IV explores the relationship between the NPEM and the LED cell layout for parallel transmitter and receiver planes, evaluates the accuracy of the approximated results against the simulated positioning error, and explores the NPEM in infinite LED cell layout space under parallel and non-parallel transmitter and receiver planes. Section V optimizes the LED transmitter layout to minimize the NPEM under the assumption of structured square LED cell layout and provides possible LED layout schemes. Finally, Section VI concludes the paper. For ease of reading, the main abbreviations used in this work and their full forms are listed in Table I, and the meanings of the variables with physical significance are listed in Table II.

A. VLP System With Four Coordinate Systems
We consider a 3D VLP system, where multiple LEDs with known positions are adopted as anchor points and a camera is adopted as the terminal to be positioned. The LEDs in the 3D space are projected onto the 2-dimensional (2D) pixel array of the terminal, which is utilized to estimate the camera center position. Such a VLP system exploits the geometric relations between the 3D LED positions in the real world and their 2D projections on the image plane.
We consider four coordinate systems for the camera-based VLP system, namely the world coordinate system (WCS), camera coordinate system (CCS), image coordinate system (ICS), and pixel coordinate system (PCS) [29], [35], [36]. The WCS and CCS are three-dimensional, while the ICS and PCS are two-dimensional.
In the WCS, the camera center position to be estimated is denoted as $(U, V, W)$. In the CCS, such position is denoted as $(X, Y, Z)$, where the origin is located at the camera center, the x and y axes are parallel to the imaging plane, and the z axis is the optical axis of the lens. The imaging plane is located at a distance equal to the focal length $f$ from the camera center. In the ICS, the position in the CCS is projected onto the imaging plane, denoted as $(x_I, y_I)$. In the PCS, the image in the ICS is quantized into integer or half-integer pixels on the camera's CCD/CMOS chip, denoted as $(u_p, v_p)$.
1) The Coordinate Rotation: Assume that the rotation matrices are all calculated in the right-handed coordinate system, where counter-clockwise is the positive direction. Assume rotation angles $\theta_x$, $\theta_y$ and $\theta_z$, taken counter-clockwise around the x, y and z axes from WCS to CCS, respectively. The rotation matrices around the three axes, $R_x(\theta_x)$, $R_y(\theta_y)$ and $R_z(\theta_z)$, are given by (1). The 3D rotation matrix $R(\theta_x, \theta_y, \theta_z)$ in (2) is obtained by left-multiplying the above three matrices according to the rotation order.
2) Translation and Rotation Model: Consider the transformation from WCS to CCS, as shown in Fig. 1. Such transformation can be expressed as (3), where $(x_c^w, y_c^w, z_c^w)$ is the origin coordinate of the CCS in the WCS, which is to be estimated.
The transformation from ICS to PCS involves translation and scaling, as shown in Fig. 3. The ICS origin is the intersection of the camera's optical axis and the imaging plane, located at the image center point. The x-axis and y-axis are parallel to the u-axis and v-axis, respectively, and $(u_0, v_0)$ is the coordinate of the ICS origin in the PCS. Let $s_x$ and $s_y$ be the length per pixel along the x-axis and y-axis, respectively, in mm/pixel. Then, the quantization from ICS to PCS can be written as (5), where $Q[\cdot]$ represents the pixel quantization operation to integer pixel values. Such operation introduces pixel quantization error, which in turn leads to positioning error.
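For reference, the elementary rotation matrices in a right-handed coordinate system with counter-clockwise positive angles take the standard form below. The composition order $R = R_z R_y R_x$ (rotation about x first, with each subsequent rotation multiplied on the left) is an assumption made here for illustration; it is compatible with the description above, but the rotation order is not stated explicitly in the text.

```latex
\begin{align}
R_x(\theta_x) &= \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_x & -\sin\theta_x \\ 0 & \sin\theta_x & \cos\theta_x \end{bmatrix}, &
R_y(\theta_y) &= \begin{bmatrix} \cos\theta_y & 0 & \sin\theta_y \\ 0 & 1 & 0 \\ -\sin\theta_y & 0 & \cos\theta_y \end{bmatrix}, \notag \\
R_z(\theta_z) &= \begin{bmatrix} \cos\theta_z & -\sin\theta_z & 0 \\ \sin\theta_z & \cos\theta_z & 0 \\ 0 & 0 & 1 \end{bmatrix}, &
R(\theta_x, \theta_y, \theta_z) &= R_z(\theta_z)\, R_y(\theta_y)\, R_x(\theta_x). \notag
\end{align}
```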

B. 3D VLP Using Camera
Let $L_i^w$, $L_i^c$, $L_i^I$ and $L_i^p$ denote the coordinates of LED $i$ in the WCS, CCS, ICS and PCS, respectively ($i = 1, 2, 3, \cdots$). Similarly, let $C^w$, $C^c$, $C^I$ and $C^p$ denote the corresponding coordinates of the camera center in the four coordinate systems. These coordinates are summarized in (6). The transformation from WCS to CCS is given by (3), and the transformation from CCS to ICS is given by (4). Based on the pixel coordinates in the PCS plane, the transformation from ICS to PCS is given by (5). Starting from (2) and substituting $x_i^c$, $y_i^c$, $z_i^c$ into the transformation from WCS to PCS based on (3)-(5), the above coordinate system transformations can be expressed as (7), from which the camera position $(x_c^w, y_c^w, z_c^w)$ can be estimated along with the rotation angles $\theta_x$, $\theta_y$, $\theta_z$.
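The chain of transformations from WCS to PCS can be sketched in a few lines. This is a minimal illustration rather than the exact formulation of (3)-(7): the rotation order `rot_z @ rot_y @ rot_x`, the form `R @ (L - C)` for the WCS-to-CCS step, the pinhole projection, the use of rounding for $Q[\cdot]$, and all function and parameter names are assumptions.

```python
import numpy as np

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def world_to_pixel(led_w, cam_w, angles, f, s_x, s_y, u0, v0):
    """Map an LED position in the WCS to quantized pixel coordinates (assumed model)."""
    # Assumed rotation order: about x first, then y, then z.
    R = rot_z(angles[2]) @ rot_y(angles[1]) @ rot_x(angles[0])
    # WCS -> CCS: translate to the camera center, then rotate.
    x_c, y_c, z_c = R @ (np.asarray(led_w, float) - np.asarray(cam_w, float))
    # CCS -> ICS: pinhole projection onto the imaging plane at focal length f.
    x_i, y_i = f * x_c / z_c, f * y_c / z_c
    # ICS -> PCS: scale by pixel size, shift by the principal point, quantize.
    return np.round(x_i / s_x + u0), np.round(y_i / s_y + v0)
```

For example, with a hypothetical focal length of 4 mm, a pixel size of 1.675e-3 mm/pixel and the principal point at the center of a 2560 × 1536 sensor, an LED directly above the camera maps to the principal point (1280, 768) for any rotation about the z axis.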
Remark 1: In this work, we provide an analytical metric for indoor camera-based positioning error. Then, we show that such metric is in inverse proportion to the number of captured LEDs. Finally, based on the inverse proportion, we provide an LED layout optimization framework. It is shown that the optimized LED layout from the inverse proportion leads to lower positioning error.

A. Normalized Positioning Error Metric
From (7), the pixel coordinates of each LED can be written as functions of the receiver position, as in (8). To characterize the effect of pixel-domain quantization on the receiver position estimate, we first investigate its inverse, i.e., how the pixel-domain projection varies with the receiver position. We first take the partial derivative of $F_{x,i}$ with respect to $x_c^w$; the variation of the pixel-domain projection with the receiver position is then given by (10), which is combined with (8) in the subsequent derivation. Assuming square pixels, as is the case for common image sensors, we let $s_x = s_y$ in the following analysis. Without loss of generality, we normalize the pixel size to 1 for numerical convenience.
The partial derivative matrix in the pixel domain, denoted as $\Delta C^w$, is the partial derivative of $(u_i^p, v_i^p)$ for all $1 \le i \le n$ with respect to $x_c^w$, $y_c^w$, $z_c^w$. Assuming $n$ LEDs for positioning, the partial derivative matrix $\Delta C^w$ can be written as (12). Note that for a horizontal receiver plane with $\theta_x = \theta_y = 0$, we have $R(\theta_x, \theta_y, \theta_z) = R_z(\theta_z)$ and $A_i = z_i^w - z_c^w$, which is constant given receiver height $h$.
Remark 2: Note that matrix $\Delta C^w$ provides the linear relationship between a small variation of the image-plane position and a small variation of the receiver position. We adopt the Moore-Penrose (MP) pseudo-inverse of matrix $\Delta C^w$ to characterize the small variation of the receiver position caused by a small variation of the image-plane position due to pixel-level quantization. As will be shown in the following, the expectation of the receiver position variation is calculated as the sum of the squared singular values of the pseudo-inverse matrix and adopted as the NPEM.
Assuming independent pixel-domain quantization error, we investigate the WCS positioning error due to the quantization error. Denote the MP inverse of $\Delta C^w$ as $(\Delta C^w)^+$ and the pixel-domain quantization error as $e$, whose covariance $E[ee^H]$ is characterized by $\beta I_m$ for a certain constant $\beta$, owing to the independent quantization error assumption, image drift from slight position variance, and light intensity fluctuation; constant $\beta$ also depends on the camera and its pixel-plane characteristics. Given the pixel quantization error $e$ in camera-based VLP, we adopt the least-squares (LS) criterion and take the minimum-norm solution, obtained by multiplying $e$ by the MP inverse $(\Delta C^w)^+$ introduced above, as the performance metric of positioning error. The expectation of its Frobenius norm is then evaluated. Accordingly, we define the normalized positioning error metric (NPEM), given by (15), to characterize the performance of camera-based positioning.
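As a rough numerical sketch of this metric, the snippet below builds the partial derivative matrix for the special case $R = I_3$ with unit pixel size and evaluates the sum of squared singular values of its MP pseudo-inverse, as described in Remark 2. The closed-form entries follow from an assumed pinhole projection $u_i = f (x_i^w - x_c^w) / (z_i^w - z_c^w)$; the normalization $\beta = 1$, the focal length value and the function names are illustrative assumptions.

```python
import numpy as np

def jacobian(leds, cam, f):
    """2n x 3 matrix of d(u_i, v_i) / d(x_c, y_c, z_c) for R = I and unit pixel size."""
    d = np.asarray(leds, float) - np.asarray(cam, float)
    X, Y, Z = d[:, 0], d[:, 1], d[:, 2]
    zero = np.zeros_like(Z)
    J = np.empty((2 * len(d), 3))
    J[0::2] = np.column_stack([-f / Z, zero, f * X / Z**2])  # rows for u_i
    J[1::2] = np.column_stack([zero, -f / Z, f * Y / Z**2])  # rows for v_i
    return J

def npem(J):
    """Sum of squared singular values of the MP pseudo-inverse of J (beta = 1 assumed)."""
    sv = np.linalg.svd(np.linalg.pinv(J), compute_uv=False)
    return float(np.sum(sv**2))
```

Because appending rows to the matrix can only add information, the value returned by `npem` decreases when more LEDs are included, which matches the qualitative behavior discussed in Section IV.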

B. NPEM of 3D Rotation in O-xyz Space
Consider the 3D rotation in the O-xyz space, where the rotation angles $(\theta_x, \theta_y, \theta_z)$ and the position $(x_c^w, y_c^w, z_c^w)$ are both unknown, with $\theta_x \in (-\frac{\pi}{2}, \frac{\pi}{2})$, $\theta_y \in (-\frac{\pi}{2}, \frac{\pi}{2})$ and $\theta_z \in (0, 2\pi)$. Fig. 4 shows the positioning configuration under rotation matrix $R(\theta_x, \theta_y, \theta_z)$. The coverage area of a camera is a cone with $(x_c^w, y_c^w, z_c^w)$ as the vertex and the receiver FOV as the opening angle. Then, the LEDs captured by the smartphone camera are those contained within the intersection of the cone and the transmitter plane.
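One simple way to decide which LEDs fall inside the camera's cone is to compare the angle between each LED direction and the optical axis with half the FOV. The sketch below assumes the WCS-to-CCS convention `R @ (L - C)` with the optical axis along the CCS z axis; the function name and the FOV value used in the example are hypothetical.

```python
import numpy as np

def captured_mask(leds, cam, R, fov):
    """Boolean mask of LEDs inside the viewing cone with full opening angle `fov` (radians)."""
    # Rows of d are the LED directions expressed in the CCS, i.e. R @ (L_i - C).
    d = (np.asarray(leds, float) - np.asarray(cam, float)) @ np.asarray(R, float).T
    # Cosine of the angle between each direction and the optical (z) axis.
    cos_angle = d[:, 2] / np.linalg.norm(d, axis=1)
    return cos_angle >= np.cos(fov / 2)
```

With `R = np.eye(3)`, a camera at height 1 m and a 90° FOV, an LED directly overhead at 2.75 m is captured, while an LED 5 m away horizontally is not.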

C. NPEM of 2D Rotation in xoy-Plane
Considering the 2D rotation in the xoy-plane with multiple LEDs, we have the following theorem.
Theorem 1: The NPEM is constant with respect to $\theta_z$ under the 2D xoy-plane rotation for parallel transmitter and receiver planes.
Proof: Since the transmitter plane is parallel to the horizontal receiver plane, we have $\theta_x = \theta_y = 0$ and $R_x(\theta_x) = R_y(\theta_y) = I_3$. Thus, the 3D rotation matrix reduces to $R(\theta_x, \theta_y, \theta_z) = R_z(\theta_z)$. Substituting it into $M_i$ and $N_i$ in (12), (12) can be rewritten as (18). Using (18), we have (19). Furthermore, based on the relationship between the eigenvalues of matrix $(\Delta C^w)^H \cdot \Delta C^w$ and the singular values $\sigma_i$ of matrix $\Delta C^w$, we have (20). According to (15) and (20), the NPEM does not depend on $\theta_z$ for parallel transmitter and receiver planes.
For the rotation in the 2D xoy-plane, we have $R(\theta_x, \theta_y, \theta_z) = R_z(\theta_z)$. Then, (12) can also be simplified in terms of the pixel coordinates $(u_i^p, v_i^p)$, as given by (21). Consider the 2D xoy-plane rotation with 3 LEDs in Fig. 6(a), where the camera resolution is 2560 × 1536 [37], [38]. Table III shows the measurements of $(u_i^p, v_i^p)$ under 8 values of $\theta_z$ at point (150, 50) [37], [38]. The singular values and NPEM are shown in Fig. 6(b) according to (21). Such experimental results verify Theorem 1, i.e., the NPEM is not sensitive to $\theta_z$ if the transmitter plane and receiver plane are parallel.
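Theorem 1 can also be checked numerically. Under a rotation about the z axis, each pair of rows of the partial derivative matrix is multiplied by the same 2 × 2 rotation, which is orthogonal and therefore leaves the singular values unchanged. The sketch below assumes a pinhole model with unit pixel size; the LED positions, focal length and function names are illustrative only.

```python
import numpy as np

def jacobian_rz(leds, cam, theta_z, f):
    """Partial derivative matrix for R = R_z(theta_z), parallel planes, unit pixel size."""
    c, s = np.cos(theta_z), np.sin(theta_z)
    R2 = np.array([[c, -s], [s, c]])  # in-plane rotation acting on (u, v)
    blocks = []
    for X, Y, Z in np.asarray(leds, float) - np.asarray(cam, float):
        # 2 x 3 block of d(u_i, v_i) / d(x_c, y_c, z_c) at theta_z = 0.
        J0 = np.array([[-f / Z, 0.0, f * X / Z**2],
                       [0.0, -f / Z, f * Y / Z**2]])
        blocks.append(R2 @ J0)
    return np.vstack(blocks)

def singular_values(J):
    """Singular values of J in descending order."""
    return np.linalg.svd(J, compute_uv=False)
```

Evaluating `singular_values(jacobian_rz(leds, cam, t, f))` for several angles `t` returns the same values up to floating-point error, so any metric built from them is independent of $\theta_z$.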

IV. THE RELATIONSHIP BETWEEN NPEM AND LED CELL LAYOUT
We consider the scenario of parallel transmitter and receiver planes, e.g., where both planes are horizontal. One potential application lies in wheeled mobile robots, where the LED transmitters are installed on the ceiling parallel to the horizontal plane, and the camera receiver is mounted on top of the robot, also parallel to the horizontal plane. For parallel transmitter and receiver planes, we have $\{\theta_x = \theta_y = 0, \theta_z \in [0, 2\pi)\}$. We further consider the NPEM under square, hexagonal and triangular cell layouts, as shown in Fig. 7, where each LED is located at the centroid of one cell. The LED density of the three cell layouts is one LED per square meter, which determines the LED spacing of the square, hexagonal and triangular cells.
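As an illustration of how the unit-density constraint fixes the spacing, assume regular cells of side length $a$ with one LED at each centroid, and let $d$ denote the distance between nearest-neighbor LEDs. The values below are derived under that assumption and are not quoted from the original text.

```latex
\begin{align}
\text{square:}\quad & a^2 = 1 \;\Rightarrow\; d_{\mathrm{sq}} = a = 1~\text{m}, \notag \\
\text{hexagonal:}\quad & \tfrac{3\sqrt{3}}{2}\, a^2 = 1 \;\Rightarrow\; d_{\mathrm{hex}} = \sqrt{3}\, a = \sqrt{2/\sqrt{3}} \approx 1.075~\text{m}, \notag \\
\text{triangular:}\quad & \tfrac{\sqrt{3}}{4}\, a^2 = 1 \;\Rightarrow\; d_{\mathrm{tri}} = a/\sqrt{3} = \sqrt{4/(3\sqrt{3})} \approx 0.877~\text{m}. \notag
\end{align}
```

Under this assumption, the triangular layout has the smallest LED spacing and the hexagonal layout the largest, for the same LED density.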

A. The Relationship Between NPEM and the Number of Captured LEDs
We investigate the relationship between NPEM and the number of captured LEDs under three cell layouts for dense receiver points at different receiver heights.
For parallel transmitter and receiver planes (i.e., $\theta_x = \theta_y = 0$), we obtain the NPEM along with the number of captured LEDs at receiver points $x_c^w \in (0 : 0.1 : 2.5)$, $y_c^w \in (0 : 0.1 : 5)$ and $z_c^w = h$. The fitting result of the NPEM with respect to the number of captured LEDs at $h = 1$ m for the square cell layout is shown in Fig. 8. It can be seen that the relationship between the NPEM and the number of captured LEDs under the square cell layout is well approximated by $y = kx^{-1}$. Similar results are obtained for the hexagonal and triangular cell layouts. The relationship between the NPEM and the number of captured LEDs for all three cell layouts at different receiver heights is shown in Fig. 9. Table IV shows the fitting parameters and the approximation accuracy at receiver heights of 0 m, 0.5 m and 1 m for the three cell layouts. The R-square value represents the quality of data fitting; it lies in the range [0, 1], and the closer it is to 1, the better the fit. It can be seen that the NPEM is approximately in inverse proportion to the number of captured LEDs, with close fitting coefficients for all three cell layouts given receiver height $h$. For a given layout, taking the square cell layout as an example, it can also be seen that coefficient $k$ decreases as receiver height $h$ increases, since the shorter transceiver distance leads to lower NPEM under the same number of captured LEDs.
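The inverse-proportion fit and its R-square value can be reproduced with a closed-form least-squares estimate. The sketch below is one straightforward way to do this and is not necessarily the fitting tool used for Table IV; the function name is hypothetical and any input data would need to come from the NPEM evaluation itself.

```python
import numpy as np

def fit_inverse(x, y):
    """Least-squares fit of y = k / x; returns (k, R-square)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # Minimizing sum (y - k/x)^2 over k gives k = sum(y/x) / sum(1/x^2).
    k = np.sum(y / x) / np.sum(1.0 / x**2)
    residual = y - k / x
    r2 = 1.0 - np.sum(residual**2) / np.sum((y - y.mean())**2)
    return float(k), float(r2)
```

Here `x` would hold the number of captured LEDs at each receiver point and `y` the corresponding NPEM values.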
We can obtain the mean NPEM at different receiver heights for the three cell layouts from Fig. 9, which is shown in Table V. It can also be seen that under the density of one LED per square meter, the triangular cell layout with smaller LED spacing leads to lower NPEM, since more LEDs are likely to be captured, while the hexagonal cell layout shows higher NPEM due to larger LED spacing. However, the hexagonal cell layout is preferred for communication due to weaker inter-cell interference. Thus, the optimal layouts for communication and positioning may not be perfectly aligned, which raises another problem on the joint design of the transmitter layout for communication and positioning.

B. The Relationship Between Simulated Positioning Error and the Number of Captured LEDs for Square Cell Layout
To show that the inverse-proportion approximation $y = kx^{-1}$ is valid for camera-based positioning, we carry out simulations and obtain the relationship between the simulated positioning error and the number of captured LEDs, under square cell layout for parallel transmitter and receiver planes. Since $\theta_z$ does not affect the NPEM, we set $\theta_z = 0$ such that the rotation matrix is $R = I_3$. We denote the estimate of the user position as $(\hat{x}_c^w, \hat{y}_c^w, \hat{z}_c^w)$, and define the positioning error with respect to the true position $(x_c^w, y_c^w, z_c^w)$. Based on the transformations among WCS, CCS, ICS and PCS, we first obtain the projection relationship for $R = I_3$ without pixel quantization error. Then, with pixel-domain quantization, we adopt (23) for terminal positioning. Given $z_c^w$, we can obtain the estimates $\hat{x}_c^w$ and $\hat{y}_c^w$ by averaging the solutions of (23) over the coordinates $\{(x_i^w, y_i^w, z_i^w)\}$ of the captured LEDs, as in (24). We then define a square-based distortion, and height $\hat{z}_c^w$ is estimated as its minimizer, as in (25). The optimization problem (25) can be solved through exhaustive search. With the estimated $\hat{z}_c^w$, we can obtain the estimates $\hat{x}_c^w$ and $\hat{y}_c^w$ based on (24).
We investigate the simulated positioning error under different pixel sizes. Specifically, we let $s_x = s_y = 1.675 \times 10^{-3}$ mm/pixel under the square cell layout constraint [37], [38], and take 50 equally spaced pixel size values from $0.5 s_x$ to $5 s_x$. The simulated positioning error versus the number of captured LEDs at different receiver heights is shown in Fig. 10(a). Fig. 10(b) shows the mean simulated positioning error versus the number of captured LEDs. Table VI shows the fitting parameters and the approximation accuracy between the mean simulated positioning error and the number of captured LEDs at receiver heights of 0 m, 0.5 m and 1 m, respectively.
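The simulation procedure can be sketched as follows for $R = I_3$. This is a minimal illustration under stated assumptions rather than the exact form of (23)-(25): the pinhole projection with rounding as quantization, the distortion defined as the squared spread of the per-LED solutions around their mean, the focal length, and all function names are assumptions. FOV limits are ignored, so all listed LEDs are treated as captured.

```python
import numpy as np

def quantized_pixels(leds, cam, f, s):
    """Pixel offsets from the principal point for R = I, rounded to integer pixels."""
    d = np.asarray(leds, float) - np.asarray(cam, float)
    return np.round(f * d[:, :2] / d[:, 2:3] / s)

def xy_at_height(leds, pix, z, f, s):
    """Average the per-LED (x, y) solutions at trial height z; also return their spread."""
    leds = np.asarray(leds, float)
    # Back-project each quantized pixel to a camera (x, y) estimate at height z.
    per_led = leds[:, :2] - pix * s * (leds[:, 2:3] - z) / f
    mean = per_led.mean(axis=0)
    return mean, float(np.sum((per_led - mean) ** 2))

def locate(leds, pix, f, s, z_grid):
    """Exhaustive search over candidate heights, then averaged (x, y) at the best one."""
    z_hat = min(z_grid, key=lambda z: xy_at_height(leds, pix, z, f, s)[1])
    xy_hat, _ = xy_at_height(leds, pix, z_hat, f, s)
    return np.array([xy_hat[0], xy_hat[1], z_hat])
```

At the true height, all per-LED solutions agree up to quantization error, so the spread is smallest there; this is why the exhaustive search over `z_grid` recovers the height when at least two LEDs at different horizontal positions are captured.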
It can be seen that the relationship between the mean simulated positioning error and the number of captured LEDs can be approximated by $y = kx^{-1} + c$, which agrees with the fitting results of the NPEM in Section IV-A except for a constant term. The constant term $c$ can be attributed to the imperfect solution of the positioning approach in (24) and (25) and to the numerical procedure in the positioning simulation. It does not affect the inverse relationship between the positioning error and the number of captured LEDs, which supports the objective of minimizing the NPEM. Moreover, fitting coefficient $k$ decreases as receiver height $h$ increases, i.e., a shorter transceiver distance leads to lower simulated positioning error under the same number of captured LEDs.

A. Optimization Problem Formulation
Based on the fitting relationship between NPEM and the number of captured LEDs under different cell layouts, we investigate the LED cell layout optimization under parallel transmitter plane and receiver plane.
We optimize the LED cell layout based on the NPEM of $I$ points. Let $n$ be the number of transmitter LEDs, and $n_c^i$ be the number of captured LEDs at receiver $(x_i, y_i)$, for $i = 1, 2, \ldots, I$, which can be obtained according to the illustration of captured LEDs in Section III-B. Fig. 11 shows the coverage radius $r$, given by $r = d_{tr} \cdot \tan(\frac{\mathrm{FOV}}{2})$, where $d_{tr}$ is the distance between the transmitter and the receiver, and $n_c^i$ is the number of LEDs inside the circle of radius $r$ centered at $(x_i, y_i)$. We optimize the LED cell layout to minimize the mean NPEM over all the $I$ points, via optimizing the transmitter LED coordinates $(x_j^w, y_j^w, z_j^w)$, for example with constant height $z_j^w = 2.75$ m for $j = 1, 2, \ldots, n$. We set a lower bound on the minimum distance between two LEDs, denoted as $d_{\min}$.
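The coverage radius and the captured-LED count for parallel planes can be computed directly from the expressions above. The function names below are hypothetical, and the FOV value in the example is an arbitrary illustration rather than a parameter taken from the paper.

```python
import numpy as np

def coverage_radius(d_tr, fov):
    """Radius of the coverage circle on the transmitter plane; fov in radians."""
    return d_tr * np.tan(fov / 2)

def count_captured(leds_xy, rx_xy, r):
    """Number of LEDs whose horizontal distance to the receiver is at most r."""
    diff = np.asarray(leds_xy, float) - np.asarray(rx_xy, float)
    return int(np.sum(np.linalg.norm(diff, axis=1) <= r))
```

For instance, with the LEDs at 2.75 m, a receiver at 1 m height (so $d_{tr} = 1.75$ m) and a 90° FOV, the coverage radius is about 1.75 m.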
Moreover, since both the fitting results and the simulated positioning show that the positioning error is approximately in inverse proportion to the number of captured LEDs, we convert minimizing the NPEM into minimizing the mean of $1/n_c^i$, as formulated in Problem (26). We aim to optimize the LED layout pattern based on the inverse proportion of the positioning error with respect to the number of captured LEDs. The LED layout pattern can be optimized assuming that each LED can be placed at any position on the ceiling. However, such optimization suffers from high dimensionality, which increases the optimization difficulty. Thus, we propose a non-uniform rectangular pattern for the LED layout to reduce the optimization dimension. Moreover, the distance between two adjacent LEDs significantly affects the illumination homogeneity, where a larger distance leads to weaker illumination in the area between the two LEDs. Thus, for illumination homogeneity, a lower bound on the minimum distance is necessary in the LED layout optimization.
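Under these assumptions, the layout problem can be written compactly. The form below is a sketch consistent with the description above (the mean of $1/n_c^i$ over the $I$ receiver points, with a minimum LED spacing $d_{\min}$); the exact constraint set of the original formulation may differ.

```latex
\begin{align}
\min_{\{(x_j^w,\, y_j^w)\}_{j=1}^{n}} \quad & \frac{1}{I} \sum_{i=1}^{I} \frac{1}{n_c^i} \notag \\
\text{s.t.} \quad & \sqrt{(x_j^w - x_k^w)^2 + (y_j^w - y_k^w)^2} \;\ge\; d_{\min}, \quad \forall\, j \ne k. \notag
\end{align}
```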
A genetic algorithm is adopted for the layout optimization. Numerical results show that the proposed non-uniform rectangular pattern yields lower positioning error than the layout obtained when each LED can be placed at any position, since the lower optimization dimension improves the search capability of the genetic algorithm.

B. Rectangular LED Cell Layout Optimization
We investigate the NPEM under a non-uniform $M \times N$ rectangular cell layout for a common indoor room, as shown in Fig. 12. Such a layout yields a rectangular pattern, but the spacings between adjacent rows and columns can be designed. The row and column spacings, denoted as $d_{r,i}$ ($i = 1, 2, 3, \ldots, M + 1$) and $d_{c,j}$ ($j = 1, 2, 3, \ldots, N + 1$), respectively, are symmetric, i.e., row spacing $d_{r,i} = d_{r,M+2-i}$ and column spacing $d_{c,j} = d_{c,N+2-j}$. For illumination uniformity, we have $d_{r,i}, d_{c,i} \ge d_{\min}$ for $i \ge 2$.
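LED coordinates follow from the spacings by cumulative summation. The sketch below assumes that the first and last spacings are the margins to the walls (so that $M$ rows require $M + 1$ spacings), and that rows run along y and columns along x; the helper names are hypothetical.

```python
import numpy as np

def mirror(half, total):
    """Extend the first ceil(total / 2) spacings to a symmetric list of length `total`."""
    half = list(half)
    return half + half[: total - len(half)][::-1]

def led_grid(row_spacings, col_spacings, height):
    """LED coordinates (x, y, z) of an M x N grid defined by M+1 row and N+1 column spacings."""
    ys = np.cumsum(row_spacings)[:-1]  # the last spacing is the margin to the far wall
    xs = np.cumsum(col_spacings)[:-1]
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel(), np.full(gx.size, float(height))])
```

For example, `mirror([0.5, 1.0, 1.0], 6)` gives the six spacings of a 5 × 5 grid with 0.5 m margins and 1 m interior spacing, and `led_grid` then places the LEDs at 0.5, 1.5, ..., 4.5 m along each axis. Only the first half of the spacings needs to be optimized, which is what reduces the search dimension.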
First, we consider the NPEM of a uniform 5 × 5 square cell layout as an example, i.e., $M = N = 5$ and $d_{r,i} = d_{c,j}$, with spacing $d_{r,i} = d_{c,j} = 0.5$ m for $i, j = 1$ and $d_{r,i} = d_{c,j} = 1$ m for $i, j = 2, 3$, as shown in Fig. 13(a). The NPEM is shown in Fig. 13(b), where the x- and y-axes represent the receiver points. The mean NPEM is 0.5831 over all the receiver points $(x_i, y_i) \in \{[0 : 1 : 5], [0 : 1 : 5]\}$.
Then, we optimize the NPEM under a non-uniform 5 × 5 rectangular cell layout, i.e., $M = N = 5$. The row spacing $d_{r,i}$ and column spacing $d_{c,j}$ are each symmetric, as shown in Fig. 14(a). Without loss of generality, we select $d_{\min} = 0.5$ m in the layout pattern optimization. Fig. 14(b) shows the optimized non-uniform rectangular cell layout with mean NPEM 0.5422, which is lower than that of the uniform square cell layout in Fig. 13. While the uniform square cell layout is generally preferred in VLC and commonly adopted in real indoor lighting, this result raises the challenge of balancing communication and positioning in the joint design of VLC and VLP.
Meanwhile, we solve Problem (26) through a genetic algorithm (GA), which consists of coding, a fitness function, and genetic operators (selection, crossover and mutation). We set the selection rate to 0.5, the crossover rate to 0.7 and the mutation rate to 0.001, while the number of iterations depends on the initial population size. In addition, to accelerate convergence and guarantee convergence to a high fitness value, we carry individuals with high fitness over to the following generation. Under the selection rate of 0.5, we select the half of the individuals with the highest fitness as the parents of the next generation. Fig. 15 shows the convergence of the fitness and the corresponding layout for an initial population of 5071 and 10 iterations.
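The GA loop described above can be sketched generically as follows. The selection, crossover and mutation rates match the values quoted in the text, and the best half of the population is kept as parents; the real-valued coding, uniform crossover, reset mutation, small default population and function names are assumptions made for illustration, since the text does not specify them. The cost function is minimized, so "high fitness" corresponds to low cost here.

```python
import numpy as np

def genetic_minimize(cost, lower, upper, pop_size=40, iterations=10,
                     select_rate=0.5, cross_rate=0.7, mutate_rate=0.001, seed=0):
    """Minimize cost(x) over the box [lower, upper]; returns (best_x, best-cost history)."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    pop = rng.uniform(lower, upper, size=(pop_size, lower.size))
    n_keep = max(2, int(select_rate * pop_size))
    history = []
    for _ in range(iterations):
        scores = np.array([cost(ind) for ind in pop])
        order = np.argsort(scores)
        history.append(float(scores[order[0]]))
        parents = pop[order[:n_keep]]            # selection: the best individuals survive
        children = []
        while len(children) < pop_size - n_keep:
            a, b = parents[rng.choice(n_keep, size=2, replace=False)]
            child = a.copy()
            if rng.random() < cross_rate:        # uniform crossover between two parents
                mask = rng.random(lower.size) < 0.5
                child[mask] = b[mask]
            mutate = rng.random(lower.size) < mutate_rate
            child[mutate] = rng.uniform(lower, upper)[mutate]  # reset mutation per gene
            children.append(child)
        pop = np.vstack([parents] + children) if children else parents
    scores = np.array([cost(ind) for ind in pop])
    best = int(np.argmin(scores))
    history.append(float(scores[best]))
    return pop[best], history
```

Because the parents are carried over unchanged, the best cost in `history` never increases from one generation to the next. For the layout problem, `cost` would evaluate the mean of $1/n_c^i$ for the LED coordinates (or spacings) encoded in `x`, with a penalty for violating $d_{\min}$.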
Similarly, we obtain a mean NPEM of 0.5777 under the layout optimized by the GA for the receiver points $(x_i, y_i) \in \{[0 : 1 : 5], [0 : 1 : 5]\}$, which is higher than that obtained under the optimized non-uniform rectangular cell layout, due to the high dimension of the optimization.
Moreover, we simulate the positioning error for the uniform pattern, the optimized pattern under the proposed non-uniform rectangular constraint, and the optimized pattern from the GA (as shown in Fig. 13(a), Fig. 14(b), and Fig. 15(b), respectively). Table VII shows the mean NPEM and positioning error under the different LED cell layouts, where the proposed non-uniform rectangular pattern shows the lowest mean NPEM and positioning error. Meanwhile, it can also be seen that lower NPEM leads to lower positioning error across the different LED layouts, which validates the LED layout optimization using the NPEM as the metric.

VI. CONCLUSION
We have considered smartphone camera-based 3D VLP in an indoor scenario, analyzed the 3D positioning error metric NPEM with pixel quantization error through the translation and rotation model under the four coordinate systems, and evaluated the NPEM numerically under horizontal and non-horizontal receiver planes. It is shown that the number of captured LEDs is symmetric about the rotation angle due to the symmetric layout, and that the NPEM does not depend on the rotation angle around the z-axis for parallel transmitter and receiver planes. Moreover, we have approximated the relationship between the NPEM and the number of captured LEDs for parallel transmitter and receiver planes. It is shown that for different receiver heights, the NPEM is approximately in inverse proportion to the number of captured LEDs. Finally, we have optimized the LED transmitter cell layout to minimize the NPEM, and proposed a non-uniform rectangular structure to reduce the optimization dimension. A genetic algorithm is adopted for the nonlinear optimization. Numerical results show that the LED layout can be further optimized to reduce the positioning error compared with the uniform layout.
Note that the inverse-proportion relationship between the NPEM and the number of LEDs captured by the camera holds for parallel transmitter and receiver planes, e.g., when both planes are horizontal. Besides, the above NPEM analysis and layout optimization are based on ideal camera imaging with negligible distortion, while positioning under nonlinear distortion of the optical lens needs to be further explored in future work.