A Single-Shot, Pixel Encoded 3D Measurement Technique for Structure Light

The Structured Light System (SLS) is a general concept and one of the cheapest methods for non-contact 3D reconstruction. Existing single-shot SLSs, which are primarily based on spatial encoding techniques, are not optimal in terms of resolution and digitally encoded patterns. Those schemes are neither flexible nor controllable, and they are not designed down to the level of the pixel. To increase the resolution and to implement a flexible, controllable pattern, we propose a novel heuristic method based on the spatial neighborhood. In this paper, we propose a multi-resolution SLS that can be implemented with a set of 25 distinct geometric symbols, or alphabets, used as shape primitives in the projection pattern. The size of each symbol is defined in pixels, which gives access to and control over the full resolution of the projector. The shape descriptive parameters of each symbol or alphabet are also defined and computed. To spread the alphabets in a controllable manner, a method is defined to generate a robust pseudo-random sequence of any required size with a given number of alphanumeric bases, to be employed in the projection pattern according to the required measurement resolution. This arrangement makes it possible to design projection patterns according to the required surface area and resolution. A new technique is developed for decoding the captured image pattern. The decoding process depends on the classification of symbols, which is based on the shape descriptive parameters. The neighborhood of a symbol is searched by computing the location information, grid distance, and direction information to find the codewords that are used to establish the correspondence.


I. INTRODUCTION
Fast, real-time, single-shot 3D measurement remains a challenging task, and it is widely used in industrial manufacturing, range sensing, inspection and modeling in the automation industry, reverse engineering, and medical imaging. In this research, we define a complete procedure for developing a fast, single-shot, dynamic 3D vision measurement technique based on structured-light spatial encoding projection.
Many techniques have evolved during the past two decades for the design of SLS, and many reviews are available [1]-[3]. Structured light uses the principle of triangulation for 3D measurement. In stereo vision techniques, more than one camera is used to solve the correspondence problem between two or more views of the object [4]. In the structured light projection technique, one of the stereo vision cameras is replaced with a light-emitting projector [5]-[7]. Thus, the correspondence problem between two images is transformed into a search for corresponding points between the projected pattern and the captured image [8]. Structured light techniques can be divided into two main classes: spatial neighborhood and temporal coding [9]. Geng [10] further divides the temporal encoding techniques into sequential projection techniques, which include binary patterns, Gray coding, phase shift, photometric, and hybrid techniques. He divides the spatial projection techniques into full-frame spatially varying color patterns, stripe indexing (single shot), and grid indexing (2D spatial grid patterns).
The associate editor coordinating the review of this manuscript and approving it for publication was Kathiravan Srinivasan.
Fast 3D measurement techniques have become more important during the past few decades since they are used in the development of real-time applications. Researchers in optical metrology emphasize techniques based on fringe patterns [11], which have been used to develop real-time 3D applications [12]. Fringe patterns are limited by lower accuracy and resolution. Rusinkiewicz et al. [13] and Hall-Holt and Rusinkiewicz [14] developed a real-time 3D shape measurement system based on the stripe boundary code. In their approach, the image acquisition and processing time was high, and four patterns were required to reconstruct one 3D model. Their technique was not single-shot, and it was difficult to reach pixel-level spatial resolution at high speed since the stripe width must be large enough to cover the whole projector resolution. Consequently, single-shot 3D measurement techniques have remained an active research area. Researchers in computer vision have proposed spatial neighborhood encoding techniques that are both single-shot and fast, and that can be used to develop real-time and dynamic applications.
In the spatial codification techniques, the codeword of a specific location is extracted from the surrounding points.
The key idea is to ensure the distinctiveness of the codeword at any location over the whole range of the pattern. Examples of spatial neighborhood techniques are patterns based on the De Bruijn sequence [15]-[20], non-formal coding [21], and M-arrays [22]-[30].
The De Bruijn pattern was designed by Boyer and Kak [15], who used a single encoded grid of colored light stripes for 3D measurement. Similarly, Hugli and Maitre [16] and Je et al. [17] proposed color-encoded stripe patterns. Likewise, Zhang et al. [18] used alternating color stripes and a multi-pass dynamic programming algorithm, which helps to eliminate the global smoothness and strict ordering constraints. Pages et al. [19] proposed an optimized pattern for single-shot shape acquisition. Vuylsteke and Oosterlinck [20] proposed a single-shot binary encoded pattern derived from a pseudo-random noise sequence. This approach allowed extensive feature extraction, but at the cost of more computation time. None of these techniques was suitable for high-speed measurement, nor practical for colored objects or dynamic scenes.
The two-dimensional spatial neighborhood techniques evolved from grid encoding. Pennington and Will [31] were the first to introduce grid coding for the automatic extraction of range data. The grid pattern combines the advantages of both the simple point and the line pattern, as sharp discontinuities may indicate abrupt changes at several points on the object surface. However, grid coding imposes only weak constraints on the physical objects [32], and labeling the intersection points of the grid is time-consuming, especially if some parts of the lines are occluded [27], because each new label depends on previously labeled points.
Simple coding techniques such as point-and-shoot laser projection, as proposed by Strat and Oliveira [33], are based on a relative labeling approach, but they were not able to deal robustly with the occlusion problem. Chen et al. [34], [35] proposed a grid pattern based on uniquely color-encoded codewords. The codeword at any location is defined from the color value at that location and its four adjacent neighbors. Since the pattern was designed with multiple colors, it was not immune to noise and could be distorted by the intrinsic color of the measured surface. Our technique retains the advantages of grid coding while avoiding the dependency on previously labeled values, since we employ a predefined labeling scheme.
Griffin et al. [23] used a multi-valued pseudo-random array instead of a binary array. In their pattern, each spatial position is represented by a mini-pattern that acts as a special codeword. They used multiple colors to illuminate the projection, which causes problems with colored scenes and reflections from the measured surface. These developments led several researchers to use the theory of perfect maps [36] in structured-light spatial encoding schemes. The spatial encoding techniques proposed in [24]-[26] are based on small pseudo-random sequences, which resulted in lower resolution. Petriu et al. [24] introduced a pseudo-random four-color encoded grid pattern composed of row and column grid lines, applied to a simple cubic surface. The pattern was based on small pseudo-random sequences, and only 59% of the feature points were detected successfully. Morano et al. [27] used an iterative algorithm to generate a pattern based on a pseudo-random array with 45 × 45 features. Instead of symbols, they used a multi-color dot matrix and time-multiplexed on/off assignment to illuminate the surface. The approach was therefore not single-shot, and it was difficult to read the label of each dot, so fewer feature points were detected. Albitar et al. [28] proposed a pattern consisting of 27 × 29 features with three symbols, applied to a small surface. Many authors have tried to increase the feature size of the pattern by enlarging the M-array. Lu et al. [29] used an M-array with three alphabets and a feature size of 48 × 52. Xiao-Jun et al. [30] proposed an M-array based on 10 alphabets with a feature size of 79 × 59, but they used similarly shaped symbols, so fewer feature points were detected. All these authors tried to increase the number of feature points, but their patterns are not designed completely down to the level of the pixel.
Recently, Microsoft RGB-D cameras, which map depth through either structured light or time-of-flight calculations, have also been gaining popularity [37]. Their disadvantage is that the resulting shape often misses thin geometric structures because of the coarse resolution of the depth map, quantization error, and noise [38]. Most RGB-D camera-based SLSs have a depth resolution of 640 × 480 pixels. In contrast, we propose a flexible system in which the pixel resolution can be chosen by the designer, and which can also be implemented with RGB-D camera systems. The proposed method lets the designer choose the pattern resolution and calculate the pattern size according to the requirements of the surface area to be covered.
VOLUME 8, 2020
Thus, at present no method is available that extracts more feature points while being flexible and controllable enough to design the pattern at a resolution of the designer's choice. This deficiency motivates us to propose a system that deals with occlusion problems, increases the feature size, and designs the pattern to use the whole resolution of the projector. Our method is well suited to this task because it uses monochromatic light and robust symbols, combining the power of grid coding and predefined labeling techniques, which were previously implemented with multi-color patterns.

II. ENCODING PROCESS
In this section, the whole process of encoding the projection pattern with the proposed method is discussed. Encoding begins with choosing symbols from the set of proposed symbols or alphabets. The chosen symbols are spread over the projection pattern by using robust pseudo-random sequences. The sizes of the robust pseudo-random sequences are determined by the size of the symbol and the dimensions of the projector, all measured in pixels. The sequences are generated in MATLAB, and their robustness is ensured by validating the window property. The whole process is explained below.

A. DESIGN OF SYMBOLS
In this research, we propose a multi-resolution SLS. We have designed five sets of 25 geometric symbols or alphabets. The basic requirements for the design of the symbols are uniqueness, i.e., each symbol must be distinguishable from the others; ease of decoding and robustness; the ability to carry direction information along the curvature of the surface; and a specific size in pixels. We designed the symbols with these requirements in mind so that they can serve as shape primitives in the projection pattern. The proposed symbols vary in size from 8 × 8 to 16 × 16 pixels. Each set of symbols corresponds to a different level of resolution on the measured surface at a given depth (z). From these sets of 25 symbols, one can choose a few symbols (a minimum of 2 and a maximum of 8) to design a projection pattern with the desired resolution and corresponding measuring area. Using more symbols in a projection pattern provides more flexibility and robustness in its design. The five sets of proposed symbols are shown in figure 1.
To decode these symbols, we compute shape description parameter ratios [39], [40], so the classification of symbols is based on shape description parameters [40]-[42]. Object understanding based on shape descriptor parameters is more stable against sensor noise and less prone to illumination changes and color variation. Most of the shape descriptors are computed through regional moments [43], [44]. The essential shape description parameters used in the classification of symbols are defined in table 1. The definition of each proposed symbol and the computed values of its shape description parameters are given in table 10 in the appendix. Each symbol has a unique geometric property and thus unique shape description parameter values, so it can be decoded with minimal chance of error. The threshold values of these parameters are used to classify the symbols in the decoding process. The threshold value of each parameter conveys important information. The use of more shape description parameters allows more symbols to be decoded in the captured image pattern.
The implementation of deep-learning algorithms such as CNNs [45] requires three basic elements: 1) large-scale 3D datasets, 2) an obtainable structure and training, and 3) graphics processing units (GPUs) to accelerate the system. All three are essential for employing deep-learning methods. Furthermore, the challenges of forming datasets and training on them make these methods slow and computationally complex. Our method of decoding, or classifying, the symbols is based on shape description parameters, which is simpler to implement and needs less computational power than deep learning. The complexity of the algorithm and the hardware requirements are therefore reduced, which is a significant advantage, and the decoding performs well.

B. GENERATION OF RPRS
Pseudo-random sequences are widely used in many applications [46], [47]. Pseudo-random sequences with more alphabets, symbols, or alphanumeric bases are suitable for large dimensions and give the user the flexibility to generate a sequence of the desired dimensions. We employed the Mersenne Twister method to generate the pseudo-random sequences [48], [49]. Compared with the shift register method, the Mersenne Twister has the advantages of being fast, having good equidistribution, and having a very long period [50], [51]. These properties allow the designer to generate very high-quality pseudo-random numbers, so perfect maps of larger dimensions can be formed with fewer alphanumeric bases. The theory of perfect maps such as M-arrays [36] has already been employed in SLS [28]-[30]. We also generate a robust pseudo-random sequence (RPRS) by using the theory of perfect maps. In a perfect map, no codeword repeats, and each codeword and its location are unique in the sequence, so the coarse correspondence can be established easily. An RPRS of a specific size is required by the proposed system to spread the symbols in a controllable manner to form the projection patterns. First, we generate a raw pseudo-random sequence. The raw pseudo-random sequence is then converted to a robust pseudo-random sequence by validating its window property. In our method, we use a 3 × 3 window property to check the robustness of the sequence.
The use of more alphanumeric bases in the pseudo-random sequence provides more robustness since the chance of errors is reduced. Increasing the number of alphabets increases the robustness and the Hamming distances between the codewords. Repetition of codewords becomes less likely when the pseudo-random sequence is generated with more symbols, alphabets, or alphanumeric bases. There is thus a direct relationship between the robustness, the Hamming distances, and the number of alphanumeric bases of the pseudo-random sequence. The robustness of each pseudo-random sequence is ensured by calculating the Hamming distances between the codewords of each independent window of size 3 × 3. The Hamming distance is a measure of robustness: since the codewords in each window are built from the alphanumeric bases, every location at which two compared windows differ increases the Hamming distance. The Hamming distance therefore measures the differences between the 3 × 3 codewords of two independent windows. As the number of alphanumeric bases in the pseudo-random sequence increases, the range of alphabets used in the codewords also increases, so the Hamming distances, and hence the robustness, improve. The codewords are considered robust if the Hamming distances between them are greater than or equal to 3 for a window property of size 3 × 3.
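The robustness criterion described above can be illustrated with a minimal sketch. The three windows below are hypothetical 3 × 3 codewords, flattened row by row over a four-symbol basis; they are not taken from the paper's sequences, and the function names are illustrative.

```python
from itertools import combinations

def hamming(w1, w2):
    """Count the positions at which two equal-length codewords differ."""
    return sum(a != b for a, b in zip(w1, w2))

def min_hamming(codewords):
    """Smallest pairwise Hamming distance in a list of codewords."""
    return min(hamming(a, b) for a, b in combinations(codewords, 2))

# Hypothetical 3x3 windows, flattened row by row (illustrative values only).
windows = [
    (0, 1, 2, 3, 0, 1, 2, 3, 0),
    (0, 1, 2, 3, 1, 2, 2, 0, 0),
    (3, 2, 1, 0, 3, 2, 1, 0, 3),
]

# The text treats codewords as robust when every pairwise distance is >= 3.
is_robust = min_hamming(windows) >= 3
```

Here the first two windows differ in exactly three positions, so the set just meets the threshold of 3.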

1) SIZE CALCULATION OF RPRS
The first step in the generation of a robust pseudo-random sequence is to find the required dimensions of the sequence. The size of the pseudo-random sequence is derived from the projector resolution, the size of the symbol, and the spacing between two consecutive symbols, all defined in pixels. The proposed multi-resolution system depends on symbol sizes that vary from 8 × 8 to 16 × 16 pixels. A smaller symbol size leads to a larger pseudo-random sequence and vice versa, as can be inferred from equations (1) and (2). A smaller symbol size and spacing place more shape primitives in the same area and thus result in a higher measuring resolution, since more feature points or alphanumeric bases are required to fill the same projector resolution. The X and Y dimensions of the desired pseudo-random sequence are calculated from equations (1) and (2), where $X_{PP}$ and $Y_{PP}$ are the X and Y dimensions of the projector resolution, $X_{RPRS}$ and $Y_{RPRS}$ are the dimensions of the RPRS, $S_z$ is the symbol size, and $P_S$ is the pixel spacing.
The calculated X and Y dimensions are rounded up to the next integer and then increased to the nearest multiple of three so that each dimension is divisible by 3. This is done to ensure the validation of the independent window property. Table 2 summarizes the calculation of the X and Y dimensions of the robust pseudo-random sequence (RPRS) for different symbol sizes, spacings between consecutive symbols, and numbers of alphabets, symbols, or alphanumeric bases used.
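The sizing rule can be sketched as follows. The relation used here, projector dimension divided by (symbol size + pixel spacing), is an assumption inferred from the variable list in the text and is not confirmed as the paper's equations (1) and (2); the spacing values in the examples are likewise hypothetical.

```python
import math

def rprs_dimension(projector_px, symbol_px, spacing_px, window=3):
    """Number of RPRS cells along one axis, rounded up to a multiple of `window`.

    ASSUMPTION: cells = projector_px / (symbol_px + spacing_px); this form is
    inferred from the surrounding text, not quoted from the paper.
    """
    cells = math.ceil(projector_px / (symbol_px + spacing_px))
    # Increase to the nearest multiple of the window size (3 in the text).
    return math.ceil(cells / window) * window

# Hypothetical example: an 800 x 1280 projector, 8-pixel symbols, 2-pixel spacing.
x_rprs = rprs_dimension(1280, 8, 2)
y_rprs = rprs_dimension(800, 8, 2)
```

Under these assumed values, 1280 / 10 = 128 cells is raised to 129 and 800 / 10 = 80 cells is raised to 81, so both dimensions are divisible by 3.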

2) GENERATION OF RPRS
To generate a pseudo-random sequence of the required dimensions with a specific number of alphanumeric bases, we used the Mersenne Twister generator engine available in MATLAB to implement our algorithm. However, the robustness of each raw pseudo-random sequence must be ensured by validating the window property. For this purpose, we compare each independent window of size 3 × 3 with every other independent window in the raw PRS. If no identical window is found, the pseudo-random sequence (PRS) is verified to be a robust pseudo-random sequence (RPRS). If an identical window is found during the comparison, the raw pseudo-random sequence is discarded. The whole process is repeated until an RPRS is obtained. Once obtained, the RPRS is stored in memory for later use in forming a projection pattern. The flowchart in figure 2 shows the whole process of generating the RPRS.
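The generate-and-validate loop described above can be sketched in Python, whose standard `random` module is also based on the Mersenne Twister. This is an illustrative stand-in for the MATLAB implementation, not the authors' code; the function names, the seed, and the retry limit are assumptions, and only the uniqueness of independent (non-overlapping) windows is checked.

```python
import random

def independent_windows(seq, size=3):
    """Yield each non-overlapping size x size window, flattened row by row."""
    for r in range(0, len(seq), size):
        for c in range(0, len(seq[0]), size):
            yield tuple(seq[r + i][c + j] for i in range(size) for j in range(size))

def generate_rprs(rows, cols, n_symbols, seed=0, max_tries=1000):
    """Draw raw sequences until no independent 3x3 window repeats.

    rows and cols are expected to be multiples of 3.
    """
    rng = random.Random(seed)  # Python's random module uses the Mersenne Twister
    for _ in range(max_tries):
        seq = [[rng.randrange(n_symbols) for _ in range(cols)] for _ in range(rows)]
        wins = list(independent_windows(seq))
        if len(set(wins)) == len(wins):  # every window is unique: accept
            return seq
        # Otherwise the raw sequence is discarded and a new one is drawn.
    raise RuntimeError("no robust sequence found within max_tries")

rprs = generate_rprs(9, 12, n_symbols=4, seed=1)
```

A set-based uniqueness test replaces the pairwise comparison described in the text; both accept exactly the same sequences, but the set avoids the quadratic number of comparisons.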

3) COMPUTATIONAL REQUIREMENTS FOR RPRS
The total number of independent windows and the total number of comparisons required for the desired robust pseudo-random sequence of size (m × n) using a window property of (r × v) can be estimated from equations (3) and (4), where $N_w$ is the total number of independent windows in the PRS. Table 3 summarizes the calculation of independent windows, average Hamming distance, robust codewords, and comparisons required for each robust pseudo-random sequence (RPRS). Figure 3 shows the Hamming distance profiles and the percentile of their codewords for each RPRS generated with our method.
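As a rough illustration of the computational cost, the sketch below assumes that independent windows are non-overlapping tiles and that each unordered pair of windows is compared once. Both are assumptions; the paper's own expressions are not reproduced here.

```python
def window_counts(m, n, r=3, v=3):
    """Return (number of independent windows, pairwise comparisons).

    ASSUMPTIONS: windows tile the m x n sequence without overlap, and every
    unordered pair of windows is compared exactly once.
    """
    n_windows = (m // r) * (n // v)
    comparisons = n_windows * (n_windows - 1) // 2
    return n_windows, comparisons

# Hypothetical 81 x 129 sequence with 3 x 3 windows.
n_w, n_cmp = window_counts(81, 129)
```

For this hypothetical size there are 27 × 43 = 1161 windows and 673380 pairwise comparisons, which shows how quickly the validation cost grows with the sequence size.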

C. FORMATION OF PROJECTION PATTERN
A considerable number of projection patterns can be formed by using the proposed symbols and the generated robust pseudo-random sequences. The robust pseudo-random sequences consist of alphanumeric bases, each of which is represented by a symbol or alphabet in the projection pattern. The size of each projection pattern is calculated by rearranging equations (1) and (2). Because the sequence dimensions are rounded up, the projection patterns obtained with the different symbol sizes and corresponding robust pseudo-random sequences are slightly larger than the projector resolution. The sizes of the projection patterns obtained are shown in table 4. The extra pixels beyond 800 × 1280 are cut from any two sides. Figure 4 shows the texture and a portion of six projection patterns obtained with the six different RPRSs listed in table 4. Each RPRS is generated for a different symbol size and spacing, so each projection pattern covers the same area but with a different number of shape primitives and therefore a different density. For example, the projection patterns designed with a large symbol size (16 × 16) have fewer primitives, while those designed with a small symbol size (8 × 8) have more. It is important to highlight that similarly shaped symbols should not be used in the same projection pattern. Another very useful property is direction information: since every symbol follows the texture of the surface, its direction is used in searching for neighborhood symbols when decoding the projection pattern. The projection pattern must therefore contain at least one symbol that carries direction, so that the other, directionless neighborhood symbols can acquire their direction or orientation from it. If more symbols in the projection pattern carry direction, the decoding process becomes simpler and more efficient.
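The pattern-size calculation and the trimming of extra pixels can be sketched as below. The relation pattern size = RPRS dimension × (symbol size + spacing) is an assumed rearrangement, and splitting the surplus evenly between two opposite sides is only one possible reading of "cut from any two sides".

```python
def pattern_size(rprs_dim, symbol_px, spacing_px):
    """Pattern length in pixels along one axis (assumed rearranged relation)."""
    return rprs_dim * (symbol_px + spacing_px)

def crop_margins(pattern_px, projector_px):
    """Pixels to trim from the two opposite sides (even split is an assumption)."""
    extra = pattern_px - projector_px
    first = extra // 2
    return first, extra - first

# Hypothetical example: a 129-cell axis, 8-pixel symbols, 2-pixel spacing.
width = pattern_size(129, 8, 2)
margins = crop_margins(width, 1280)
```

With these assumed values the pattern is 1290 pixels wide, 10 more than the projector width, so 5 pixels are trimmed from each side.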

D. PROJECTOR AND CAMERA CALIBRATION
Calibration is a critical and necessary step for a projector-camera-based SLS. In the calibration process, the camera and projector have to be calibrated to optimize the parameters that minimize the re-projection errors. The projector can be regarded as a camera with a reversed light path, so it also requires calibration of intrinsic and extrinsic parameters, just like the camera. Our system is first calibrated using a traditional calibration method to get the primary calibration parameters. A reference plane with precisely printed markers is then used to optimize the primary calibration parameters, since traditional calibration methods rely mostly on a standard reference or the corresponding image model. Thus, before the projection patterns are applied to the measuring surface, the projector and the camera have to be calibrated with any of the techniques available in [52]-[55].

III. DECODING PROCESS
In this section, the whole process of decoding, which follows the application of the encoded pattern to the measuring surface, is explained. The captured image pattern first undergoes contrast enhancement and noise removal, and image thresholding is then carried out to obtain a binary image. The binary image is used to generate a 3D point cloud of the measuring surface. The symbols or alphabets, which were originally spread through a pseudo-random sequence, are labeled in the binary image with specific region numbers. All detected regions in the binary image are then classified using the threshold values in table 10 in the appendix. After classification, the location of each symbol is determined by calculating its centroid position. The direction information and the grid distance of each symbol are also computed. Once the centroid positions, direction information, and grid distances have been determined for each symbol, neighborhood searching and decoding of the 3 × 3 codewords of the robust pseudo-random sequence (RPRS) are carried out to establish the correspondence between the projected and captured image patterns. Finally, the principle of triangulation is used to reconstruct the 3D shape of the measuring surface. The whole decoding process is summarized in the flowchart shown in figure 5.

A. PREPROCESSING
The first step of decoding is image preprocessing, i.e., preparing the captured image for decoding. In this step, the captured image pattern obtained from the measuring surface first undergoes contrast enhancement and noise removal through filtering, and image segmentation is then performed to obtain a binary image. The segmentation has a significant impact on the decoding process. If the segmentation is too weak, neighboring symbols merge; if it is too strong, many binary regions that could be decoded as symbols are deleted. A balance is therefore required. The best results are obtained through optimum global thresholding using Otsu's method, which performs clustering-based image thresholding [56].
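Otsu's method chooses the global threshold that maximizes the between-class variance of the gray-level histogram. The following is a minimal pure-Python sketch for 8-bit pixel values given as a flat list; it is an illustration of the general method, not the authors' implementation, and the sample pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels <= t form the background class, pixels > t the foreground class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Hypothetical gray values: three dark and three bright pixels.
sample = [10, 12, 11, 200, 205, 198]
t = otsu_threshold(sample)
binary = [1 if p > t else 0 for p in sample]
```

In this example the threshold falls at the top of the dark cluster, so the three bright pixels become foreground.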

B. LABELING
The binary regions obtained through segmentation are labeled with specific region numbers using the algorithm described by Haralick and Shapiro [57].
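Region labeling can be illustrated with a generic breadth-first flood fill. This sketch is not the specific algorithm of [57]; the choice of 8-connectivity and the tiny test image are assumptions made for illustration.

```python
from collections import deque

def label_regions(binary):
    """Assign a region number to each 8-connected group of foreground pixels."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and labels[r][c] == 0:
                count += 1  # start a new region
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < rows and 0 <= nx < cols
                                    and binary[ny][nx] and labels[ny][nx] == 0):
                                labels[ny][nx] = count
                                queue.append((ny, nx))
    return labels, count

# Hypothetical binary image with two separate regions.
image = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
labels, n_regions = label_regions(image)
```

Label 0 is reserved for the background, which matches the later use of a zero label value to indicate that no symbol was found.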

C. SHAPE DESCRIPTOR PARAMETERS
After each binary region in the captured image has been labeled, the shape description parameters described in table 1 are calculated for each region so that it can be classified as a specific symbol or alphabet.

D. CLASSIFICATION OF SYMBOLS
After the labeling and the computation of the shape descriptor parameters of each binary region in the captured image pattern, the symbols or alphabets are classified by comparing their shape description parameters or ratios with the previously computed threshold values in table 10. Using more shape descriptor parameters reduces the chance of classification errors. Robust classification is obtained by using more shape description parameters than there are alphabets or symbols.
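Threshold-based classification can be sketched as a lookup against per-symbol parameter ranges. The descriptor names (`extent`, `aspect_ratio`), the symbol names, and all numeric ranges below are hypothetical placeholders; the actual parameters and thresholds are those of tables 1 and 10, which are not reproduced here.

```python
def classify(descriptors, rules):
    """Return the first symbol whose ranges all contain the measured values."""
    for symbol, ranges in rules.items():
        if all(lo <= descriptors[name] <= hi for name, (lo, hi) in ranges.items()):
            return symbol
    return None  # the region matches no known symbol

# Hypothetical threshold ranges (illustrative only).
rules = {
    "square": {"extent": (0.9, 1.0), "aspect_ratio": (0.9, 1.1)},
    "bar": {"extent": (0.9, 1.0), "aspect_ratio": (2.5, 4.0)},
}

result = classify({"extent": 0.95, "aspect_ratio": 3.0}, rules)
```

Requiring every parameter to fall inside its range reflects the point made in the text: each additional descriptor removes candidates, so ambiguous regions are rejected rather than misclassified.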

E. COMPUTATION OF LOCATION, DIRECTION & GRID DISTANCE
In this research, we employ a new technique to search the neighborhood of a symbol. The technique is based on the computation of the grid distance, centroid position, and orientation. Neighborhood searching is carried out after each symbol has been classified and its centroid position identified. Its purpose is to determine the location of each window of the RPRS in order to establish the correspondence between the projected pattern and the captured image. These parameters are therefore determined before the correspondence is established.

1) CENTROID
After the classification of each symbol, its location is determined by computing the centroid, or center of gravity, and the corresponding label number is assigned. This location information, i.e., the centroid position, is used in searching for the neighborhood symbols. The centroid position is determined as:

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i \tag{7}$$

or

$$\text{Centroid}\,(\bar{x}, \bar{y}) = \left(\frac{m_{10}}{m_{00}}, \frac{m_{01}}{m_{00}}\right) \tag{8}$$

where $N$ is the total number of pixels in a symbol, $i$ is the index of a pixel in the symbol, $m_{00}$ is the zeroth-order moment, and $m_{10}$ and $m_{01}$ are the first-order moments along the x-axis and y-axis respectively.
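The moment form of the centroid can be checked with a short sketch. A region is represented here as a list of (x, y) pixel coordinates, which is an assumed data layout chosen for illustration.

```python
def raw_moment(pixels, p, q):
    """Raw moment m_pq of a binary region given as (x, y) pixel coordinates."""
    return sum((x ** p) * (y ** q) for x, y in pixels)

def centroid(pixels):
    """Centroid (m10 / m00, m01 / m00) of the region."""
    m00 = raw_moment(pixels, 0, 0)  # equals the pixel count for a binary region
    return raw_moment(pixels, 1, 0) / m00, raw_moment(pixels, 0, 1) / m00

# Hypothetical 2 x 2 block of foreground pixels.
block = [(4, 6), (5, 6), (4, 7), (5, 7)]
cx, cy = centroid(block)
```

For a binary region $m_{00}$ equals the pixel count $N$, so the moment form and the pixel-average form give the same centroid.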

2) DIRECTION OR ORIENTATION
The direction information is necessary to find the neighborhood symbols or alphabets in the projection pattern. When a symbol falls on the measuring surface, it follows the curvature, so each neighborhood symbol is found in line with, or in the direction of, the previous symbol. If the symbols carry direction as an inherent property, the computation becomes simpler. The direction of each symbol can be computed from:

$$\theta = \frac{1}{2}\tan^{-1}\!\left(\frac{2\mu_{11}}{\mu_{20} - \mu_{02}}\right) \tag{9}$$

where $\mu_{11}$ is the second-order mixed central moment, and $\mu_{20}$ and $\mu_{02}$ are the second-order central moments along the x-axis and y-axis respectively. It is necessary to obtain a direction for all symbols that do not possess it as an inherent property. The simplest way is to acquire it from the neighborhood, i.e., from a symbol that lies within close range. The distance $d_{min}$ between two neighborhood symbols is computed with the Euclidean distance formula between their centroid positions, and it must be within the close range $R_C$:

$$d_{min} = \sqrt{(x_o - x_n)^2 + (y_o - y_n)^2} \le R_C \tag{10}$$

where $(x_o, y_o)$ is the centroid of a symbol that does not initially possess direction and $(x_n, y_n)$ is the centroid of the neighborhood symbol with the inherent direction property. Note: the value of the close range $R_C$ is selected so that the direction is acquired from the closest alphabet; it is usually 2.5 times the grid distance.
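A minimal sketch of both steps follows: orientation from second-order central moments, and borrowing a direction from the nearest direction-carrying symbol within the close range. The standard moment-based orientation formula is used with `atan2`, which is an assumed variant that avoids division by zero; the function names and sample values are illustrative.

```python
import math

def orientation(pixels):
    """Orientation from second-order central moments of (x, y) pixel coordinates."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels)
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    # atan2 is an assumed variant; it stays defined when mu20 == mu02.
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)

def borrow_direction(origin, candidates, close_range):
    """Direction of the nearest direction-carrying symbol within close_range.

    candidates: list of ((x, y), theta) for symbols that carry direction.
    """
    best = None
    for (cx, cy), theta in candidates:
        d = math.hypot(cx - origin[0], cy - origin[1])
        if d <= close_range and (best is None or d < best[0]):
            best = (d, theta)
    return None if best is None else best[1]

diagonal = [(i, i) for i in range(5)]
theta = orientation(diagonal)
# Close range taken as 2.5 times a hypothetical grid distance of 4 pixels.
borrowed = borrow_direction((0, 0), [((10, 0), 0.1), ((3, 4), 0.2)], 2.5 * 4)
```

The angle is expressed in the coordinate system of the pixel list; with image rows increasing downwards, the sign convention has to match the one used when estimating neighbor positions.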

3) GRID DISTANCE
The computation of the grid distance follows the computation of the direction for each symbol. The grid distance is defined as the smallest distance between the centroid positions of two consecutive neighborhood symbols, and it varies with the surface orientation. It can be computed either directly from the equivalent diameter of a symbol or by acquiring it from the nearest square symbol, since square symbols use the maximum area of the symbol space and therefore have the maximum equivalent diameter. The nearest square object to an alphabet is found with the close-range search procedure. Hence the grid distance can be defined as the equivalent diameter plus the pixel space between two consecutive symbols. It can be expressed mathematically as equation (11) or as:

$$\text{grid} = ED_{ns} + S_{pix} \tag{12}$$

where grid is the grid distance between two neighborhood symbols, $ED_S$ is the equivalent diameter of a symbol, $ED_{ns}$ is the equivalent diameter of the nearest square symbol, $S_{pix}$ is the pixel space, and FAR is the filled area ratio. The estimated grid distances for different alphabet sizes and pixel spacings are shown in table 5.
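Equation (12) can be sketched as follows. The equivalent diameter is taken here as the diameter of a circle with the same area as the region, which is the common definition but an assumption with respect to this paper; the area and spacing values are hypothetical.

```python
import math

def equivalent_diameter(area_px):
    """Diameter of a circle with the same area as the region (assumed definition)."""
    return math.sqrt(4 * area_px / math.pi)

def grid_distance(nearest_square_area_px, pixel_spacing):
    """Grid distance per equation (12): ED of the nearest square symbol + spacing."""
    return equivalent_diameter(nearest_square_area_px) + pixel_spacing

# Hypothetical example: an 8 x 8 square symbol (64 pixels) with 2-pixel spacing.
grid = grid_distance(64, 2)
```

Under this definition the estimate is roughly 11 pixels, slightly above the 10-pixel pitch of such a pattern, because the equal-area circle of a square is wider than the square's side.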

F. SEARCHING IN THE NEIGHBORHOOD
Searching in the neighborhood means finding the next consecutive symbol in both the vertical and horizontal directions in order to obtain the codewords with a window size of 3 × 3. These codewords are used to establish the correspondence between the projected pattern and the captured image. The parameters required for the search are the centroid positions, directions, and grid distances computed earlier. The next consecutive alphabet is determined by estimating the location (centroid position) of that symbol. The estimated centroid positions of the neighborhood symbols are calculated from the current centroid position, direction, and grid distance. Simple trigonometric rules give the expected centroid positions of the right and downside neighborhood symbols. The geometrical concept of the calculation is explained in figure 6. The whole process is described mathematically as:

$$x_{rnb} = x_{alp} + \text{grid} \cdot \cos\theta \tag{13}$$
$$y_{rnb} = y_{alp} - \text{grid} \cdot \sin\theta \tag{14}$$
$$x_{dnb} = x_{alp} + \text{grid} \cdot \sin\theta \tag{15}$$
$$y_{dnb} = y_{alp} + \text{grid} \cdot \cos\theta \tag{16}$$

where $(x_{alp}, y_{alp})$, $(x_{rnb}, y_{rnb})$, and $(x_{dnb}, y_{dnb})$ are the centroid positions of the current alphabet and of its right and downside neighborhoods respectively, grid is the estimated grid distance between two neighborhood alphabets, and $\theta$ is the direction of the current alphabet whose neighborhood is to be determined. Since the grid distance between neighborhood symbols varies with the surface orientation and curvature, the estimated centroid positions may not be the exact centroid positions. The actual grid distance is determined after reading the label value at the estimated centroid position. The label value identifies the symbol, and its actual centroid is then retrieved from memory.
Once the actual centroid position of the neighboring symbol is known, the actual grid distance between the two neighboring symbols is calculated using the Euclidean distance formula.
Since the initial centroid position $(x_{alp}, y_{alp})$ is known, the actual grid distance for the right and downside neighbors is the Euclidean distance from it to $(x_{rnb}, y_{rnb})$ and $(x_{dnb}, y_{dnb})$, which here denote the actual centroid positions of the alphabets found when searching towards the right and the downside respectively. If the label value found at the right or downside neighbor is zero for any reason, the search continues around the estimated centroid position along the diagonal at a 45-degree angle, towards the upper and lower sides, over a distance that starts at one pixel and extends up to half the grid distance. When an alphabet is detected within this range along the diagonal, the search is complete. The search positions along the diagonal from the estimated neighbor are expressed as:

$y_{ud} = y_{nb} - d_i \sin\frac{\pi}{4}$ (20)
$x_{ld} = x_{nb} - d_i \cos\frac{\pi}{4}$ (21)
$y_{ld} = y_{nb} + d_i \sin\frac{\pi}{4}$ (22)

where $(x_{nb}, y_{nb})$ is the estimated centroid position of the neighboring symbol; $(x_{ud}, y_{ud})$ and $(x_{ld}, y_{ld})$ are the upper and lower search positions calculated along the diagonal at 45 degrees from the estimated centroid position; and $d_i$ is the iterative distance in pixels, which varies over 1, 2, ..., grid/2 (half the grid distance).
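The following Python sketch illustrates the actual grid distance and the diagonal fallback search. It is not the authors' implementation: the label image is assumed to be a 2D list indexed as labels[row][col] with 0 as background, the function names are hypothetical, and the x coordinate of the upper search position is assumed to mirror equation (21), i.e. $x_{nb} + d_i \cos\frac{\pi}{4}$.

```python
import math


def actual_grid_distance(p, q):
    """Euclidean distance between two centroid positions (x, y)."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def diagonal_search(labels, x_nb, y_nb, grid):
    """Search along the 45-degree diagonal around an estimated centroid.

    Returns the first non-zero label found, or 0 if none is found within
    half the grid distance.
    """
    height, width = len(labels), len(labels[0])
    c, s = math.cos(math.pi / 4), math.sin(math.pi / 4)
    for d in range(1, int(grid // 2) + 1):
        candidates = [
            # Upper side: y from (20); x is an ASSUMPTION (mirror of (21)).
            (x_nb + d * c, y_nb - d * s),
            # Lower side: equations (21) and (22).
            (x_nb - d * c, y_nb + d * s),
        ]
        for x, y in candidates:
            col, row = int(round(x)), int(round(y))
            if 0 <= row < height and 0 <= col < width and labels[row][col] != 0:
                return labels[row][col]
    return 0


# Toy label image: one symbol with label 5 up and to the right of (5, 5).
labels = [[0] * 10 for _ in range(10)]
labels[3][7] = 5
found = diagonal_search(labels, 5, 5, 8)
```

Limiting the search to half the grid distance keeps the fallback from reaching into the next symbol cell.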

G. ESTABLISHMENT OF CORRESPONDENCE
Once the neighborhood of a symbol has been identified, the neighborhood search is repeated for every subsequent symbol to find its right and downside neighbors, until a codeword with a window size of 3 × 3 is obtained. When a 3 × 3 codeword window is found in the captured image pattern, a search for the identical window in the originally projected pattern establishes the correspondence between the two. Since the location of each symbol is known in both the captured image and the projected pattern, this location information, together with each symbol's deviation from the original pattern, is used in the 3D measurement.
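The window matching described above can be sketched as a dictionary lookup. This is only an illustration under simplifying assumptions: the projected pattern is a 2D list of symbol indices, the function names are hypothetical, and the small 4 × 4 pattern below is a toy example rather than the 45 × 72 RPRS used in the experiments.

```python
def build_codeword_index(pattern):
    """Map every 3x3 window of the projected pattern to its top-left (row, col)."""
    rows, cols = len(pattern), len(pattern[0])
    index = {}
    for r in range(rows - 2):
        for c in range(cols - 2):
            key = tuple(pattern[r + i][c + j] for i in range(3) for j in range(3))
            index.setdefault(key, []).append((r, c))
    return index


def match_window(index, window):
    """Return the pattern position of a decoded 3x3 window, or None.

    A match is accepted only if the codeword is unique in the pattern,
    which is what the window property is meant to guarantee.
    """
    key = tuple(symbol for row in window for symbol in row)
    hits = index.get(key, [])
    return hits[0] if len(hits) == 1 else None


# Toy pattern with three symbol classes (0, 1, 2).
pattern = [
    [0, 1, 2, 0],
    [1, 2, 0, 1],
    [2, 0, 1, 1],
    [0, 2, 2, 1],
]
index = build_codeword_index(pattern)
position = match_window(index, [[2, 0, 1], [0, 1, 1], [2, 2, 1]])
```

Building the index once per pattern makes each lookup a constant-time operation, which suits a single-shot decoding pipeline.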

H. RECONSTRUCTION OF 3D
The purpose of decoding is to establish the correspondence between the projected pattern and the pattern captured from the measuring surface. This correspondence provides a grid of matching points between the projection pattern and the points observed on the measuring surface, so that the principle of triangulation can be applied to reconstruct the 3D shape. In this sub-section, we describe the whole procedure adopted for the 3D reconstruction from the corresponding points, which are obtained by matching related points between the projected and captured image patterns through the window-based operation described in the previous section.
1) LINEAR CAMERA MODEL [58], [59]
The ideal linear camera model is shown in figure 7. Any point P in the world coordinate space is observed from the optical center C of the camera through its intersection with the imaging plane at the point $x_p$, while m is the corresponding computer screen image. As shown in figure 7, the following coordinate systems can be identified:

a: WORLD COORDINATE SYSTEMS
The spatial coordinates of any point P in the world coordinate system can be written as $(X_w, Y_w, Z_w, 1)^T$.

b: CAMERA COORDINATE SYSTEM
The camera coordinate system is used to describe the position of any object in space and its surrounding environment. It can be expressed as $(X_C, Y_C, Z_C, C)$, where the point C is the optical center of the camera coordinate system, $Z_C$ is the optical axis, and XOY is the camera imaging plane, which is parallel to the $X_C Y_C$ plane.

c: PIXEL COORDINATES
The camera image plane is transformed into a digital image m on the computer. In figure 7, uo'v denotes the computer screen coordinate axes formed from the camera imaging plane. The origin o' is at the upper left corner of the screen. The points in the digital image are represented by pixels.

2) COORDINATE TRANSFORMATION
The point P, located in space at world coordinates $(X_w, Y_w, Z_w, 1)^T$, is converted to the two-dimensional digital image m with pixel coordinates $(u, v, 1)^T$ through a sequence of coordinate transformations. First, the world coordinate system is transformed to the camera coordinate system by a rotation and a translation, where R is the 3 × 3 rotation matrix, T is the 3 × 1 translation vector, and M is the external camera parameter matrix. R and T are the external camera calibration parameters.
The perspective camera view is then normalized onto the image plane by using simple geometry and the pinhole camera model. This gives a relationship between $(u_n, v_n)$, the X-Y coordinates on the image-capturing plane of the camera, and $(X_C, Y_C, Z_C)$, the coordinates in the camera coordinate system. The camera lens distortion can be accommodated by including a radial distortion term and a tangential distortion term, which yield $(u_d, v_d)$, the coordinates on the image plane of the camera after accommodating the effects of lens distortion. The first term represents radial distortion, with radial distortion parameters $k_1$ and $k_2$. The second term represents tangential distortion; here dx is the tangential distortion, defined as:

$dx = \begin{bmatrix} 2 p_1 u_n v_n + p_2 (r^2 + 2 u_n^2) \\ p_1 (r^2 + 2 v_n^2) + 2 p_2 u_n v_n \end{bmatrix}$ (26)

where $p_1$ and $p_2$ are the tangential distortion parameters and r is the distance of the point $(u_n, v_n)$ from the origin of the image plane. Finally, the transformation between the digital image coordinates stored in the computer and the coordinates on the image-capturing plane of the camera, after accommodating the effects of lens distortion, is given by the internal camera parameter matrix K, where $(u_0, v_0)$ are the position coordinates of the center point 'o' in the computer image, and α and β are the scale factors for the u-axis and v-axis of the digital image.
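The chain of transformations described above can be sketched in Python with NumPy. This is a hedged illustration rather than the authors' calibration code: it assumes the common pinhole normalization $u_n = X_C / Z_C$, $v_n = Y_C / Z_C$, the radial factor $1 + k_1 r^2 + k_2 r^4$ with $r^2 = u_n^2 + v_n^2$, and an internal matrix K built from α, β, $u_0$ and $v_0$ without skew. The function name and the numeric values are hypothetical.

```python
import numpy as np


def project_point(P_w, R, T, K, k1=0.0, k2=0.0, p1=0.0, p2=0.0):
    """Project a 3D world point to pixel coordinates (u, v)."""
    # World -> camera coordinates (rotation R, translation T).
    P_c = R @ P_w + T
    # Pinhole normalization (assumed form).
    u_n, v_n = P_c[0] / P_c[2], P_c[1] / P_c[2]
    r2 = u_n**2 + v_n**2
    # Radial factor (assumed form) and tangential term dx from equation (26).
    radial = 1.0 + k1 * r2 + k2 * r2**2
    dx = np.array([
        2 * p1 * u_n * v_n + p2 * (r2 + 2 * u_n**2),
        p1 * (r2 + 2 * v_n**2) + 2 * p2 * u_n * v_n,
    ])
    u_d, v_d = radial * np.array([u_n, v_n]) + dx
    # Distorted image-plane coordinates -> pixel coordinates via K.
    u, v, _ = K @ np.array([u_d, v_d, 1.0])
    return u, v


# Illustrative internal parameters: alpha = beta = 800, (u0, v0) = (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
u, v = project_point(np.array([0.1, -0.05, 2.0]), np.eye(3), np.zeros(3), K)
```

With all distortion coefficients set to zero the function reduces to the ideal linear camera model.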

3) 3D MODEL
The establishment of correspondence yields a grid of matching points between the image captured from the measuring surface and the projected pattern. The grid points in the captured image have pixel coordinates $(u_1, v_1, 1)^T$, and the corresponding points in the projected image have pixel coordinates $(u_2, v_2, 1)^T$. From the correspondence between these two sets of coordinates, we can reconstruct the world coordinates $(X_W, Y_W, Z_W, 1)^T$ of each point. Equation (24) can be rewritten in the form of a matrix multiplication. Usually, the tangential lens distortion need not be considered, since nowadays lens distortion effects are largely compensated during camera manufacturing; equations (25) and (26) are therefore simplified. Finally, equations (23), (28), (29) and (30) together define the relationship between the world coordinates of a point in space and the pixel coordinates of the captured image, which can eventually be written as:

$Z_{C1} \begin{bmatrix} u_1 \\ v_1 \\ 1 \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix} \begin{bmatrix} X_W \\ Y_W \\ Z_W \\ 1 \end{bmatrix}$ (32)

Similarly, the relationship between the world coordinate system and the coordinate system of the projected pattern is given by the point-to-point transformation:

$Z_{C2} \begin{bmatrix} u_2 \\ v_2 \\ 1 \end{bmatrix} = \begin{bmatrix} n_{11} & n_{12} & n_{13} & n_{14} \\ n_{21} & n_{22} & n_{23} & n_{24} \\ n_{31} & n_{32} & n_{33} & n_{34} \end{bmatrix} \begin{bmatrix} X_W \\ Y_W \\ Z_W \\ 1 \end{bmatrix}$ (33)

where $Z_{C1}$ and $Z_{C2}$ are the coordinates of point P along the optical axes of the camera coordinate system and the projector coordinate system respectively, $m_{ij}$ is the element in the ith row and jth column of the matrix M, and $n_{ij}$ is the element in the ith row and jth column of the matrix N.
From equations (32) and (33), eliminating $Z_{C1}$ and $Z_{C2}$ gives the system (34), which can also be represented in algebraic form as (35). The system (34) or (35) is a linear system of four equations in three unknowns. In theory, a unique solution can be obtained directly. In practice, however, the extracted data may contain noise. The least-squares method is therefore used to solve for the world coordinates of point P, which is equivalent to minimizing the sum of squared distances to the two rays of the camera and the projector. Equation (34) is accordingly rewritten in matrix form, and the point P in the world coordinate system is determined as its least-squares solution.
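The least-squares step can be illustrated with the following Python sketch. It assumes the standard linear triangulation obtained by eliminating the depth factors from equations (32) and (33), and it uses NumPy's `numpy.linalg.lstsq` as the solver; the function name, the synthetic matrices M and N, and the test point are hypothetical and not taken from the calibration in the paper.

```python
import numpy as np


def triangulate(M, N, uv1, uv2):
    """Least-squares world point from a camera pixel and a projector pixel.

    M and N are the 3x4 matrices of equations (32) and (33). Eliminating
    the depth factors gives four linear equations in (X_W, Y_W, Z_W).
    """
    u1, v1 = uv1
    u2, v2 = uv2
    A = np.array([
        u1 * M[2, :3] - M[0, :3],
        v1 * M[2, :3] - M[1, :3],
        u2 * N[2, :3] - N[0, :3],
        v2 * N[2, :3] - N[1, :3],
    ])
    b = np.array([
        M[0, 3] - u1 * M[2, 3],
        M[1, 3] - v1 * M[2, 3],
        N[0, 3] - u2 * N[2, 3],
        N[1, 3] - v2 * N[2, 3],
    ])
    P, *_ = np.linalg.lstsq(A, b, rcond=None)
    return P


def project(Pm, P):
    """Apply a 3x4 projection matrix and divide by the depth factor."""
    h = Pm @ np.append(P, 1.0)
    return h[:2] / h[2]


# Synthetic setup: identical internal parameters, 0.18 m baseline along x.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
M = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
N = K @ np.hstack([np.eye(3), np.array([[-0.18], [0.0], [0.0]])])
P_true = np.array([0.1, -0.05, 2.0])
P_est = triangulate(M, N, project(M, P_true), project(N, P_true))
```

With noise-free correspondences the estimate reproduces the original point; with noisy data the same call returns the point that best satisfies all four equations in the least-squares sense.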

IV. RESULTS AND EXPERIMENT
The experiment is carried out to validate our method and to evaluate the performance of our system. In the experimental setup, the camera is placed at a distance of 200 cm from the measuring surface, and the camera and projector are 18 cm apart. We perform our experiment on three surfaces: 1) a simple plane surface, approximately 800 mm wide and 600 mm long; 2) a cylindrical surface with a radius of 150 mm and a height of 406 mm; and 3) a sculpture with a height of 223 mm, a width of 190 mm, and a depth of 227 mm. The standard deviation of the cylindrical surface is equal to its radius, which is 150 mm. The estimated standard deviation of the textured surface, i.e. the sculpture, is approximately 200 to 225 mm. To test our methodology, we form two projection patterns, each implemented with three symbols of size 16 × 16 pixels, so the measured resolution is about 18 mm (refer to table 8). The two patterns are formed from diagonally arranged squares (symbol 5), the filled square (symbol 7), the horizontal bar (symbol 22), and the right isosceles triangle (symbol 25). The only difference between the two patterns is that in the second pattern the diagonally arranged squares are replaced by the right isosceles triangle. To spread these symbols we use an RPRS of three alphanumeric bases with a size of 45 × 72 primitives and a window property of 3 × 3. The space between two consecutive symbols is 2 pixels. The textures of these two patterns are shown in figure 8. To decode these patterns on the measuring surface we apply the method discussed in the previous section. The four symbols used in these patterns are classified using shape description parameters. The parameters used and their corresponding threshold values are shown in table 6, which is a subset of table 10 in the appendix.
Since each pattern consists of three symbols, three shape description parameters may be enough to classify them. Shape description parameters such as rectangularity, eccentricity, aspect ratio, convex-to-area ratio, perimeter-to-area ratio, area function, and direction or orientation are sufficient to classify the symbols in both patterns used in the experiment. Using more shape description parameters reduces the chance of errors due to wrong classification.
In the first step, the rectangularity measure is used to separate the diagonally arranged squares or the isosceles right triangle from the two other symbols, which are the filled square and the horizontal stripe. The filled square and the horizontal stripe have a rectangularity ratio almost equal to 1, whereas the diagonally arranged squares or the isosceles right triangle have a value of only 0.53 (see table 6). After this first separation, further classification is made by using the eccentricity and the area function. Squares are separated from stripes using eccentricity: the eccentricity ratio of the horizontal stripe symbol is the highest, i.e. equal to 1, and that of the solid square symbol is the lowest, i.e. equal to 0. The area function provides an additional check, since the solid square symbol has the largest area while the horizontal stripe symbol has the lowest value of the area function, so the two can be easily differentiated.
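The two-stage classification described above can be sketched as a small decision rule in Python. The thresholds 0.75 and 0.5 are hypothetical midpoints chosen for illustration between the ideal values quoted in the text (rectangularity 0.53 versus about 1, eccentricity 0 versus 1); the actual thresholds are those listed in table 6, and the function and class names are assumptions.

```python
def classify_symbol(rectangularity, eccentricity,
                    rect_threshold=0.75, ecc_threshold=0.5):
    """Classify a symbol from two shape description parameters.

    rect_threshold and ecc_threshold are illustrative values only.
    """
    # Step 1: low rectangularity -> diagonal squares (pattern 1)
    # or isosceles right triangle (pattern 2).
    if rectangularity < rect_threshold:
        return "diagonal_or_triangle"
    # Step 2: among rectangular symbols, high eccentricity -> stripe.
    if eccentricity > ecc_threshold:
        return "horizontal_stripe"
    return "filled_square"


examples = [
    classify_symbol(0.53, 0.3),
    classify_symbol(1.0, 1.0),
    classify_symbol(1.0, 0.0),
]
```

A further check on the area function could be added after the second step to reject symbols whose area is inconsistent with the assigned class.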
Our method of symbol classification is first validated by applying the algorithm to the original pattern, for which 100% of the symbols are decoded. We then apply the method to patterns projected on the plane surface, the curved or cylindrical surface, and a surface with more texture, i.e. the sculpture. Table 7 shows the number of symbols or primitives decoded for both patterns used in the experiment on the different measuring surfaces. It is evident from table 7 that our decoding algorithm works well when compared to other methods such as Petriu et al. [24], Albiter et al. [28], Chen et al. [34], [35] and Wijenayake et al. [26]. Reference [24] was able to decode only 59% of the primitives and [28] was able to decode 95% of the primitives on the cylindrical surface, whereas we have decoded 97% of the primitives. Because it uses a monochromatic pattern with only two intensities, white and black, our decoding algorithm extracts more feature points than [26] and [34]. Since several shape description parameters are used to detect the symbols, no false detection or wrong classification of symbols has been observed. The 100% decoding rate on the original projection pattern indicates that the method is reliable. Figure 9 shows the detected and decoded primitives for the two patterns used in our experiment. It shows decoded primitives for 1) the original (projected) patterns, 2) patterns on the plane surface, 3) the curved cylindrical surface, and 4) the textured surface, i.e. the sculpture. To differentiate the symbols from one another, the centroid positions of each symbol class are represented with a different color. The centroid positions of the diagonally arranged small squares (in pattern 1) or the isosceles right triangle (in pattern 2) are represented with a blue star (*).
Similarly, the centroid positions of the solid filled squares are marked with a red star (*), whereas the centroid positions of the horizontal stripes are shown with a green star (*). The centroid positions of the symbols that could not be decoded or classified are represented by a pink plus (+) sign.
The direction or orientation of the measuring surface is calculated from two symbols in each pattern, i.e. from the diagonally oriented square symbol (in pattern 1) or the isosceles right triangle (in pattern 2), and from the horizontal stripe symbol (in both patterns). The direction information of a solid filled square is calculated from either of these two symbols in its neighborhood, as described in the previous section. The grid distance is also calculated as described in the previous section.
Our system performs well over a depth range of 40 to 250 cm. Table 8 shows the resolution obtained and the area covered on a plane surface at depths between 40 and 250 cm using our system, compared with the systems proposed by other researchers such as Albiter et al. [28], Chen et al. [34], [35] and Wijenayake et al. [26]. These resolutions are achieved with our experimental setup and projector. The resolution achieved with our method is significantly finer than that of the previous methods. We achieve a resolution of 1.9 mm at a depth of 40 cm, whereas [28] has an accuracy of 6 mm, [35] has 5.2 mm, and [26] has 3.7 mm.
The time durations of the different processes were measured on a typical Core i5 computer. The reported values are average times for each process; each process was optimized and run many times, so the best results are shown here. All times are measured in milliseconds. Table 9 shows the time durations of the different processes involved in decoding. The preprocessing time increases with the complexity of the surface, while it decreases as the number of detected primitives decreases. The number of decoded primitives is higher for the original pattern and the simple surface than for the complex or textured surfaces such as the sculpture or the cylinder. A similar trend is observed for processes such as labeling, computation of shape description parameters, classification of symbols, and establishment of correspondence. The matching rate increases with the complexity and texture of the surface: it is lower on the simple surface and higher on the textured and complex surfaces.
After applying the techniques described earlier, the 3D shapes of simple and complex surfaces, such as the cylindrical object and the sculpture, are measured at an accuracy level of 18 mm and are shown in figure 10.

V. CONCLUSION
In this paper, a novel single-shot method is introduced to generate a pixel-level design of the projection pattern for a structure light system (SLS) based on the spatial encoding technique. Unlike previous methods, our encoding technique is flexible and is designed and controllable up to the pixel level. We have proposed 25 geometrically shaped symbols that are well controlled in size. We have computed 10 shape description parameters for each symbol or alphabet, which enable us to use and decode up to 8 to 9 symbols in a single projection pattern, so more symbols can be used and decoded in the future. Instead of M-arrays, we use robust pseudo-random sequences (RPRS) to spread the symbols in a projection pattern. We present a comparatively easy and flexible technique to generate robust pseudo-random sequences of any required size while ensuring their robustness through the window property. The use of more symbols in a projection pattern will enhance the robustness of the RPRS. With our method, flexible projection patterns that are controllable up to the pixel level of the projector are implemented, and the projection patterns can be designed according to the required size of the surface area. With our technique, a large surface area can be covered in a single-shot arrangement. Owing to the control at the pixel level, the resolution of the measured 3D surface is also improved. The comparison with previous methods shows that the resolution of our system is significantly better. We have also implemented a new decoding method based on the grid distance between consecutive symbols.

APPENDIX
See Table 10.

JUN LU received the Ph.D. degree. He is currently a Professor and a Ph.D. Tutor with Harbin Engineering University. He has written four books. He has published more than 50 academic articles in national and international journals.
QI-DAN ZHU received the Ph.D. degree. He is currently a Professor and a Ph.D. Tutor with Harbin Engineering University. He holds six invention patents. He has published more than 100 high-quality academic papers in national core journals and academic conferences at home and abroad.
LI YONG received the master's degree in detective technology and automatic device from Heilongjiang University, in 2014. He is currently pursuing the Ph.D. degree with Harbin Engineering University (HEU). His current research interests include computer vision and deep learning. VOLUME 8, 2020