Design and Implementation of Remote Controlling System Using GAN in Optical Camera Communication

This article describes a control mechanism combined with data transmission using optical camera communication (OCC) in both line-of-sight (LOS) and non-line-of-sight (NLOS) environments. The transmitter mainly consists of a microcontroller (Arduino Mega), a DHT sensor, two LEDs, and four push-button switches. The two LEDs continuously transmit the control and data signals; the control signal is selected through the push-button switches. Four different control signals are transmitted from the LEDs, and data transmission is performed by collecting temperature and humidity readings from the surrounding environment. All transmitter operations are processed and controlled by the microcontroller through Arduino programming, and the COOK modulation technique is used. For data reception, we use a laptop camera operated at five different exposure times based on the rolling shutter effect. A logistic regression algorithm recognizes the data-transmitting LED from the normalized intensity of the stripe patterns. Using a deep neural network, the receiver can still collect data, under certain conditions, when the light source is obstructed by an object. In addition, a Generative Adversarial Network (GAN) is used as a decoder to reduce the BER in moving scenarios and NLOS conditions, and different activation functions are evaluated in the GAN model to find the optimum configuration for the system. We have achieved a maximum communication distance of 2.5 m with a BER of 10⁻⁴. The receiver program is written in Python 3.8.

Optical camera communication (OCC) can be one of the best solutions to those problems [1], [2]. OCC has some unique advantages over RF-based systems, such as freedom from harmful radiation, low cost, low interference, higher security, and high bandwidth. In OCC, an LED transmits data, and a camera (with a processor) captures images and decodes the data using the camera's rolling shutter effect. OCC has already been used in eHealth [3], indoor positioning [4], [5], V2X communication [6], etc. In [7], the authors collected data from a pulse oximeter sensor on the human body, but they did not consider transmitter motion and the system was quite unstable. A motion-control-based system is designed in [7], which uses a neural network to reduce BER; however, its transmitter is too large to be suitable for a remote control device. In previous applications, OCC has been used for localization or data communication. In [8], the authors propose a neural network (NN) based OCC system for V2V communication, but the communication distance is at the centimeter level. Many authors have designed OCC systems for indoor localization, but the BER is too high [5]. To the best of our knowledge, we are the first to introduce OCC for remote control device monitoring. Our goal is not only communication but also control of the entire system. To this end, we use two LEDs for data communication and system control in both line-of-sight (LOS) and non-line-of-sight (NLOS) cases. OCC is not restricted to LOS communication; some authors have already developed OCC for NLOS communication [9], [10]. In NLOS, the system's performance depends on many factors, such as the illumination of neighboring lights, the surrounding reflective surfaces, and interference from other sources. If the neighboring LEDs diffuse more luminance than the data-transmitting LED, NLOS data collection will not be efficient.
Rough reflective surfaces also hamper the system's performance. If an opaque object blocks the LOS of the transmitter, data collection becomes impossible in the presence of neighboring light sources; if the object is transparent, the data can still be collected easily. If the receiver can see at least 50% of the transmitter, data collection proceeds smoothly without affecting the system's performance. Data collection becomes challenging when the transmitter is fully covered by the object. To resolve this problem, we use a deep neural network technique based on feature extraction: the upcoming stripe pattern is predicted so that the data can be extracted even when the LED is fully blocked by an object. To the best of our knowledge, this is a novel idea in the field of NLOS OCC.
The introduction of NN equalizers has significantly improved visible light communication systems [11]. Although the field is at an early stage, some articles report impressive achievements but suffer from high computational complexity [12] and very poor generalization. In [13], the authors use an NN for data decoding under motion induced by waving a hand; however, their BER is quite high. On the contrary, neural blind deconvolution has been used to improve the BER from blurry images, but without considering a changing transmitter position [14]. Since our device is a remote controlling system, we have considered motion. To resolve the BER problem, we have included a Generative Adversarial Network (GAN) [15]. A GAN is mainly composed of two models: a generative model and a discriminative model. The generative model learns the data distribution, and the discriminative model estimates the probability that a sample came from the training data. Therefore, no Markov chains or unrolled approximate inference networks are needed for training or for generating samples. The designed GAN takes samples from the bright and dark stripe pattern images and estimates the real bit pattern. As a result, the BER is improved significantly even when the stripe patterns inside the images are captured in harsh conditions.
In the receiver, classifying the data-transmitting LED is another vital problem. There are huge numbers of LEDs in both indoor and outdoor environments [16]. Based on the rolling shutter effect, the receiver can decode random information from any high-frequency signaling LED, so it must identify the proper data-transmitting LED. Most OCC articles use ROI-based [17] LED detection and classification, but the accuracy is very poor, and ROI-based classification is not applicable to a moving system. Some authors have developed convolutional neural network algorithms for classifying LEDs or LED arrays, but the network complexity and computational time are too high [18]. To find the proper data-transmitting LED, we use a logistic regression based classification algorithm. The algorithm works on the normalized intensity of the captured image's stripe pattern: after thresholding and weight multiplication of the input data, it produces the proper output signal.
Some VLC systems can provide white-light illumination and communication simultaneously, offering OWC at minimal extra energy cost since the energy used for lighting is reused for communication [19]. Our proposed system transmits the signal from two LEDs simultaneously using the COOK modulation technique. The camera then captures images of the light source based on the rolling shutter effect and detects the LEDs using an NN. The data-transmitting LED is classified with the logistic regression algorithm, and the data is finally decoded using the GAN and processed further in the Python environment. If the LEDs are covered by an object, the stripe pattern is predicted using a deep neural network (DNN) technique. The main contributions of this article are summarized as follows: i) We have implemented a remote control system using LEDs as the transmitter and the device's camera as the receiver. ii) We have proposed a new frame format for our system, so that the transmitter is used simultaneously for data transmission and control; this also enhances the security of the system. iii) We have categorized different types of LEDs or LED groups based on the outcome of the rolling shutter effect using logistic regression. iv) A DNN algorithm using feature extraction is developed to enable NLOS data collection when the transmitter is blocked. v) We have used a GAN for data decoding to minimize the BER degradation due to slight motion of the transmitter. The rest of the article is organized as follows: Section II presents an overview of the proposed system design; Section III discusses how the data-transmitting LED is classified using logistic regression; Section IV explains the GAN operation; Section V analyzes the performance of the system; finally, Section VI concludes this work.

II. SYSTEM OVERVIEW
The unique difference between our OCC system and existing OCC systems is the controlling function and data collection from an NLOS transmitter. The entire system architecture is shown in Fig. 1. A laptop camera with controlled exposure time is operated as the receiver, while the transmitter is built from a microcontroller, a sensor, and a remote. The remote is composed of four push buttons and two LEDs. The signal is transmitted as an optical signal through both LEDs, which oscillate at a fixed frequency of 1.5 kHz.
The controlling is initiated on the transmitter side but executed on the receiver side after the stripe pattern is detected. In the data frame, the information signal and control signal are transmitted together as light through the LEDs. The control signal is a list of symbols; those symbols are converted into binary bits and combined in the data frame. After modulation and encoding, the data is transmitted from the LED at a particular frequency, and the control signals are decoded from the laptop camera through image processing and AI algorithms. In the LOS case, the stripe pattern is clearly seen by the receiver, but camera motion, mobility, channel effects, noise, and interference may deform the output images or reduce them to scribbles. To recover the stripe pattern in those situations, we introduce the GAN decoder. In the NLOS case, the LED may be partially visible or not visible at all. When the transmitter is fully covered and the emitted light beam's width is less than the obstructer's width, the stripe pattern cannot be seen directly; in that case, we apply the DNN by extracting features from the image. The controlling mechanism in the LOS and NLOS cases can be differentiated by the process involved: in the LOS case, the GAN is executed without the DNN, whereas in the NLOS case, both the GAN and the DNN are executed.

A. Transmitter
The transmitter circuit is mainly composed of two 5 mm LEDs (both performing the same function), a temperature and humidity sensor (DHT11), push-button switches, and a microcontroller. The LEDs continuously transmit the information and control signals in real time. For the information, the DHT11 sensor collects temperature and humidity data from the surrounding environment. In addition, we use four different control codes (symbol codes) that are transmitted through the LEDs; they execute four corresponding tasks: locking, sleeping, shutting down, and restarting the PC. The control signals are mixed with the information to form a single data format, although every frame may not be the same. The data frame structure is shown in Fig. 2.
Every frame contains a header, a payload (information and control), a synchronization key, and a tail. The header and tail contain 4 bits each. The payload of every frame has two parts: an information payload and a control payload. The information payload contains the information frame number, the information type (humidity or temperature), and the corresponding value. The control payload contains the control frame number and the control code. The synchronization key of every frame consists of 8 bits of the same information, placed as two 4-bit halves in two different locations. If the reassembled 8-bit (4-bit + 4-bit) unique key in a frame does not match, the collected signal is regarded as an unwanted signal. Every data frame must be contained in a single image frame, but the time delay between the two LEDs may cause a frame mismatch. To solve this problem, we use the frame number: the receiver operation is executed only if the frame number in both data frames is the same. An Arduino Mega microcontroller processes all the functions, collecting the information and modulating it with the COOK modulation [17] technique.
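The frame handling described above can be sketched as follows. The concrete bit patterns chosen for the header, tail, and synchronization key, and the payload field widths, are illustrative assumptions; the paper only fixes the 4-bit header/tail and the 4-bit + 4-bit split key.

```python
# Sketch of the frame format of Fig. 2: header | sync half | info payload |
# control payload | sync half | tail. Bit patterns below are assumed values.
HEADER, TAIL = "1010", "0101"   # assumed 4-bit header and tail
SYNC_KEY = "11001100"           # assumed 8-bit key, transmitted as 4 + 4 bits

def build_frame(info_bits: str, ctrl_bits: str) -> str:
    """Assemble one frame with the sync key split around the payloads."""
    return HEADER + SYNC_KEY[:4] + info_bits + ctrl_bits + SYNC_KEY[4:] + TAIL

def parse_frame(frame: str, info_len: int, ctrl_len: int):
    """Return (info, ctrl) if header, tail, and reassembled key match, else None."""
    if not (frame.startswith(HEADER) and frame.endswith(TAIL)):
        return None
    body = frame[len(HEADER):-len(TAIL)]
    first, rest = body[:4], body[4:]
    info = rest[:info_len]
    ctrl = rest[info_len:info_len + ctrl_len]
    second = rest[info_len + ctrl_len:]
    if first + second != SYNC_KEY:   # mismatched key -> unwanted signal
        return None
    return info, ctrl
```

A frame with a wrong header or a tampered sync half is rejected, which is how the receiver discards unwanted signals.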
In the DHT11 sensor, the collected binary bits are verified through a checksum mechanism. After the binary data bits are confirmed, they are converted into integers for further processing. The temperature and humidity values are combined with their corresponding characters and converted into a binary stream. Header bits are added at the beginning of the binary stream, which is then modulated and encoded using the COOK and Manchester coding techniques, respectively. Manchester encoding was originally introduced to encode the clock together with the binary data of a synchronous bit stream. This processing is also performed in the microcontroller through programming. The ON and OFF pattern of the LED depends on the modulation technique being used. For flicker-free communication, the LED needs to oscillate at a high frequency with proper synchronization between the transmitted symbols and the camera frame rate.
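Manchester coding maps each data bit to a pair of half-bit chips so that the clock is embedded in the stream. A minimal sketch follows; the bit-to-chip polarity convention (1 → high-low, 0 → low-high) is an assumption, as the paper does not state which convention it uses.

```python
def manchester_encode(bits: str) -> str:
    # Assumed convention: "1" -> "10" (high-low), "0" -> "01" (low-high)
    return "".join("10" if b == "1" else "01" for b in bits)

def manchester_decode(chips: str) -> str:
    out = []
    for i in range(0, len(chips), 2):
        pair = chips[i:i + 2]
        if pair == "10":
            out.append("1")
        elif pair == "01":
            out.append("0")
        else:
            # "00"/"11" never occur in a valid stream: flag a decoding error
            raise ValueError("invalid Manchester pair: " + pair)
    return "".join(out)
```

Because every bit produces a transition, the receiver can recover timing even during long runs of identical bits.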

B. Receiver
In the receiver, we use a 30 fps device camera to capture images of the optical signal. The transmitted signal is captured through the rolling shutter effect at five different exposure times. The receiver processes its operation frame by frame, adjusting the image size. The captured image is converted from RGB to grayscale. Since the image may contain noise, a Gaussian blur with a 5 × 5 kernel is applied. After the blur, adaptive Gaussian thresholding is used at a certain threshold level, which classifies the image into two levels; adaptive Gaussian thresholding performs better under varying illumination than binary thresholding. After binarization, LED detection is executed by the NN. After classifying the LED regions, the designed algorithm searches for the proper data-transmitting LED region with the help of logistic regression. The selected region contains stripe patterns carrying the modulated signal, from which the normalized intensity is extracted. A threshold is then applied over fixed intervals so that the binary bit pattern can be estimated. The bit pattern may still contain errors, which can occur due to motion; for this reason, we use a GAN decoder that minimizes the effect of motion and yields a lower BER in both static and moving conditions. After the proper bit pattern is extracted, the header and tail are identified, and after demodulation the information and control signals are stored for further operation.
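The thresholding step that turns the stripe pattern's normalized intensity into bits can be sketched as below. The rows-per-bit interval and the 0.5 threshold are assumptions; in practice both would be tuned to the exposure time and stripe width.

```python
import numpy as np

def stripes_to_bits(column, samples_per_bit, threshold=0.5):
    """Decode a COOK stripe column: normalize the intensity, average it over
    each fixed bit interval, and threshold the mean to a 0 or 1."""
    col = np.asarray(column, dtype=float)
    span = col.max() - col.min()
    col = (col - col.min()) / (span + 1e-9)   # normalized intensity in [0, 1]
    n = len(col) // samples_per_bit
    means = col[:n * samples_per_bit].reshape(n, samples_per_bit).mean(axis=1)
    return "".join("1" if m > threshold else "0" for m in means)
```

For example, a column of four bright rows, four dark rows, and four bright rows decodes to "101" when each bit spans four rows.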
After binarization, LED detection is performed by the NN. Various light sources can be present in an image frame through reflections and unwanted light signals; the NN classifies the proper data-transmitting LED. The designed NN takes an input image of size 300 × 300. In the input image frame, a certain region is marked as the output label. The input region is flattened into a one-dimensional pixel vector and processed in the hidden layers after multiplication with the weight matrices. We use the SGReLU activation function in all three hidden layers and a softmax activation function in the final output layer. The initial weights are drawn from random values and updated at each step of backpropagation; when the iterations are completed, the final weight configuration is stored for use at test time. After classifying the LED regions, the algorithm searches for the proper data-transmitting LED region using logistic regression, discussed in Section III.
The selected region contains the stripe pattern carrying the modulated signal, from which the normalized intensity is extracted. A threshold is then applied over fixed intervals so that the binary bit pattern is estimated. The bit pattern may contain errors, which can occur due to motion; for this reason, we use a GAN-based decoder that minimizes the effect of motion and yields a lower BER in both static and moving conditions, as discussed in Section IV. After the proper bit pattern is extracted, the header and tail are identified, and after demodulation the information and control signals are stored for further operation. The entire receiver operation is summarized as a block diagram in Fig. 3. For control purposes, we have designed four conventional subprograms that are operated through the main program in Python 3.8. The subprograms shut down, restart, lock, and put the laptop to sleep. Whenever a decoded control code matches one of the conditions, the corresponding subprogram is executed; the operating system has predefined functions for those operations.
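A minimal sketch of the control dispatcher follows. The 4-bit control codes are hypothetical, and placeholder strings stand in for the operating system's predefined lock/sleep/shutdown/restart calls, which a real receiver would invoke here.

```python
# Assumed mapping from decoded 4-bit control codes to subprograms; the
# actual codes used by the system are not specified in the text.
CONTROL_ACTIONS = {
    "0001": "lock",
    "0010": "sleep",
    "0011": "shutdown",
    "0100": "restart",
}

def dispatch(control_code: str) -> str:
    """Select the subprogram matching the decoded control code."""
    action = CONTROL_ACTIONS.get(control_code)
    if action is None:
        return "ignored"   # unknown code: treat as an unwanted signal
    return action          # placeholder for the real OS call
```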

C. Data Extraction From Hindrance Transmitter
Light naturally scatters; if there is an obstacle in front of the light source, part of the light is reflected back and diffused around it. To collect data in that condition, we must consider whether the transmitter is fully covered. If the transmitter is fully covered, the emitted beam's width relative to the obstructer's width determines whether data collection is possible, and the camera's angle of view must satisfy the same condition with respect to the obstructer. The captured images are controlled by five different exposure times: 15 ms, 2 ms, 100 μs, 25 μs, and 10 μs. Decreasing the exposure time diminishes the background illumination of the light source, as shown in Fig. 4. We therefore take images in which most of the transmitter is covered by an object. Features are extracted and reduced from those images together with their corresponding outputs, and the algorithm maps those results to the test set. During testing, the cloaked transmitter is taken as the input image, and features are extracted from the images using the singular value decomposition (SVD) technique. First, the image is converted into an m × n matrix using the PIL library in Python; note that the input image size is kept fixed. The matrix is then decomposed into three new matrices: two orthonormal and one diagonal. Mathematically, the input matrix can be expressed as A = ODN^T, where O and N are orthonormal matrices and D is a diagonal matrix of singular values with the same rank as the input matrix. The eigenvectors are arranged to form the orthonormal matrices so that vectors with higher eigenvalues come before those with smaller ones. To find the composition of O and N, assume the rank of matrix A is r; A^T A is a symmetric matrix whose orthonormal eigenvectors are the n_j.
The product o_i^T o_j can be calculated as o_i^T o_j = (A n_i / σ_i)^T (A n_j / σ_j), from which we can see that o_i and o_j are built from the eigenvectors of A^T A and are orthogonal to each other. To implement the SVD algorithm in Python, we use the SciPy library with the conjugate gradient method. SVD constitutes a bridge between the features found in the input test image and the deep neural network.
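The SVD feature extraction can be sketched with NumPy as below. The paper uses SciPy's solver; `np.linalg.svd` is a stand-in, and keeping the ten largest singular components is an assumption chosen to mirror the ten principal components used later.

```python
import numpy as np

def svd_features(image_matrix, k=10):
    """Decompose an m x n image matrix as A = O D N^T and keep the k largest
    singular values and vectors as features (k = 10 is an assumed choice)."""
    O, d, Nt = np.linalg.svd(image_matrix, full_matrices=False)
    return O[:, :k], d[:k], Nt[:k, :]

# Toy 64 x 48 "image" matrix standing in for a fixed-size grayscale frame
A = np.random.default_rng(0).random((64, 48))
O, d, Nt = svd_features(A, k=10)
A_rank10 = O @ np.diag(d) @ Nt   # rank-10 approximation of the image
```

The singular values come back sorted in descending order, matching the paper's remark that vectors with higher eigenvalues come first.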
The image is large and contains a huge number of pixels, but most of the pixels carry zero information because of the exposure time control. For this reason, we reduce the dimension of the array to lower the computational complexity while retaining valuable features. The SVD algorithm finds the features in the images, and principal component analysis (PCA) reduces the feature size. First, the data set is standardized using StandardScaler() in Python, which converts pixel values in the range 0-255 to standard values using the mean and standard deviation; this step matters because PCA is quite sensitive to the variance of the initial variables and can otherwise yield biased results. We then compute the covariance matrix, a symmetric matrix over the n variables, to find the relationships within the dataset; highly correlated variables contain redundant information. Whether two variables are correlated or inversely correlated depends on whether their covariance is positive or negative, respectively. Next, we identify the principal components from the covariance matrix by computing its eigenvectors and eigenvalues. The principal components are new variables constructed as linear combinations of the initial variables, chosen so that the new variables are uncorrelated and most of the information in the initial variables is compressed into the first components. In our case, we assume that ten principal components capture the maximal amount of variance. We then form the feature vector containing the most significant information by discarding components with low eigenvalues; the feature vector is simply a matrix whose columns are the eigenvectors of the components we decide to keep.
Finally, the reduced data set is created by multiplying the transpose of the feature vector matrix with the transpose of the standardized initial data set matrix. This final matrix serves as the input neurons of the DNN to produce the predicted output.
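The standardize/covariance/eigendecomposition/projection chain above can be sketched as follows. The toy data shape and the manual StandardScaler analogue are assumptions for illustration.

```python
import numpy as np

def pca_reduce(X, n_components=10):
    """Standardize, form the covariance matrix, eigendecompose, and project
    onto the leading eigenvectors (the feature vector matrix)."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)  # StandardScaler analogue
    cov = np.cov(Xs, rowvar=False)                      # symmetric covariance
    vals, vecs = np.linalg.eigh(cov)                    # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_components]       # keep the largest first
    feature_vector = vecs[:, order]                     # columns = kept eigenvectors
    return Xs @ feature_vector                          # reduced data set

X = np.random.default_rng(1).random((100, 40))   # toy 100-sample feature matrix
Z = pca_reduce(X, n_components=10)               # ten principal components
```

By construction the resulting components are uncorrelated, which is the property the text relies on.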
The designed DNN is composed of an input layer, three deep hidden layers, and an output layer. The Leaky ReLU activation function is used in the hidden layers and the Softmax activation function in the output layer. The network uses mean square error as the loss function, an Adam-optimization-based backpropagation algorithm, and Xavier weight initialization. The other parameters and architecture are the same as in a conventional DNN, so we give only a very short description to avoid redundancy.
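A forward pass through such a DNN might look like the sketch below. The hidden-layer widths and the class count are illustrative assumptions; Leaky ReLU, softmax, and Xavier initialization follow the description above.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    e = np.exp(x - x.max())          # shift for numerical stability
    return e / e.sum()

def xavier(rng, fan_in, fan_out):
    """Xavier/Glorot uniform initialization."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def dnn_forward(x, weights):
    """Three Leaky-ReLU hidden layers followed by a softmax output layer."""
    for W in weights[:-1]:
        x = leaky_relu(x @ W)
    return softmax(x @ weights[-1])

rng = np.random.default_rng(2)
sizes = [10, 32, 32, 32, 4]          # assumed: 10 PCA features in, 4 classes out
weights = [xavier(rng, a, b) for a, b in zip(sizes[:-1], sizes[1:])]
probs = dnn_forward(rng.random(10), weights)
```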

III. LED CLASSIFICATION USING LOGISTIC REGRESSION
Real-time data decoding and thresholding can produce wrong information when the LED oscillates at a fixed frequency or at multiple frequencies. Therefore, we use logistic regression to distinguish the data-transmitting LED from non-data-transmitting LEDs. First, the image is converted into grayscale and then binarized with binary thresholding. The image is then processed with a Gaussian filter, intensity gradients, and non-maximum suppression to find the edges of different shapes. The program then searches for circular shapes using approxPolyDP, which returns the number of vertices of the approximating polygon; if a polygon contains more than 8 vertices, it is regarded as a circle or an ellipse. An image can contain several circular objects, but their areas differ. After those operations, contour-based RoI detection [20] is executed in Python. By estimating the normalized distribution of pixel intensity, we can recognize and detect the LED(s). The bounding box of the detected region is circular. A constant value is added to a surface point of the bounding box such that the line connecting the new point with the center of the circle is parallel to the vertical axis of the image.
From the new point, the perpendicular distribution of pixels in the image is taken. That data is the input of the logistic regression, with a corresponding output of one or zero. The coefficients of the logistic regression algorithm are estimated during training using maximum-likelihood estimation. For the data-transmitting and non-data-transmitting LED cases, the prediction result is close to zero and one, respectively. The intuition behind maximum likelihood for logistic regression [21] is that a search procedure seeks coefficient values that minimize the error between the probabilities predicted by the model and those in the data. After the proper coefficients are found, the normalized values are tested through a sliding window technique. The window size remains fixed in every frame; if it is varied, the error probability increases. Fig. 5 shows the operation of data-transmitting LED classification through logistic regression.
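The classification step can be sketched as a small maximum-likelihood logistic regression. The gradient-ascent trainer, the toy intensity features, and the learning-rate settings are assumptions for illustration; the class separation in the toy data stands in for the intensity difference between striped and non-striped regions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit coefficients by gradient ascent on the log-likelihood, a simple
    stand-in for the maximum-likelihood estimation described in the text."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        w += lr * X.T @ (y - p) / len(y)   # gradient of the log-likelihood
        b += lr * (y - p).mean()
    return w, b

# Toy normalized-intensity features: one class of windows is bright (label 1),
# the other dim (label 0). Both distributions are assumed for this sketch.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.8, 0.05, (50, 5)), rng.normal(0.2, 0.05, (50, 5))])
y = np.r_[np.ones(50), np.zeros(50)]
w, b = train_logistic(X, y)
```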

IV. GENERATIVE ADVERSARIAL NETWORK DECODER
Due to mobility, the image may blur [14] or its stripe pattern may deform [18], making data decoding very difficult. To improve the performance of the designed system, we introduce a GAN, which is able to train on images captured in harsh conditions. Camera motion, mobility, channel effects, noise, and interference may deform the output images or reduce them to scribbles; the designed network tries to produce the best-quality images, so the decoding becomes reliable with minimum BER. Fig. 6 shows the architecture of our GAN decoder. The network is mainly composed of two parts: the generative network, which produces fake images from random noise, and the discriminative network, which compares and processes real and fake images and produces the output. In the discriminator, the dimension of the input images is 64 × 64 × 3. The generator maps 100-dimensional noise to 8192-dimensional intermediate features through three fully connected layers. The program then reshapes the linear features to a spatial extent of dimension 4 × 4 × 512. Those features are processed through four transpose convolution layers to produce the output; three layers use the Leaky ReLU activation function with a slope of 0.25, and the Tanh activation function is used in the output layer.
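The generator's spatial growth from the 4 × 4 × 512 reshaped features to the 64 × 64 × 3 output can be checked with the transposed-convolution size formula. The kernel/stride/padding values (4, 2, 1) are assumptions consistent with a DCGAN-style stack; the paper does not state them.

```python
def tconv_out(size, kernel=4, stride=2, pad=1):
    """Output size of a transposed convolution:
    (in - 1) * stride - 2 * pad + kernel."""
    return (size - 1) * stride - 2 * pad + kernel

size = 4              # spatial side after reshaping the 8192-dim FC output
for _ in range(4):    # four transpose-convolution layers: 4 -> 8 -> 16 -> 32 -> 64
    size = tconv_out(size)
```

Each layer doubles the spatial side, so four layers take the 4 × 4 feature map to the 64 × 64 output that the discriminator consumes.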
Batch normalization (BN) is applied after the fully connected layers and all the convolution layers except the final output. A fully connected layer has a finite number of neurons; it takes an input vector and returns another one. In general, considering the j-th node of the i-th layer, we have a_j^(i) = ψ(Σ_k w_jk^(i) a_k^(i-1) + b_j^(i)), where ψ is the activation function, w_jk^(i) are the weights, and b_j^(i) is the bias. In the discriminator, the image features are extracted to a dimension of 4 × 4 × 512 in the convolution layers through the SGReLU activation function and BN. In a convolutional layer, we apply convolutional products, using many filters, to the input, followed by an activation function ψ.
For filter j of layer i we thus have z_j^(i) = ψ(W_j^(i) * a^(i-1) + b_j^(i)), where * denotes the convolutional product. The convolution layers' processed features are then flattened to a feature vector with 8192 dimensions, which is processed through the FC layers to produce the output. The SGReLU activation function with α = 0.2 is used in the three FC layers, and the sigmoid activation function is applied in the output layer. We also use max pooling for down-sampling: each CONV(x, y, 2) layer is replaced with an equivalent CONV(x, y, 1) layer followed by max pooling with a pool size of 2 × 2. This method supports a universal decoder that can decode data in different conditions as well as under different modulations, such as COOK, UFSOOK, PWM, PPM, UPSOOK, UPAMSM, WDM, and UQAMSM.
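The 2 × 2 max-pooling down-sampling step that replaces the stride-2 convolution can be sketched in NumPy as:

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a single-channel feature map,
    halving each spatial dimension."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]                     # drop odd trailing rows/columns, if any
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
```

Taking the maximum over each 2 × 2 block keeps the strongest activation while halving the spatial extent, the same effect as the stride-2 convolution it replaces.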

V. PERFORMANCE ANALYSIS
In this section, the overall performance of the designed remote control system is analyzed in both LOS and NLOS conditions. The entire experimental setup with the data decoding mechanism is shown in Fig. 7. The system is designed for indoor scenarios, where we assume only slight motion; therefore, a normal walking speed was used in our experiment. We moved the transmitter back and forth and also changed our position within a 0.5-1.5 m range. The frame rate and ISO are 30 fps and 550, respectively. We examined our system in both static and moving scenarios. In a normal OCC system, problems arise when the transmitter is moved from its original position or when an object appears in front of it; in addition, noise, interference, and channel effects may deform the output rolling shutter images.
These effects increase the BER, but we have minimized them by including the GAN. The GAN is a powerful class of network mainly used for image synthesis; its generator and discriminator map features through convolution and fully connected layers. After the rolling shutter images are fed to the GAN, high-resolution output images are produced. If we change the distance or the input power, the grayscale value of the light decreases, which can create problems during data extraction; therefore, we use the binarization method and increase the grayscale value. However, the illuminated area shrinks as the power level decreases, and bits missing due to the lower power level cannot be reconstructed properly. We achieved an optimal BER of 10⁻⁴ at a 1.6 m distance. The proposed decoder also performs better when the receiver cannot fully see the LED in certain conditions; the GAN decoder shows maximum performance when the DNN produces the proper predicted output. Training on around 1000 images took 14 hours for the DNN and 12 hours for the GAN. We performed validation on both a laptop and a server: on the server, the DNN runs at 0.014 s/image and the GAN at 0.009 s/image; on the laptop, the DNN runs at 1.2 s/image and the GAN at 1 s/image. During the experiment we used a laptop, on which the DNN took 1.35 s/image and the GAN 1.12 s/image. Fig. 8 presents the data decoding accuracy when only parts of the LED are visible. In the LOS case, the transmitter is fully and clearly visible, and data can easily be decoded from its stripe pattern; we take this fully visible transmitter size as 100%. In this situation, at any distance from 0.4 m to 2 m, the data decoding accuracy is 100%. In the NLOS case, the transmitter may or may not be visible; suppose some part of the transmitter is visible.
In that case, we consider 50, 40, 30, 20, and 10 percent of the transmitter being seen by the receiver (camera). From those partial views, the stripe pattern is reconstructed using the DNN and the data can easily be decoded. Finally, when the LED is fully covered by an obstacle but the emitted beam's width is greater than the obstructer's width, we can still reconstruct the image and collect data, although the accuracy decreases. The best performance is found within a 0.5 m distance, with an accuracy of around 70% in the fully NLOS case. Fig. 8 shows the results for both the LOS and NLOS cases. Decreasing the exposure time increases the data decoding accuracy in the LOS case: at higher exposure times the background illumination is higher, while at lower exposure times the surroundings get darker, so the stripe pattern of the LED becomes more visible and easier to reconstruct.
On the other hand, when the LED cannot be seen from the receiver in the fully NLOS case, decreasing the exposure time makes the partial illumination created by the data-transmitting LED undetectable. By increasing the exposure time to a certain value, that partial illumination becomes visible, and we can then reconstruct the stripe pattern using the DNN and decode the data. The best NLOS performance is found within a 0.5 m distance, with an accuracy of around 70%.
However, we have achieved a maximum communication distance of 3 m with motion support. The BER depends not only on motion but also on distance: if we decrease the power of the transmitted signal, the BER increases. The relationship between illumination and BER is shown in Fig. 9. From the figure, we can see that at the highest power the LED illuminates at a maximum of 800 lux, which results in a BER lower than 10^-5. Because we used a very small LED, the effect of its illumination (power) on the image sensor decreases as the distance increases. Since we used a high modulation frequency, the stripe patterns are very narrow and sharp, and when we zoom into the image to extract data, these sharp stripes cannot be detected properly. At 1.2 m - 2 m distances, the very sharp stripe pattern cannot be visualized or is treated as noise, and as a result the BER increases.
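For reference, the BER figures quoted throughout are simply the fraction of received bits that differ from the transmitted bits. A minimal helper (our own illustrative code, not the paper's) makes the metric explicit:

```python
def bit_error_rate(tx_bits, rx_bits):
    """Fraction of received bits that differ from the transmitted bits."""
    assert len(tx_bits) == len(rx_bits), "bit streams must align"
    errors = sum(t != r for t, r in zip(tx_bits, rx_bits))
    return errors / len(tx_bits)

tx = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
rx = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]  # one flipped bit
print(bit_error_rate(tx, rx))  # 0.1
```

To verify a BER of 10^-4 experimentally, on the order of 10^5 or more bits must be compared so that at least a handful of errors are observed.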
We have used a GAN in the receiver to improve the BER performance. The GAN consists of two main parts: a generator that creates fake samples from a random signal, and a discriminator that distinguishes real from fake data to produce the proper output. In the convolutional layers, fully connected layers, and output layer, we used different activation functions to evaluate the model's performance: LReLU, SGReLU, ReLU, Swish, Softmax, and ELU. The combination of SGReLU [22] and LReLU gives better results than the other activation functions in the motion scenario, as shown in Fig. 10. SGReLU has unique advantages over other activation functions, as it mitigates the vanishing gradient problem, neuron death, and output offset.
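For concreteness, the standard activation functions compared above can be written in a few lines of NumPy. SGReLU is omitted here because its definition comes from [22] and is not reproduced in this article; the code below is a generic sketch, not the trained model.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def lrelu(x, a=0.01):
    # Leaky ReLU: small slope `a` for negative inputs avoids dead neurons
    return np.where(x > 0, x, a * x)

def elu(x, a=1.0):
    # ELU: smooth exponential tail for negative inputs
    return np.where(x > 0, x, a * (np.exp(x) - 1.0))

def swish(x):
    # Swish: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - np.max(x))  # shift for numerical stability
    return e / e.sum()

x = np.array([-2.0, 0.0, 3.0])
print(lrelu(x))  # [-0.02  0.    3.  ]
```

In a GAN, LReLU is a common choice for the discriminator precisely because it keeps a nonzero gradient for negative pre-activations, which is consistent with the neuron-death advantage claimed for SGReLU above.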
For this reason, it achieves 89.56% accuracy. Moreover, the number of hidden layers affects the error correction accuracy: increasing the number of hidden layers raises the network's complexity and computation time while decreasing accuracy. In addition, the LED region occupies only a small part of the image, so we crop that region and, after increasing its resolution, feed it to the network. The optimal input size is taken as 64 × 64 × 3; decreasing this size causes problems in data decoding.
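The crop-and-upscale preprocessing described above can be sketched with nearest-neighbour resampling in plain NumPy. The function name, box convention, and resampling choice are our own illustrative assumptions; the paper does not specify its interpolation method.

```python
import numpy as np

def crop_and_resize(img, box, out=64):
    """Crop the LED region given by box = (top, left, height, width)
    and upscale it to out x out via nearest-neighbour resampling.
    Works for grayscale (H, W) or color (H, W, C) arrays."""
    t, l, h, w = box
    patch = img[t:t + h, l:l + w]
    rows = np.arange(out) * h // out  # map output rows to source rows
    cols = np.arange(out) * w // out  # map output cols to source cols
    return patch[np.ix_(rows, cols)]

frame = np.arange(100).reshape(10, 10)
patch = crop_and_resize(frame, (2, 2, 4, 4), out=64)
print(patch.shape)  # (64, 64)
```

For a color frame the same call yields a 64 × 64 × 3 array, matching the network input size stated above.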
Interference may be created by surrounding light sources or by the reflection of a neighboring light source on a smooth surface. By controlling the exposure time, the effect of interfering non-data-transmitting light can mostly be resolved. We examined our system by controlling the exposure time in the range of 31.3 ms - 0.93 ns in a Python environment. If a non-data-transmitting light source is within the camera's FOV, the logistic regression classifies that source and removes that portion of the image, then focuses on the data-transmitting light source to extract data. Here, we applied the COOK demodulation technique to extract data from the dark and bright stripe pattern. The logistic regression is applied mainly to the normalized intensity of the image signal. Fig. 11 shows that the detection accuracy decreases as the number of stripes increases; however, the effect is not severe, and the accuracy is 99% for our system at the maximum distance of 2 m.
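The COOK demodulation of a rolling-shutter stripe image reduces to averaging each image row, normalizing the profile, and thresholding groups of rows into bits. The sketch below assumes a 0.5 threshold and a hypothetical `rows_per_bit` parameter; the actual stripe width per bit depends on the modulation frequency and exposure settings, which the demonstration values here do not reproduce.

```python
import numpy as np

def cook_demodulate(img, rows_per_bit=2):
    """Recover OOK bits from a rolling-shutter stripe image: average each
    row, normalize to [0, 1], then threshold groups of rows_per_bit rows."""
    profile = np.asarray(img, dtype=np.float64).mean(axis=1)
    lo, hi = profile.min(), profile.max()
    norm = (profile - lo) / (hi - lo + 1e-9)  # normalized intensity
    bits = []
    for i in range(0, len(norm) - rows_per_bit + 1, rows_per_bit):
        bits.append(int(norm[i:i + rows_per_bit].mean() > 0.5))
    return bits

# Bright stripes -> 1, dark stripes -> 0 (two image rows per bit)
img = np.array([[200] * 4] * 2 + [[20] * 4] * 2 + [[210] * 4] * 2)
print(cook_demodulate(img))  # [1, 0, 1]
```

The normalized row-intensity profile computed here is also the kind of feature a logistic regression classifier can operate on when deciding which light source in the frame is actually transmitting data.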
On the other hand, the computation time and system complexity of our system are compared with others in Table I. We have observed that our data-transmitting LED recognition and classification technique shows better results than SVM, DFT, and CNN approaches.

VI. CONCLUSION
This article proposed and implemented a remote control communication system using optical camera communication in both LOS and NLOS cases. The transmitter circuit was designed with four push buttons to generate the control signals. An Arduino Mega was used to control and process the transmitter functions and was integrated with temperature and humidity sensors for data transmission. For data reception, a laptop's camera was used, controlled through different exposure times and operated based on the camera's rolling shutter effect. A logistic regression algorithm was applied to recognize the data-transmitting LED from its stripe pattern. In NLOS conditions, the bit pattern was predicted from the partial stripe pattern using a DNN. To handle slight movement, a Generative Adversarial Network based decoder was developed to reduce the BER. The designed system worked smoothly in real time. Although the system was designed to control a laptop, the program can control any kind of device through a CCTV camera. As our main goal is control of the system, the data transmission rate is low, around 0.65 kbit/s. Our future goal is to enhance the data rate for commercial purposes.