A Demodulation Method Using a Gaussian Mixture Model For Unsynchronous Optical Camera Communication With on-off Keying

We consider optical camera communication (OCC) between a camera receiver with optical lenses and distributed transmitters. This article investigates the features of an OCC system when the periods of reception and transmission are slightly different from each other. We describe a received light signal model for the OCC system with on-off keying and regard the received signals as generated from a Gaussian mixture model. We obtain the parameters of the probability distributions by applying a variational Bayesian inference method and utilize them for channel estimation. In addition, we define cost functions and minimize them to demodulate the transmitted bit sequences. The demodulation procedure uses a maximum-likelihood sequence detection method, which can be implemented by the Viterbi algorithm, and estimates a synchronization parameter by minimizing the cost functions. Our new demodulation method requires neither synchronization devices nor training sequences for estimating the parameters. Moreover, the receiver does not need the precise transmission period, which is difficult to know in advance in practical situations because of the frequency tolerance of the clock generator in the transmitter. To validate our developed method, we conducted numerical simulations and compared the results with those from an oracle estimator that knows the parameters other than the bit sequence in advance. We also conducted experiments in a real setup, and the results show the efficiency of our developed method.

light-emitting diodes (LEDs). Indoor VLC, which uses room lights from LEDs as a transmitter, can provide a fast and secure communications system because of the massive number of LEDs and controllability of the illumination area [1]-[4]. In addition, great effort has been made to apply VLC to intelligent traffic systems (ITSs) for safer driving and more efficient traffic navigation [5]-[9]. Since both of the above applications will utilize LEDs that are already available for room and automotive lighting and traffic lights, installation costs are expected to be low compared to conventional radio-frequency (RF) wireless communications systems. Furthermore, as visible light currently has no legal restrictions involving bandwidth allocation in many countries, we can freely use VLC in many situations. One type of VLC is optical camera communication (OCC). OCC employs a camera as a receiver and gathers data from distributed devices that have LEDs or displays as transmitters [10]-[12]. Recent advances in cameras with image sensors (ISs) have opened up wide possibilities for OCC systems. Thanks to the receiver's optical lenses, emitted lights are spatially separated on the IS, and the massive number of pixels enables the camera receiver to detect the positions of objects. This enables the receiver to trace moving transmitters and communicate with multiple transmitters simultaneously. By taking advantage of these features, OCC systems have been applied to vehicle-to-infrastructure and vehicle-to-vehicle communications in ITSs [5]-[9], positioning systems [13], [14], multichannel acoustic measurement for beamforming via optical wireless microphones [15], [16], and simultaneous biosignal observation of audiences in live concerts [17].
One major issue facing OCC systems is the limited sampling period, i.e., the reception period of ISs compared to that of photodetectors. This limitation complicates synchronization between the receiver camera and transmitters [18]. Undersampled frequency shift on-off keying [19] and undersampled phase shift on-off keying [20] are sophisticated methods to overcome this synchronization issue; however, these methods use very short exposure times and thus need sufficiently bright LEDs as transmitters, which is unfavorable in terms of power consumption. Pablo et al. adopted an infrared interface that broadcasts the master clock to the transmitters to synchronize the shutter timing of the camera receiver and the bit transition timing of the LED transmitters [15]; however, the additional interface is not desirable in terms of the fabrication cost and power consumption of transmitter devices. Ohshima et al. [21] developed an effective synchronization method for OCC with rolling shutter ISs. Their method enables very fast communication using a conventional rolling shutter IS embedded in a smartphone; however, simultaneous communication with multiple transmitters is difficult to implement with this synchronization method because it assumes that a transmitter is mapped onto a wide area of the IS. Mao et al. [22] proposed a method that can demodulate received signals in OCC without any synchronization devices even when the sampling and transmission periods are different. However, to estimate the background noise and the synchronization parameter, the method needs a training sequence. This additional noninformational sequence reduces the bit transmission efficiency. Taking these issues into account, we suggested a method that requires neither synchronization devices nor training sequences [23]. However, this simple method does not work well in environments with non-negligible background noise.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In this study, we developed a demodulation method for an OCC system that can estimate the background noise level and synchronization parameters without any synchronization devices or training sequences. To achieve this, we employed the variational Bayesian inference method for a Gaussian mixture model [24] for the channel estimation of the OCC system. By applying the inference method to received signals, we can estimate both the background noise level and the gain coefficient for demodulation. In addition, as in [23], the developed method does not require precise knowledge of the bit transition periods of the transmitters. This aspect is practically important because the periods are sometimes difficult to know in advance owing to the frequency tolerance of the clock generators in the receiver and transmitters.
This paper is organized as follows. In Section II, we describe notations, assumptions, and the modeling of the light signal at the receiver camera in an OCC system. Next, we provide a parameter estimation and demodulation procedure in Section III. In Sections IV and V, to confirm the efficiency of our demodulation method, we show the setups for and the results of numerical simulations and experiments in a real environment. In Section VI, we present our conclusions.
Note that Figs. 1 to 5 were originally presented in our conference paper [23]. This work is an expansion of that study.

II. OPTICAL CAMERA COMMUNICATION MODEL WITH ON-OFF KEYING
Before focusing on the demodulation problem, we describe the channel model discussed throughout this paper. In general, one can model an OCC system with a camera receiver and LED transmitters as a multiple-input and multiple-output (MIMO) communications system, where multiple pixels on an IS of the camera receive lights emitted from the multiple LED transmitters. However, we can simplify the model by stating some assumptions. First, we assume that there is a sufficiently large number of pixels in the IS compared to the number of transmitters in the field of view of the camera. In such a situation, the images of the different transmitters fall on different pixels in the IS of the camera receiver when the transmitters are almost in focus. Second, we assume that the camera operates in global shutter mode. Thus, all pixels on the IS are exposed over the same time range. This enables us to deal with the pixels indexed by $\Omega_l$, on which the image of the $l$-th transmitter falls, as one set of pixels (see Fig. 1). With these two assumptions, one can consider each transmitter independently and regard the OCC system as a collection of single-input and single-output (SISO) communication systems. This spatial separability allows the OCC system to accommodate multiple transmitters, and this type of multiplexing is called spatial-division multiplexing [25].
Here we describe a model of one SISO communication system in the OCC system. Let $y(t)$ be the instantaneous light power emitted from the transmitter at continuous time $t$, and let $t_0$, $T_{\mathrm{RX}}$, and $\tau$ be the time offset, sampling period, and exposure time of the camera, respectively. Then the received signal value $s[i]$, obtained by summing over a pixel set at discrete time index $i$, is described as

$$s[i] = h[i] \int_{t_0 + iT_{\mathrm{RX}}}^{t_0 + iT_{\mathrm{RX}} + \tau} y(t)\,dt + d[i] + n[i], \tag{1}$$

where $h[i]$ represents an attenuation coefficient between the transmitter and the corresponding pixel set, $d[i]$ denotes the background noise level, which corresponds to the signal value when none of the transmitters emit light, and $n[i]$ denotes the noise, including shot noise from ambient light and thermal noise at the pixel amplifier. While many sophisticated modulation methods, such as undersampled frequency shift on-off keying [19] and undersampled phase shift on-off keying [20], have been applied to OCC, here we assume simple on-off keying (OOK) as a basic example. In OOK, the instantaneous light power $y(t)$ is represented as

$$y(t) = A \sum_{j} x[j]\, g(t - jT_{\mathrm{TX}}), \tag{2}$$

where $A$, $x[j] \in \{0, 1\}$, and $T_{\mathrm{TX}}$ are the maximum optical power, the transmitted bit information, and the transmission period, respectively. The function $g(t)$ represents the pulse amplitude of the transmitter, and hereafter we assume that it is the rectangular function

$$g(t) = \begin{cases} 1, & 0 \le t < T_{\mathrm{TX}}, \\ 0, & \text{otherwise}. \end{cases} \tag{3}$$

Combining (2) and (3), we can rewrite (1) as

$$s[i] = R[i]\left(a[i]\,x[j_i] + b[i]\,x[j_i + 1]\right) + d[i] + n[i], \tag{4}$$

where $R[i] = A\tau h[i]$ is the gain coefficient and $j_i$ is the index of the first bit covered by the $i$-th exposure. The portion $a[i]\tau$ of the exposure time is devoted to obtaining the light signal of the first bit $x[j_i]$, and the remaining exposure time $b[i]\tau$, with $b[i] = 1 - a[i]$, is devoted to the second bit $x[j_i + 1]$.
The quality of demodulation in the OCC system using OOK depends on the synchronization status. Thus, the relationship between the periods and phases of exposure and bit transmission is important. First of all, as Fig. 3 shows, $T_{\mathrm{RX}} - \tau < T_{\mathrm{TX}}$ should hold to ensure that the camera receiver samples every transmission bit $x[j]$. If the periods and the phases are precisely locked, namely $T_{\mathrm{TX}} = T_{\mathrm{RX}}$ and $a[i] = 1$ hold, the OOK signals can be easily demodulated by, for example, thresholding the received signals $s[i]$ sequentially. In our application, however, the phases are not locked, while $T_{\mathrm{TX}}$ and $T_{\mathrm{RX}}$ are nominally the same. Thus, $a[i]$ and $b[i]$ gradually vary with time. We call this synchronization status the plesiochronous mode, as in [22].
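To make the received-signal model of this section concrete, the following minimal Python sketch computes which bits an exposure window covers and synthesizes received samples. It is only an illustration under stated assumptions: a rectangular pulse, $\tau \le T_{\mathrm{TX}}$ (so that an exposure covers at most two bits), constant $R$ and $d$, and the two-bit form $s[i] = R(a[i]x[j_i] + b[i]x[j_i+1]) + d + n[i]$ with $b[i] = 1 - a[i]$. The function names `overlap_fractions` and `simulate_received` are hypothetical.

```python
import math
import random


def overlap_fractions(i, t0, T_rx, T_tx, tau):
    """Return (j, a, b) for the i-th exposure window [start, start + tau).

    j is the index of the first bit covered by the window, a is the fraction
    of the exposure spent on bit j, and b = 1 - a is the fraction spent on
    bit j + 1. Assumes tau <= T_tx, so at most two bits are covered.
    """
    start = t0 + i * T_rx
    end = start + tau
    j = math.floor(start / T_tx)          # first bit overlapping the window
    boundary = (j + 1) * T_tx             # time at which bit j ends
    a = (min(end, boundary) - start) / tau
    return j, a, 1.0 - a


def simulate_received(bits, n_samples, t0, T_rx, T_tx, tau, R, d, sigma_n, seed=0):
    """Synthesize s[i] = R * (a * x[j] + b * x[j + 1]) + d + Gaussian noise."""
    rng = random.Random(seed)
    samples = []
    for i in range(n_samples):
        j, a, b = overlap_fractions(i, t0, T_rx, T_tx, tau)
        x_first = bits[j]
        x_second = bits[j + 1] if b > 0.0 else 0   # second bit unused when b == 0
        noise = rng.gauss(0.0, sigma_n)
        samples.append(R * (a * x_first + b * x_second) + d + noise)
    return samples
```

With `T_rx` slightly different from `T_tx`, the returned fractions drift slowly from sample to sample, which is the plesiochronous behavior described above.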

III. A FLEXIBLE DEMODULATION METHOD FOR PLESIOCHRONOUS OCC SYSTEM
In this section, we introduce a new demodulation method for the plesiochronous OCC system. In Section III-A, we provide a basic strategy to develop the demodulation method. After that, we describe algorithms to estimate the parameters in Sections III-B and III-C.

A. Strategy to Estimate Parameters
We clarify which parameters of the current signal model in (4) should be estimated to obtain the bit sequence. Under the assumption that the noise $n[i]$ is generated from a Gaussian distribution with zero mean and variance $\sigma_n^2$, the conditional probability density of the received signal $\mathbf{s} = (s[i], s[i+1], \ldots, s[i + l_p - 1])$ is

$$p(\mathbf{s} \mid \mathbf{x}) = \prod_{k \in \Omega} \frac{1}{\sqrt{2\pi\sigma_n^2}} \exp\left(-\frac{\left(s[k] - R[k]\left(a[k]\,x[j_k] + b[k]\,x[j_k + 1]\right) - d[k]\right)^2}{2\sigma_n^2}\right), \tag{5}$$

where $\Omega$ denotes the set of indices $i, i+1, \ldots, i + l_p - 1$. Given a received signal $\mathbf{s}$, we estimate the transmitted bit sequence by finding the $\mathbf{x}$ that minimizes the cost function

$$\Gamma(\mathbf{x}) = \sum_{k \in \Omega} \left(s[k] - R[k]\left(a[k]\,x[j_k] + b[k]\,x[j_k + 1]\right) - d[k]\right)^2. \tag{6}$$

Maximum-likelihood sequence detection (MLSD) implemented using the Viterbi algorithm allows us to perform this estimation [22], [26], [27]. To do this, we should estimate $d[i]$, $R[i]$, $a[i]$, and $b[i]$ in advance. Note that the computational complexity of the current Viterbi algorithm for solving the MLSD is linear in the size of $\Omega$. That is because the trellis diagram that the Viterbi algorithm solves has no loops, only correlations between two consecutive stages. In our application, the channel state between the camera receiver and the LED transmitters changes gradually. In such a case, $d[i]$ and $R[i]$ are supposed to be constant for a large number of consecutive frames, which allows us to use a large number of received signals to estimate the coefficients. Taking advantage of this, we introduce in Section III-B a method that utilizes statistical properties of the received signal, without explicitly determining the bit sequence, to estimate the channel state. We call this estimation method long-term processing.
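The MLSD step can be sketched as a two-state Viterbi search. The code below is a minimal illustration, not the authors' implementation: it assumes that $R$, $d$, and a single overlap fraction $a$ are already known and constant over the frame, that $j_{i+1} = j_i + 1$ holds for every sample, and that each sample obeys $s[i] \approx R(a\,x[i] + (1-a)\,x[i+1]) + d$. The function name `viterbi_mlsd` is hypothetical.

```python
def viterbi_mlsd(s, R, d, a):
    """Recover len(s) + 1 bits minimizing the sum of squared residuals.

    The trellis state is the value of the current bit; each branch (u -> v)
    costs (s[i] - R * (a * u + b * v) - d) ** 2 with b = 1 - a.
    Run time is linear in len(s) because there are only two states.
    """
    b = 1.0 - a
    cost = [0.0, 0.0]            # best path cost ending with bit value 0 or 1
    backpointers = []
    for si in s:
        new_cost = [0.0, 0.0]
        pointer = [0, 0]
        for v in (0, 1):         # candidate value of the next bit
            candidates = [cost[u] + (si - R * (a * u + b * v) - d) ** 2
                          for u in (0, 1)]
            best_u = 0 if candidates[0] <= candidates[1] else 1
            new_cost[v] = candidates[best_u]
            pointer[v] = best_u
        backpointers.append(pointer)
        cost = new_cost
    # Trace back from the cheaper final state.
    v = 0 if cost[0] <= cost[1] else 1
    bits = [v]
    for pointer in reversed(backpointers):
        v = pointer[v]
        bits.append(v)
    bits.reverse()
    return bits
```

Because each sample depends on only two consecutive bits, keeping one survivor path per bit value is sufficient, which is why the cost grows linearly with the frame length.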
In the plesiochronous mode, the coefficients $a[i]$ and $b[i]$ also change gradually, and we can use several received signals to estimate them. If the rate of the change is similar to that of the channel state, we might also estimate $a[i]$ and $b[i]$ by a statistical method. However, the clock differences between the receiver and transmitters, which determine the stability of $a[i]$ and $b[i]$, are not always sufficiently small to employ a statistical method. To combat this, in Section III-C, we employ suboptimal per-survivor processing (PSP) [28] to estimate the coefficients. We refer to this estimation method as short-term processing.
Fig. 4 shows a block diagram of our OCC system. The transmitters send bit sequences with OOK modulation, and the receiver extracts the signals from its image sensor. After that, the receiver synchronizes and demodulates the received signal with the channel state estimated by the long-term processing. Note that we only illustrate one of each unit in the receiver; however, in actual situations, we can introduce multiple units other than the image sensor to treat multiple transmitters.

B. Estimating Channel State by Variational Bayesian Inference Method
Here we consider how the receiver can estimate the background noise level $d[i]$ and the gain coefficient $R[i]$. Assuming that $d[i]$ and $R[i]$ are constant for frames with a long-term processing length $l_{\mathrm{long}}$, we can jointly utilize the received signals $\{s[i]\}_{i \in \Omega_{\mathrm{long}}}$ to estimate the coefficients, where $\Omega_{\mathrm{long}}$ is a set of consecutive received signal indices and $|\Omega_{\mathrm{long}}| = l_{\mathrm{long}}$. When $l_{\mathrm{long}}$ is sufficiently large, statistical properties of the received signals could help in the estimation.
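Let $s_{00}$, $s_{01}$, $s_{10}$, and $s_{11}$ denote the received signals grouped by the values of the two bits covered by the exposure, and let $\bar{a}$ and $\bar{b}$ denote the values of $a[i]$ and $b[i]$ representative of $\Omega_{\mathrm{long}}$. Although it might be difficult to approximate the distribution of $s_{01}$ and $s_{10}$ by a Gaussian mixture model (GMM) when $a[i]$ and $b[i]$ vary within $\Omega_{\mathrm{long}}$, in this paper we assume that the distribution of $\{s[i]\}_{i \in \Omega_{\mathrm{long}}}$ follows a linear superposition of Gaussian distributions with means $d$, $R + d$, $R\bar{a} + d$, and $R\bar{b} + d$. This is because the role of the GMM is not only to approximate the true generative distribution of the received signals but also to cluster them, which enables us to remove the negative effect of $s_{01}$ and $s_{10}$ when estimating $R + d$ and $d$ from $s_{11}$ and $s_{00}$, respectively.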
One can estimate $d$ and $R$ by determining the means of the Gaussian distributions. To do this, we apply the variational Bayesian inference algorithm for the Gaussian mixture model (VB-GMM) [24] to the received signals. In this inference algorithm, one aims to find approximations for the posterior distribution $p(Z, \pi, \mu, \Lambda \mid S)$ and the model evidence $p(S)$ by considering the joint distribution $p(S, Z, \pi, \mu, \Lambda)$, where $S = \{s_n\}_{n=1}^{N}$, $Z = \{z_n\}_{n=1}^{N}$, $\pi = \{\pi_k\}_{k=1}^{K}$, $\mu = \{\mu_k\}_{k=1}^{K}$, and $\Lambda = \{\Lambda_k\}_{k=1}^{K}$ are the set of observed variables, the set of latent variables, the mixture coefficients, the set of means, and the set of inverses of variances of the $K$-component Gaussian distribution, respectively. All of the variables are in $\mathbb{R}^{1 \times 1}$. In our case, the observed variables are the values of the received signals minus their mean. The latent variables have no physical meaning, but they are used for convenience. The mixture coefficients $\pi$ are expected to represent the ratios of $s_{00}$, $s_{01}$, $s_{10}$, and $s_{11}$ in the received signals, and $\mu$ and $\Lambda$ are expected to be the means and inverses of variances of $s_{00}$, $s_{01}$, $s_{10}$, and $s_{11}$, respectively. We can decompose the log marginal probability as

$$\ln p(S) = \mathcal{L}(q) + \mathrm{KL}(q \,\|\, p), \tag{7}$$

where we defined

$$\mathcal{L}(q) = \sum_{Z} \iiint q(Z, \pi, \mu, \Lambda) \ln\left\{\frac{p(S, Z, \pi, \mu, \Lambda)}{q(Z, \pi, \mu, \Lambda)}\right\} d\pi\, d\mu\, d\Lambda, \tag{8}$$

$$\mathrm{KL}(q \,\|\, p) = -\sum_{Z} \iiint q(Z, \pi, \mu, \Lambda) \ln\left\{\frac{p(Z, \pi, \mu, \Lambda \mid S)}{q(Z, \pi, \mu, \Lambda)}\right\} d\pi\, d\mu\, d\Lambda, \tag{9}$$

and $q(Z, \pi, \mu, \Lambda)$ is the inferred probability distribution. Instead of directly minimizing the KL divergence in (9), we maximize the lower bound (8) with respect to the restricted family of distributions $q(Z, \pi, \mu, \Lambda)$ in a step-by-step manner to obtain estimates of the parameters $\pi$, $\mu$, and $\Lambda$. By applying the VB-GMM algorithm to the received signals $\{s[i]\}_{i \in \Omega_{\mathrm{long}}}$, we can estimate the means $d$, $R + d$, $R\bar{a} + d$, and $R\bar{b} + d$.
To estimate the parameters precisely, we utilize properties of $\bar{a}$ and $\bar{b}$ and determine the number of components $K$ in the generative distribution of $\{s[i]\}_{i \in \Omega_{\mathrm{long}}}$. There are three cases. First, when the values of $\bar{a}$ and $\bar{b}$ are different, there are four Gaussian components whose means are $d$, $R + d$, $R\bar{a} + d$, and $R\bar{b} + d$. Second, when $\bar{a}$ and $\bar{b}$ take nominally the same nonzero value, three Gaussian components are enough to express the generative distribution. Third, when $\bar{a} = \bar{b} = 0$ approximately holds, we need only two Gaussian components to fit the generative distribution. Unfortunately, we do not know the values of $\bar{a}$ and $\bar{b}$ in advance of the estimation and cannot determine the number of components without additional information. To combat this, we evaluate the lower bound (8) with an additional factor, $\mathcal{L}(q^*) + \ln K!$ [24], for the $K = 2, 3, 4$ cases, where $q^*$ represents the inference probability estimated by VB-GMM, and we adopt the number of components that minimizes this value. Then we take the minimum mean as the estimate $\hat{d}$ of the background noise level and the difference between the maximum and minimum means as the estimate $\hat{R}$ of the gain coefficient. We summarize the estimation algorithm in Algorithm III.1, where $\psi(\cdot)$ is the digamma function.
Here we explain the computational cost of VB-GMM. As described in [29], the computational complexity of one iteration of the VB-GMM is $O(lKD_{\mathrm{data}}^2 + KD_{\mathrm{data}}^3)$, where $l$, $K$, and $D_{\mathrm{data}}$ are the number of input data, the number of Gaussian components, and the dimension of the data vector, respectively. In the current case, this becomes $O(l_{\mathrm{long}}K + K)$, and thus the number of input data $l_{\mathrm{long}}$ is the leading factor. Furthermore, we should be careful about the number of iterations needed for the estimation to converge. When these factors are considered, a concern might be that the computational time of the estimation could be too long for practical applications. In Section V, we review the computational time of the whole estimation process to dispel this concern.
Algorithm III.1
    ...
    Set ...
    while the convergence criterion is not satisfied do
        for k = 1, ..., K do
            $\ln\tilde{\Lambda}_k = \psi(\nu_k/2) + \ln 2 + \ln|W_k|$
            ...
        end for
        for n = 1, 2, ..., N do
            $r_{nk} = \tilde{\pi}_k \tilde{\Lambda}_k \cdots$
            ...
        end for
        ...
        Estimate the parameters with ...
        Sort the index k so that $\mu_k$ is in ascending order
    end while
    Evaluate the lower bound
    Store the estimates as $\hat{\mu}_k(K) = m_k$ for every k
    Store the lower bound with additional factor $\mathcal{L}(q^*) + \ln K!$
end for
Determine $\hat{K}$ corresponding to the minimum of the stored lower bound with additional factor
Output the estimates as $\hat{d} = \hat{\mu}_1(\hat{K})$ and $\hat{R} = \hat{\mu}_{\hat{K}}(\hat{K}) - \hat{\mu}_1(\hat{K})$
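The idea of reading $\hat{d}$ and $\hat{R}$ from the extreme component means can be illustrated with a short sketch. Note that the code below deliberately swaps in plain maximum-likelihood EM for a one-dimensional Gaussian mixture with a fixed number of components; it is not the VB-GMM of [24], has no priors, and does not perform the lower-bound-based selection of $K$. The function names `fit_gmm_1d` and `estimate_channel`, the quantile initialization, and the variance floor are all assumptions made for illustration.

```python
import math


def fit_gmm_1d(samples, K, n_iter=50):
    """Plain EM for a 1-D Gaussian mixture (a stand-in for VB-GMM).

    Returns (weights, means, variances). Means are initialized at evenly
    spaced quantiles of the data.
    """
    xs = sorted(samples)
    n = len(xs)
    means = [xs[int((k + 0.5) * n / K)] for k in range(K)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    variances = [max(var_all / K ** 2, 1e-6)] * K
    weights = [1.0 / K] * K
    for _ in range(n_iter):
        nk = [0.0] * K
        sx = [0.0] * K
        sxx = [0.0] * K
        for x in xs:
            # E-step: responsibilities via log-sum-exp for numerical stability.
            logp = [math.log(weights[k])
                    - 0.5 * math.log(2.0 * math.pi * variances[k])
                    - (x - means[k]) ** 2 / (2.0 * variances[k])
                    for k in range(K)]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            z = sum(p)
            for k in range(K):
                r = p[k] / z
                nk[k] += r
                sx[k] += r * x
                sxx[k] += r * x * x
        # M-step: update each component that still owns some data.
        for k in range(K):
            if nk[k] < 1e-9:
                continue
            weights[k] = nk[k] / n
            means[k] = sx[k] / nk[k]
            variances[k] = max(sxx[k] / nk[k] - means[k] ** 2, 1e-6)
    return weights, means, variances


def estimate_channel(samples, K=4):
    """Return (d_hat, R_hat): smallest mean and spread between extreme means."""
    _, means, _ = fit_gmm_1d(samples, K)
    d_hat = min(means)
    return d_hat, max(means) - d_hat
```

The clustering effect is what matters here: samples that mix two different bits are absorbed by the intermediate components, so they do not bias the two extreme means.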

C. Estimating the Synchronization Parameter and Bit Sequence
Here we consider how the receiver can estimate the synchronization parameters $a[i]$ and $b[i]$ and the bit sequence $x[i]$. To do this, we first formulate a cost function to be minimized for the estimation, in particular for the case in which $T_{\mathrm{RX}} < T_{\mathrm{TX}}$ holds. Since we assume that the relative difference between the periods is small, we can expect that $a[i]$ and $b[i] = 1 - a[i]$ vary gradually, as Fig. 5 shows. Let $\Omega_{\mathrm{short}}$ be a range of time indices in which the variation is negligibly small, and let $a_c$ be the average value of $\{a[i]\}_{i \in \Omega_{\mathrm{short}}}$. Then we obtain the cost function $\Gamma_+$ in (10), where $0 \le a_c < 1$ denotes the fraction of the exposure time spent to obtain the light signal of the first bit, and $\hat{R}$ and $\hat{d}$ are the estimated values of $R$ and $d$, respectively. When $a_c = 0$ holds, the sampling timing is nominally synchronous with the bit transmission timing in the range $\Omega_{\mathrm{short}}$, while they are not in the synchronization state when $a_c > 0$.
It is not easy to exactly minimize the cost function to estimate the above parameters. Therefore, we minimize it suboptimally in a stepwise manner instead. First, we estimate $\mathbf{x}$ by the MLSD method with a tentative parameter $a_c$. After that, $a_c$ is estimated in the PSP manner: we solve $d\Gamma_+/da_c = 0$ using the tentative bit sequence $\hat{\mathbf{x}}$ and obtain the closed-form estimate (11). In practice, we initially use 0 as the tentative $a_c$, or try various candidate values and choose the one that minimizes (10), as in [22]. Usually, we apply the above procedure iteratively to improve the estimation quality.
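The second step of this stepwise minimization amounts to a one-parameter least-squares fit. The following sketch shows one way it could look; it does not reproduce the paper's own expression. It assumes a squared-error cost in which each sample is modeled as $\hat{R}\,(a_c\,u + (1 - a_c)\,v) + \hat{d}$, where $u$ and $v$ are the tentative first and second bits of that sample, and it clips the result to $[0, 1]$. The function name `estimate_ac` is hypothetical.

```python
def estimate_ac(s, first_bits, second_bits, R_hat, d_hat):
    """Least-squares estimate of the overlap fraction for tentative bits.

    Setting the derivative of sum((s - R*(a*u + (1-a)*v) - d)^2) with respect
    to a to zero gives a = sum((u-v)*(s-d-R*v)) / (R * sum((u-v)^2)).
    Returns None when every sample has u == v, because such samples carry
    no information about the fraction.
    """
    num = 0.0
    den = 0.0
    for si, u, v in zip(s, first_bits, second_bits):
        diff = u - v
        num += diff * (si - d_hat - R_hat * v)
        den += diff * diff
    if den == 0.0:
        return None
    return min(1.0, max(0.0, num / (R_hat * den)))
```

Only samples whose two bits differ contribute to the estimate, so a frame dominated by runs of identical bits yields a poorly conditioned fit.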
When $T_{\mathrm{RX}} \ge T_{\mathrm{TX}}$ holds, we consider the cost function $\Gamma_-$ in (12). In this case, $b[i]$, and thus $a_c$, gradually increases in the range of 0 to 1, as Fig. 5 shows. As in the $T_{\mathrm{RX}} < T_{\mathrm{TX}}$ case, we can estimate the synchronization parameter $a_c$ by (13) and the transmitted bit sequence $\mathbf{x}$ by individually minimizing the cost function for each of them.
Thanks to the small relative difference between the two periods, one can expect that $j_{i+1} = j_i + 1$ holds in most cases; however, the fraction $a_c$ continues to increase and sometimes reaches one, at which point the equation no longer holds. In such a case, we have to reset $a_c$ to 0 and substitute the index $j_i \leftarrow j_i - 1$ in the $T_{\mathrm{RX}} < T_{\mathrm{TX}}$ case and $j_i \leftarrow j_i + 1$ in the opposite case (see Fig. 5) to maintain bit synchronization. In practice, one cannot expect $a_c$ to be exactly one; hence, we introduce a threshold $a_{\mathrm{th}}$ and execute the above reset procedure when the estimated fraction $a_c$ exceeds it.
We summarize the above procedure in Algorithm III.2. Note that we do not know in advance whether $T_{\mathrm{RX}}$ is larger or smaller than $T_{\mathrm{TX}}$. In addition, when $a_c$ is 0, $\Gamma_+$ and $\Gamma_-$ are equal. Therefore, we need to execute the algorithm for both the $T_{\mathrm{RX}} < T_{\mathrm{TX}}$ case and the opposite case over a sufficiently large number of frames $N_{\mathrm{det}}$, in which there might be frames with nonzero $a_c$. After that, we choose the case in which the sum of the values of the minimized cost functions is smaller.
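The two bookkeeping steps described above, the threshold-based reset and the choice between the two period hypotheses, can be sketched as follows. This is a minimal illustration only; the function names `reset_if_needed` and `choose_case` and the string labels are hypothetical, and the per-frame minimized costs are assumed to be supplied by the caller.

```python
def reset_if_needed(a_c, j, rx_period_shorter, a_th=0.9):
    """Reset the fraction and shift the bit index once a_c exceeds a_th.

    rx_period_shorter is True for the T_RX < T_TX hypothesis, in which the
    bit index is decremented; otherwise it is incremented.
    Returns the (possibly updated) pair (a_c, j).
    """
    if a_c > a_th:
        j = j - 1 if rx_period_shorter else j + 1
        return 0.0, j
    return a_c, j


def choose_case(costs_plus, costs_minus):
    """Pick the hypothesis whose summed minimized cost over the frames is smaller."""
    if sum(costs_plus) < sum(costs_minus):
        return "T_RX < T_TX"
    return "T_RX >= T_TX"
```

A caller would run the demodulation under both hypotheses for the detection frames, collect the minimized costs of each, and keep the hypothesis returned by `choose_case` afterwards.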

IV. EXPERIMENTAL SETUPS
We conducted numerical simulations and experiments in a real environment. In both setups, the receiver does not know the gain coefficients, background noise magnitudes, or accurate transmission clock in advance. Our purpose is to verify successful estimation of the unknown parameters by our method and the method's demodulation performance in such a situation.

A. Numerical Simulations
In numerical simulations, we investigated the performance of our demodulation method under different signal-to-noise ratios (SNRs) and parameters. The camera receiver sampling period is $T_{\mathrm{RX}}$, and the exposure time is half of the sampling period. The transmitter sends pseudorandom bit sequences in periods of $0.9995T_{\mathrm{RX}}$, $0.9999T_{\mathrm{RX}}$, $1.0001T_{\mathrm{RX}}$, and $1.0005T_{\mathrm{RX}}$; thus, the relative differences in the clock generators are −500, −100, +100, and +500 ppm. These relative differences are within the frequency tolerance of conventional crystal oscillator units [30]. The frame size for estimating the linear combination coefficients and the bit sequence, i.e., the size of the set $\Omega_{\mathrm{short}}$, is 60, which includes 20 additional overlapping samples. The frame size for the channel estimation, i.e., the size of the set $\Omega_{\mathrm{long}}$, is 800. We set the threshold $a_{\mathrm{th}}$ to 0.9, the number of frames used to determine whether $T_{\mathrm{TX}} > T_{\mathrm{RX}}$ to $N_{\mathrm{det}} = 1000$, and the number of iterations to minimize the cost functions (10) or (12) to $N_{\mathrm{iter}} = 2$. $R$ and $d$ vary relatively, with the ratio generated by zero-mean Gaussian distributions with variance $\sigma_c = 10^{-6}$. The noise is generated by a Gaussian distribution with zero mean and variance $\sigma_n^2$, and the SNRs of the received signals range from 16 to 20 dB. Each simulation uses a bit sequence of 1 Mbit, and we performed 1000 simulations for each condition. Note that only $T_{\mathrm{RX}}$ and $\tau = 0.5T_{\mathrm{RX}}$ are known in advance. In addition, to confirm the potential demodulation performance for each condition, we also employed an oracle estimator. The oracle estimator knows the exact values of the gain factor $R$, the background noise level $d$, and the fraction coefficient $a_c$, and estimates the bit sequences by the MLSD method with these parameters. In our problem, one can obtain the global minimum solution by the MLSD method because the only correlation to be resolved is that between two consecutive stages of the trellis diagram. Therefore, the estimated bit sequences are expected to be more accurate than those obtained by any other demodulation method. The parameters used in the numerical simulations are summarized in Table I.
To confirm the robustness of our method against the parameter settings for demodulation, we also conducted numerical simulations with various processing lengths. We employed the short-term processing lengths $l_{\mathrm{short}}$ = 30, 40, 60, 100, 180, 340, 660, and 1300, which include 20 additional overlapping samples, to estimate the bit sequences and $a_c$. Because the relative clock difference should affect how the demodulation performance depends on the short-term processing length, we conducted the simulations for the +100 and +500 ppm cases. We also performed numerical simulations for the long-term processing lengths $l_{\mathrm{long}}$ = 100, 200, 400, 800, 1600, 3200, 6400, and 12800. In the simulations for long-term processing lengths, we set the ratio of variation of $R$ and $d$ to $\sigma_c = 10^{-4}$, $10^{-6}$, and $10^{-8}$ to confirm the relationship between the channel state variation and the long-term processing length. We set the SNR to 16 dB and the number of trials to 100 for each condition; all parameters other than those mentioned here are the same as in Table I.
Since our method uses an iterative technique, the computation time required to achieve sufficient accuracy is unknown and may differ with the SNR. We therefore conducted a test to show that our method can process data in a practical period of time with satisfactory accuracy. The number of iterations for the VB-GMM was limited to 100 in one condition and 10,000 in the other. The other parameter settings were the same as in Table I. We performed the experiment 100 times for each condition.

B. Real Environment Experiment
We conducted experiments to validate our demodulation method in practical situations. As shown in Fig. 6, we conducted tests with a high-speed camera as a receiver and a single LED as a transmitter at various distances. The LED transmitter transmitted pseudorandom bit sequences with OOK modulation at about 10 kilobits per second (kbps). The high-speed camera receiver operated at 10,000 frames per second with an exposure time of $0.5 \times 10^{-4}$ s and sent each captured image to a field-programmable gate array (FPGA) for further processing. The FPGA extracted the values of the pixels on which the image of the LED transmitter falls, as in Fig. 1, and saved the sum of the values in its local storage. After that, we applied our demodulation method to obtain the transmitted bit sequences and evaluated their bit error rate (BER). For comparative purposes, we employed a conventional demodulation method [23] that cannot estimate the background noise level. The distance between the transmitter and receiver in the experiment was set from 5 m to 20 m in 2.5 m increments. In order to show the effectiveness of our demodulation method in a more practical situation, we also conducted experiments in which the lens f-value was varied from 0.8 to 4 to change the effective luminance of the transmitter. The parameter settings for the demodulation and the equipment used in the experiment are shown in Table II. The processing lengths and other parameter settings for demodulation were the same as in the numerical simulations of performance versus SNR.
Fig. 6. Photograph of the experimental setup. The signal from the LED transmitter is observed with a high-speed camera receiver. The observed signals are pre-processed by an FPGA and sent to the PC for storage.
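For reference, the BER evaluation mentioned above reduces to counting mismatched positions between the transmitted and demodulated sequences. The helper below is a minimal sketch; the function name `bit_error_rate` is hypothetical, and the two sequences are assumed to be already aligned and of equal length.

```python
def bit_error_rate(sent, received):
    """Fraction of positions at which two aligned bit sequences differ."""
    if len(sent) != len(received) or len(sent) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    errors = sum(1 for u, v in zip(sent, received) if u != v)
    return errors / len(sent)
```

In a plesiochronous setting the alignment itself depends on the bit-index bookkeeping of the demodulator, so a lost reset shows up as a BER close to 0.5 rather than as isolated errors.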

V. RESULTS AND DISCUSSIONS
A. Numerical Experiment
Fig. 7 shows the BERs versus SNR of our demodulation method and the oracle estimator. As the SNR increases, the bit sequences demodulated by both methods become more accurate. The BERs of the oracle estimator are lower than those of our method, as expected, but they are almost on the same order.
Fig. 8 shows the BERs versus relative clock differences for our demodulation method and the oracle estimator at SNR = 16 dB. A large relative clock difference decreases the estimation accuracy of our method because it makes the estimation of $a_c$ difficult. On the other hand, the BERs of the oracle estimator are not sensitive to changes in the clock difference, simply because it knows the exact values of the parameters. Note that, whether or not the sampling period $T_{\mathrm{RX}}$ is longer than the transmission period $T_{\mathrm{TX}}$, the BERs stay in the low range, although the signals in the $T_{\mathrm{RX}} > T_{\mathrm{TX}}$ cases resemble an undersampling situation and should be more difficult to demodulate than those in the opposite case.
Fig. 9(a) shows the BERs of our demodulation method for various short-term processing lengths. The BERs of the demodulated bit sequences do not vary dramatically with the short-term processing length for $l_{\mathrm{short}}$ = 20, 40, and 80 in both the +100 and +500 ppm cases. When the short-term processing length is too short, estimating the synchronization parameter is expected to be difficult. In estimating $a_c$ based on (11) or (13), the bit sequence should contain 0 and 1 in a well-balanced ratio because a biased bit sequence degrades the estimation accuracy. Unfortunately, in the $l_{\mathrm{short}}$ = 10 case, the bit sequence used for the estimation is likely to be biased, and the estimated parameter has a large error. This error causes the reset of the synchronization parameter to zero to fail, and thus the bit index synchronization is lost. In the +500 ppm case, the BERs become nearly 0.5 when the short-term processing length exceeds 160. In these situations, the large change in the true synchronization parameter within such a long processing frame decreases its estimation accuracy, which also causes the reset of the synchronization parameter to fail.
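A rough back-of-the-envelope calculation helps to see why long short-term frames become problematic at large clock differences. The sketch below assumes that, while an exposure straddles a bit boundary, the boundary slips by $|T_{\mathrm{TX}} - T_{\mathrm{RX}}|$ per sample, so the overlap fraction changes by $|T_{\mathrm{TX}} - T_{\mathrm{RX}}|/\tau$ per sample, with $\tau = 0.5\,T_{\mathrm{RX}}$ as in the simulations. The function name `ac_drift` is hypothetical, and the estimate ignores noise and the intervals in which the exposure lies within a single bit.

```python
def ac_drift(ppm, n_frames, exposure_ratio=0.5):
    """Approximate change of the overlap fraction over n_frames samples.

    ppm is the relative clock difference in parts per million and
    exposure_ratio is tau / T_RX. The result is in units of the fraction
    itself, so a value near 1 means the fraction sweeps its whole range.
    """
    per_sample = abs(ppm) * 1e-6 / exposure_ratio
    return per_sample * n_frames


if __name__ == "__main__":
    for ppm in (100, 500):
        for frames in (20, 80, 160, 640, 1280):
            print(ppm, frames, round(ac_drift(ppm, frames), 4))
```

Under these assumptions, the fraction treated as constant within a frame moves by a noticeable portion of its range once the frame length reaches a few hundred samples at 500 ppm, whereas at 100 ppm the same drift is reached only with frames five times longer.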
The BERs of demodulated bit sequences are stable for longterm processing lengths except for extreme cases as Fig. 9(b) shows. When l long is too small, the statistical approach algorithm III.1 could fail because of a lack of samples for estimating the parameters. The estimation accuracy could also decrease when l long is too long compared to the channel state variation cases. This is because the variations of R[i] and d [i] in such a large number of frames are large and thus the estimationsR andd are not accurate enough to estimate the synchronization parameter and demodulate the bit sequences. When the channel state variation is large, as in the σ s = 10 −4 case, our demodulation method does not work well for the above reasons. Fig. 10 shows the experimental computation time of our demodulation method. We applied the VB-GMM with the number of iterations limited to 100 and with that limited to 10,000 for the same input signal. The left and right axes represent the BER and computation time, respectively. The orange line indicates the upper limit of the computation time allowed for a 10 kbps signal. The limit of 10,000 iterations is meant to prevent the program from ever ending, so the VB-GMM algorithm is expected to converge in this condition. When we execute the VB-GMM until the algorithm converges, it is not able to process in real time in some situations. On the other hand, this problem can be avoided by limiting the number of iterations to 100. These two BER results are almost identical. Therefore, we can say that our method is practically feasible for transmissions of about 10 kbps.  To show that VB-GMM accurately estimates the distribution of received signals, we plot the probability density functions of estimated results and the received signals in the Fig. 11 as examples. As described in Section III-B, the purpose of VB-GMM estimation is to identify the gain coefficient and the background noise. 
Thus, it is not necessary to accurately estimate the distribution of received signals classified as s 01 and s 10 . The figure shows that VB-GMM correctly clusters s 01 and s 10 and effectively identifies the gain coefficient and the background noise from the estimated distributions of s 11 and s 00 . Fig. 12 shows the results of the real environment experiment. Note that the plot in Fig. 12 is in a logarithmic scale, and the results plotted on 10 −7 mean BERs = 0. Our method demodulated the bit sequence without any error within a 15 m distance and with low BERs in the 17.5 m and 20 m conditions. The conventional method [23] also demodulated the bit sequences accurately in up to 10 m conditions, while its performance degraded significantly at distances of more than 15 m. In such long-distance conditions, the gain coefficient becomes substantially low, so the effect of the background noise is not negligible for the demodulation. Fig. 13 shows the computational times of our method for the real-setup experiment. Within 20 m, our method can demodulate the received signals in a real-time manner.

B. Real Environment Experiment
To check whether our demodulation method works effectively, Fig. 14 shows part of the received signal and the demodulation results for the 5 m and 20 m cases. Because the background noise level of the received signal is relatively low at 5 m, both the present method and our previous one in [23] are able to estimate the coefficients. However, because the background noise is relatively large at 20 m, the conventional method, which lacks an accurate way to estimate the background noise level, cannot demodulate the received signal. The present demodulation method performed demodulation stably in all cases. Note that the normalized received signals and the estimated background noise levels in the conventional method are obtained by taking the minimum value of the received signals in the current long-term processing frame. Fig. 15 shows the probability density functions of the received signals and those estimated by the VB-GMM method. The algorithm estimated the distributions of the received signal even when the distance was long and thus the relative noise level was high. Fig. 16 shows the received signals and the coefficients estimated by our method when the f-value of the lens is varied. Our demodulation method stably performs the channel estimation and synchronization even though the demodulation becomes more difficult because of the luminance change caused by continuously changing the f-value. This implies that it is possible to demodulate signals received from a transmitter whose distance from the receiver varies.
Fig. 11. Histograms of the received signals, normalized as probability density functions, and the probability density functions estimated by VB-GMM. Relative clock differences are +100 and +500 ppm.
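The difference between taking the frame minimum and fitting a mixture to the received samples can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the method of this paper: it uses plain EM rather than variational Bayes, it models only the two steady levels (off and on) and ignores the transition clusters, and the signal model, function names, and parameter values are all hypothetical.

```python
import math
import random


def simulate_frame(n, gain, background, sigma, seed=0):
    """Synthetic OOK samples: background + gain * bit + Gaussian noise (toy model)."""
    rng = random.Random(seed)
    return [background + gain * rng.randint(0, 1) + rng.gauss(0.0, sigma)
            for _ in range(n)]


def fit_two_component_gmm(samples, iterations=60):
    """Plain EM for a 1-D two-component Gaussian mixture (not variational Bayes)."""
    lo, hi = min(samples), max(samples)
    means = [lo, hi]  # start from the extreme samples
    variances = [((hi - lo) / 4) ** 2 + 1e-12] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            p = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p] if total > 0 else [0.5, 0.5])
        # M-step: re-estimate the weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, samples)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, samples)) / nk
            variances[k] = max(var, 1e-12)
            weights[k] = nk / len(samples)
    return means, variances, weights


if __name__ == "__main__":
    # Hypothetical low-gain frame, loosely mimicking a long-distance link.
    frame = simulate_frame(n=1000, gain=0.10, background=0.30, sigma=0.02)
    means, _, _ = fit_two_component_gmm(frame)
    print("minimum-based background:", min(frame))
    print("mixture-based background:", min(means))
    print("mixture-based gain:", max(means) - min(means))
```

In this toy setting the frame minimum tracks the extreme of the noise rather than its mean, so it tends to fall below the true off level, whereas the mean of the lower mixture component averages the noise out. The bias of the minimum matters most when the gain is small compared with the noise.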

C. Short Discussion
The above results validate that our method successfully demodulates received signals in practical situations. Thus, even if only the sampling period and the exposure time are known in advance, our method can estimate the gain coefficient, the background noise level, and the synchronization parameter with enough accuracy to demodulate the bit sequences. Moreover, our demodulation method is sufficiently robust against parameter settings. For a wide range of short-term processing lengths, it can estimate synchronization parameters and bit sequences with almost the same accuracy when the relative clock difference between a receiver and a transmitter is within the ordinary frequency tolerance of conventional crystal oscillator units. Except in extreme cases, the channel-state estimation by VB-GMM works accurately over a range of long-term processing lengths. In the real-environment experiment, our demodulation method showed its superiority to the conventional method, especially when the communication distance is long. In such situations, the estimation of the background noise becomes important for demodulation because the gain coefficient is low compared to the noise.
Received signals and estimated gain coefficients and background noise level. We also plot normalized received signals and the estimated fraction coefficients $a_c$. We applied two methods: the one in our previous work in [23] and the method in the present work.

VI. CONCLUSION
OCC systems with distributed transmitter nodes are expected to be used in sensor networks for acoustic measurement, biosignal monitoring, and other applications. The sensors in these networks should be fabricated as simply as possible to suppress costs and power consumption. For this purpose, in this paper, we considered an OCC system without any synchronization devices in the transmitters. In such a case, the sampling period at the receiver and the bit transmission period generally differ from each other, and the received light signal sometimes becomes a linear combination of two adjacent transmitted bits. To demodulate the bit sequence from the light signals, we introduced a received signal model and a cost function to be minimized. To estimate unknown channel states, we employed the VB-GMM method to infer the generative probability of the received signals. The demodulation procedure also uses a maximum-likelihood sequence detection method, which can be implemented by the Viterbi algorithm, in combination with suboptimal per-survivor processing to estimate the bit sequence and the synchronization parameter by minimizing the cost function. Our new method can demodulate the signal even when the receiver is not given the precise transmission period, which the conventional method must know in advance. To confirm the efficiency of our method, we conducted numerical simulations and compared the results with those obtained by an oracle estimator that knows the parameters other than the bit sequence in advance. The accuracies of the bit sequences demodulated by our method are of almost the same order as those estimated by the oracle estimator. We also tested the demodulation method in a practical environment and showed its superiority over the conventional method, which cannot estimate the channel state precisely when the background noise is relatively large.
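The sequence-detection step summarized above can be illustrated with a minimal Python sketch of the Viterbi algorithm on a two-state trellis. It is a simplification under stated assumptions rather than the procedure of this paper: the sample model $r[i] = R\,(a\,b[i-1] + (1-a)\,b[i]) + d$ is assumed for illustration, the mixing fraction `fraction` is held fixed and known instead of being tracked by per-survivor processing, and `gain` and `background` are supplied directly instead of being estimated by VB-GMM. All names are hypothetical.

```python
import random


def expected_level(prev_bit, cur_bit, gain, background, fraction):
    """Assumed model: each sample mixes the previous and the current bit."""
    return gain * (fraction * prev_bit + (1.0 - fraction) * cur_bit) + background


def viterbi_ook(received, gain, background, fraction):
    """Bit sequence minimizing the summed squared error under the assumed model."""
    cost = [0.0, 0.0]  # best cumulative cost of a path ending in bit 0 / bit 1
    paths = [[], []]   # surviving bit sequence per end state (copied for clarity)
    for r in received:
        new_cost, new_paths = [], []
        for cur in (0, 1):
            # Choose the better of the two predecessors for this end state.
            candidates = []
            for prev in (0, 1):
                err = r - expected_level(prev, cur, gain, background, fraction)
                candidates.append((cost[prev] + err ** 2, prev))
            best_cost, best_prev = min(candidates)
            new_cost.append(best_cost)
            new_paths.append(paths[best_prev] + [cur])
        cost, paths = new_cost, new_paths
    return paths[0] if cost[0] <= cost[1] else paths[1]


def simulate(bits, gain, background, fraction, sigma, seed=0):
    """Generate noisy samples from the assumed model, with a leading 0 bit."""
    rng = random.Random(seed)
    prev, samples = 0, []
    for b in bits:
        level = expected_level(prev, b, gain, background, fraction)
        samples.append(level + rng.gauss(0.0, sigma))
        prev = b
    return samples


if __name__ == "__main__":
    rng = random.Random(1)
    bits = [rng.randint(0, 1) for _ in range(200)]
    rx = simulate(bits, gain=1.0, background=0.2, fraction=0.4, sigma=0.02, seed=2)
    decoded = viterbi_ook(rx, gain=1.0, background=0.2, fraction=0.4)
    print("bit errors:", sum(x != y for x, y in zip(bits, decoded)))
```

The trellis state is the most recent bit, because the assumed sample depends only on the previous and current bits. Estimating the mixing fraction jointly with the bits, as the paper does, would require keeping a separate estimate per surviving path.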
The results of this study suggest directions for further research. While we assumed a SISO channel in this study, we expect that our demodulation method could find a wider range of applications if the channel is remodeled as MIMO, for which one should consider spatial correlations. In addition, although our demodulation method can run in real time at about 10 kbps, we aim to develop simpler methods for faster communication.