Introduction
Visible light communication (VLC) systems are rapidly gaining research interest amid the scarcity of RF spectrum. This technology complements RF-based wireless communication systems [1]. VLC uses a light-emitting diode (LED) or laser diode (LD) at the transmitter (TX) and a photodetector at the receiver (RX). An advantage of visible light is that it can serve illumination and wireless data transmission simultaneously [2]. VLC is also suited to underwater communication because light is attenuated less than RF signals in water. Underwater visible light communication (UVLC) can therefore complement traditional acoustic communication when high data rates are needed [3]. Furthermore, VLC can be combined with RF to form a hybrid RF/VLC system [4] or a hybrid RF/underwater optical wireless communication (UOWC) system [5].
A. Research Problem
Research in Visible Light Communication (VLC) encompasses various technical challenges. One significant challenge involves developing LED devices, such as micro-LEDs [6], [7], [8], with bandwidths exceeding 1 GHz. Another critical area of investigation is achieving data rates of 10–20 Gbps [9], [10], [11] through advanced optical wireless communication models, algorithms, and experimental setups. However, these performance metrics are typically obtained under controlled laboratory conditions using high-end equipment, such as waveform generators and oscilloscopes, with multi-giga sample per second (GSPS) capabilities, rendering them impractical for commercial applications.
To bridge this gap, our research focuses on developing a VLC prototype device capable of real-time communication, aligning closer to practical deployment. A fundamental requirement for such a prototype is the integration of TCP/IP support to ensure seamless networking capabilities.
Real-time VLC prototypes can be divided into two categories: low-speed and high-speed. Low-speed devices suit applications that exchange small amounts of data, such as the Internet of Things (IoT). High-speed devices suit applications that require large amounts of data, such as video streaming. The state-of-the-art for the VLC platform is described in [12]. In this paper, we extend that summary with additional references, as shown in Table 1. The table lists only VLC platforms that support TCP/IP networking, because TCP/IP support is an important capability for a final product.
Existing VLC platforms can be classified based on the processor used:
General purpose processor (GPP)-based platforms: OpenVLC 1.4 [12] and DenseVLC [16] use a single-board computer (SBC), while EnLighting [18] uses a micro-controller.
Field programmable gate array (FPGA)-based platforms: References [15] and [17] use the Xilinx Zynq 7000 programmable system-on-chip (SoC), which contains an FPGA. Many other works use FPGAs, but we do not include them because the majority do not support TCP/IP.
COTS Wi-Fi module-based platforms: WiFi-over-VLC (WoV) [13] and MIMO WoV [14] use Intel’s commercial-off-the-shelf (COTS) Wi-Fi application specific integrated circuit (ASIC). We classify these references as semi-commercial products because they use a COTS chip dedicated to wireless communication.
Commercial platforms: Some commercial products, such as Trulifi 6002 [19], LiFiMax [20], and LiFi-XC [21] achieve a data rate of tens to hundreds of Mb/s. However, these products are proprietary, so no implementation details are available.
Based on references [12], [16], and [18], the main limitation of GPP-based platforms is that their processing power and sampling rate are very limited because GPPs are not designed for signal processing. Consequently, the available modulation methods are restricted to digital modulation, such as on-off keying (OOK). Although FPGAs offer greater processing power than GPPs, references [15] and [17] still use OOK modulation. Additionally, packet processing between PHY and TCP/IP is still performed in the Linux user space, which can introduce significant overhead. References [13] and [14] take an interesting approach by employing a COTS Wi-Fi module for VLC transmission. This system uses orthogonal frequency division multiplexing (OFDM), a modulation suited to high-speed data transfer. Because it is a COTS device, it already includes automatic gain control, which allows the VLC system to be used at various distances. However, this system is not flexible for research and development because its proprietary Wi-Fi ASIC does not allow the baseband processing to be modified.
As outlined above, a gap exists in developing networked VLC prototypes that are high-speed, flexible, and highly integrated, particularly with TCP/IP networks. To address this, we propose a system-on-chip (SoC) architecture and register-transfer level (RTL) design for an OFDM baseband processor in a network-enabled VLC prototype. This prototype supports high data rates in the tens of Mb/s range and offers significant flexibility. Additionally, we propose integrating TCP/IP processing for comprehensive system functionality.
B. Research Challenges
Wireless communication device design requires cross-disciplinary knowledge and know-how. Research in this field requires knowledge of the OSI model [22] and its implementation at the OS kernel level, as well as embedded systems, FPGAs, digital signal processing, and analog circuit design. To tackle this challenge, we propose and use a highly integrated SoC FPGA platform. We implemented our architecture on the Eclypse Z7 [23] board, which uses a Xilinx Zynq 7000 programmable SoC FPGA [24]. This board already has an integrated DAC board (Zmod AWG [25]) and an integrated ADC board (Zmod Scope [26]), making development easier. We also used the open-source openwifi project [27] as a reference to design the Linux kernel module that interfaces the baseband processor with Linux TCP/IP processing.
Another challenge in implementing a real-time OFDM system is the sampling offset problem. In many VLC experiments, sampling is assumed to be ideal [28]. The systems in [29] and [30] achieve data rates of more than 1 Gb/s using FPGAs. These systems are not yet integrated with TCP/IP, which is why we did not include them in the state-of-the-art table. Their most important limitation is the system clock: they use a common clock for both TX and RX, which is not possible in real conditions. The sampling offset in an OFDM system consists of two problems, namely sampling frequency offset (SFO) and sampling phase offset (SPO).
C. Our Contributions
To address the above gap and challenges, we propose an OFDM baseband processor for a network-enabled VLC system platform that employs the following key techniques:
Architecture design of the OFDM system based on SoC. We propose an SoC architecture for building the VLC prototype. This architecture is based on a highly integrated SoC FPGA platform. We elaborate on our architecture and compare it with state-of-the-art platforms.
RTL design of the essential OFDM modules. The modules include time synchronization, sampling offset estimation and compensation, and channel estimation and equalization. We elaborate on the RTL design method for time synchronization, channel estimation and equalization. We simplify the method from [39] and [40] to produce an RTL design for sampling offset estimation and compensation that requires minimal computational processing by eliminating the division process.
System integration of the OFDM baseband processor both to the TCP/IP stack via a loadable Linux kernel module/driver design and to the VLC analog front-end (AFE). We use the indoor VLC channel at a distance of 1 m.
Evaluation of the network performance by testing the VLC system with a COTS network device in various network tests using real applications (such as iperf [31], a web browser, etc.).
Key results: We evaluated our design and found it achieves data rate improvements of
Proposed System Model
The most widely used modulation in LED-based VLC is intensity modulation/direct detection (IM/DD) [32]. No carrier signal is used. Modulation is done by changing the intensity of the LED. However, the OFDM characteristics used in RF wireless systems are complex and bipolar, but in IM systems, they must be real and unipolar. So, OFDM for RF systems needs to be modified so that it can be used for LED-based VLC [33]. In our proposed model, we also add SFO and SPO model and simulation.
A. OFDM Transmitter
The TX data bit stream is encoded using convolutional code [35]. We use two code rate values,
One OFDM symbol is composed of 256 subcarriers with indices from −128 to 127, as shown in Fig. 1. This subcarrier mapping is based on the IEEE 802.16d standard. In OFDM for VLC, the mapping at indices −127 to −1 must be modified: these locations carry the complex conjugates of the values at indices 1 to 127 (Hermitian symmetry). So, we can define one OFDM symbol in the frequency domain as
There are four types of subcarriers, namely DC, data, pilot, and guard subcarriers. The value of the DC subcarrier is 0. The data subcarriers contain complex IQ data that comes from the modulator. The pilot subcarrier contains the value
Every OFDM symbol \begin{equation*} x(n) = \sum _{k=0}^{N-1} X(k).e^{\frac {j2\pi kn}{N}}=IFFT\{X(k)\} \tag {1}\end{equation*}
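The effect of the Hermitian-symmetric mapping can be checked numerically. The following is a minimal sketch, not the authors' implementation: it fills a few positive-index bins with made-up values, mirrors their conjugates onto the negative indices, and evaluates (1) directly. The helper names (`hermitian_symbol`, `idft`) and the chosen bins are illustrative assumptions.

```python
import cmath

N = 256  # subcarriers per OFDM symbol, as stated in the text

def hermitian_symbol(positive_bins):
    """Build a length-N frequency-domain symbol from bins 1..127.

    positive_bins: dict {k: complex value} with 1 <= k <= 127.
    Bin 0 (DC) and bin N/2 stay at zero so the time-domain output is real.
    """
    X = [0j] * N
    for k, value in positive_bins.items():
        if not 1 <= k <= N // 2 - 1:
            raise ValueError("bin index must be in 1..127")
        X[k] = value
        X[N - k] = value.conjugate()  # index -k is stored at position N - k
    return X

def idft(X):
    """Direct evaluation of (1); like the formula in the text, no 1/N factor."""
    n_pts = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / n_pts) for k in range(n_pts))
        for n in range(n_pts)
    ]

# Hypothetical payload: three QPSK-like values on arbitrary bins.
X = hermitian_symbol({5: 1 + 1j, 40: -1 + 1j, 100: 1 - 1j})
x = idft(X)
max_imag = max(abs(sample.imag) for sample in x)  # numerically zero
```

The output is real but still bipolar; making it non-negative for intensity modulation (for example with a DC bias) is outside the scope of this sketch.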
One OFDM frame is composed of all OFDM symbols after cyclic prefix addition
B. Channel Model
The simulated channel model in this system consists of sampling offset and AWGN effects. Let \begin{equation*} y[n]=x((n+\phi ).T_{dac}.(1+\Delta )) + n_{AWGN}(n) \tag {2}\end{equation*}
In addition to this channel model, we also use the ITU channel model for indoor office (channel A and B) [34] to test the OFDM system.
C. OFDM Receiver
On the RX side, a matched filter is used to maximize the signal-to-noise ratio (SNR) of the received OFDM signals. The matched filter employs the same structure and coefficients as the pulse shaping filter. After filtering, the signal is downsampled by a factor of 5.
Let \begin{equation*} \Phi (n) = \left |{{\sum _{q=0}^{Q-1} y(n+q).p(q)}}\right | \tag {3}\end{equation*}
\begin{equation*} \Phi (n) \gt th \tag {4}\end{equation*}
Every OFDM symbol \begin{equation*} Y(k) = \sum _{n=0}^{N-1} y(n).e^{\frac {-j2\pi kn}{N}}=FFT\{y(n)\} \tag {5}\end{equation*}
SFO occurs because the DAC and ADC are driven by independent oscillators, each with a non-zero frequency tolerance. As a result, the sampling frequency is not the same at the DAC and ADC [39]. We perform SFO compensation using the pilot method [40]. The SFO estimation process begins by calculating the angle between the RX pilots and TX pilots, which is defined as \begin{equation*} \hat {\phi }_{k_{i}} = angle\left ({{\frac {Y_{k_{i}}}{X_{k_{i}}}}}\right ),\; k_{i}\in Pilot,\; i=[1,2,\ldots ,8] \tag {6}\end{equation*}
\begin{equation*} s_{n} = \frac {\sum _{i=1}^{8} k_{i}.\hat {\phi }_{k_{i}}}{\sum _{i=1}^{8} k_{i}^{2}} \tag {7}\end{equation*}
\begin{align*} Y'_{i}=Y_{i}.e^{-jis_{n}},\; i=[0,1,\ldots ,127,-128,-127,\ldots ,-1] \tag {8}\end{align*}
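As an illustration of (6) to (8), the sketch below applies a known linear phase ramp to a flat symbol, estimates the slope from eight pilots, and de-rotates all subcarriers. It is a floating-point model under stated assumptions: the pilot positions in `PILOTS` are hypothetical (the actual indices are not listed here), only positive indices are modeled, and the phase is assumed not to wrap beyond ±π.

```python
import cmath

# Hypothetical pilot positions; the actual pilot indices are an assumption.
PILOTS = [13, 26, 38, 51, 63, 76, 88, 101]

def estimate_slope(Y, X, pilots):
    """Eqs. (6)-(7): least-squares slope of pilot phase versus subcarrier index."""
    num = sum(k * cmath.phase(Y[k] / X[k]) for k in pilots)
    den = sum(k * k for k in pilots)
    return num / den

def compensate(Y, slope):
    """Eq. (8): de-rotate every subcarrier by (index * slope)."""
    return {k: v * cmath.exp(-1j * k * slope) for k, v in Y.items()}

# Synthetic check: impose a known phase ramp and recover it.
true_slope = 0.004  # radians per subcarrier index, illustrative value only
X = {k: 1 + 0j for k in range(1, 128)}
Y = {k: X[k] * cmath.exp(1j * k * true_slope) for k in X}
slope = estimate_slope(Y, X, PILOTS)
Y_fixed = compensate(Y, slope)
max_err = max(abs(Y_fixed[k] - X[k]) for k in X)
```

In this noiseless case the estimated slope matches the imposed one up to rounding; with noise, the least-squares form in (7) averages the error over the eight pilots.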
Channel estimation is carried out to figure out the VLC channel characteristics. The channel estimation can also be used to estimate SPO and use equalization to compensate it [38]. In our OFDM system, the channel estimation process is carried out by using the long preamble and pilots located in each data symbol. We use the long preamble pattern from the IEEE 802.16d standard as a basis. Then, we modify that long preamble by adding Hermitian symmetry and pilots for SFO estimation purposes.
We use least squares estimation techniques that ignore the effect of AWGN noise to estimate channel response at the long preamble, which is defined as \begin{equation*} H_{LP}(k)=\frac {Y_{LP}(k)}{X_{LP}(k)} \tag {9}\end{equation*}
\begin{equation*} H_{pilot}(l)=\frac {Y_{pilot}(l)}{X_{pilot}(l)} \tag {10}\end{equation*}
\begin{equation*} d(l)=\frac {H_{pilot}(l)}{H_{LP}(l)} \tag {11}\end{equation*}
\begin{equation*} H_{data}(k)=H_{LP}(k).c_{update} \tag {12}\end{equation*}
The equalization method used in our OFDM system is minimum mean squared error equalization (MMSE), defined in (13). This equalization method takes AWGN noise into account. The noise variance \begin{equation*} Y_{equ}(k)=\frac {Y_{data}(k)}{|H_{data}(k)|^{2}+\sigma _{n}^{2}}.(H_{data}(k))^{*} \tag {13}\end{equation*}
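The relationship between the least-squares estimate in (9) and the MMSE equalizer in (13) can be sketched as follows. This is a simplified model with made-up channel values; it uses the long-preamble estimate directly as H_data and therefore omits the pilot-based update of (10) to (12).

```python
def ls_estimate(Y_lp, X_lp):
    """Eq. (9): per-subcarrier least-squares channel estimate."""
    return [y / x for y, x in zip(Y_lp, X_lp)]

def mmse_equalize(Y_data, H_data, noise_var):
    """Eq. (13): Y * conj(H) / (|H|^2 + sigma_n^2)."""
    return [
        y * h.conjugate() / (abs(h) ** 2 + noise_var)
        for y, h in zip(Y_data, H_data)
    ]

# Illustrative channel on four subcarriers (made-up values).
H_true = [0.9 + 0.1j, 0.5 - 0.4j, 0.2 + 0.3j, 1.0 + 0.0j]
X_lp = [1 + 0j, -1 + 0j, 1 + 0j, -1 + 0j]
Y_lp = [h * x for h, x in zip(H_true, X_lp)]
H_est = ls_estimate(Y_lp, X_lp)

X_data = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]
Y_data = [h * x for h, x in zip(H_true, X_data)]
zero_noise = mmse_equalize(Y_data, H_est, 0.0)  # reduces to zero forcing
with_noise = mmse_equalize(Y_data, H_est, 0.1)  # attenuates weak subcarriers
```

With a noise variance of zero, (13) reduces to division by the channel estimate. With a non-zero variance, subcarriers with small |H| are scaled down instead of amplifying their noise.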
The demodulation process is carried out using the soft decision method, which is not exactly the opposite of the modulation process (hard decision). The soft decision method gives quantization for each data bit and leaves the IQ to data bit conversion process to the Viterbi decoder. In this design, we use 5-bit soft decision quantization. Deinterleaver is the reverse process of interleaver. We also use the rectangular block deinterleaver from the Xilinx IP core [36]. Viterbi decoding [41], [42] is used to decode the soft bits from the deinterleaver output.
Proposed System Architecture
In this section, we elaborate on our proposed SoC architecture for the VLC prototype. At the end of this section, we also compare our architecture with the state-of-the-art.
A. System Overview
Fig. 3 shows the system overview block diagram of our proposed VLC system. It consists of a COTS Wi-Fi router that is connected to the internet. This router is configured in client mode and connected to another Wi-Fi AP that has an internet connection. The connection to the host FPGA board is done using Ethernet. A point-to-point connection occurs between the host FPGA board and the client FPGA board. The connection on the downlink line uses the VLC channel, while the uplink line uses a cable. The connection from the client FPGA board to the client laptop is done using Ethernet.
B. OFDM Frame Format
The frame format used in our OFDM system is based on the IEEE 802.16d and IEEE 802.11a standards. Fig. 4 shows the proposed OFDM frame format. It consists of a short preamble symbol, a long preamble symbol, a header symbol, and a variable number of data symbols. The short preamble is used for time synchronization, and the long preamble is used for channel estimation.
The header symbol contains information for processing the data symbols. The contents of the header symbol are modulation type (MCS), data length, and CRC16 to detect errors. We chose a maximum data length of 1600 bytes to accommodate the data size of the Ethernet frame (1518 bytes).
C. Hardware Software Partitioning
Fig. 5 shows the partitioning of hardware and software in our proposed VLC system. This system consists of three layers, namely FPGA hardware, Linux kernel space, and user space. The FPGA hardware layer includes our baseband processor implementation, responsible for sending and receiving OFDM signals to and from the DAC and ADC. The Linux kernel space consists of a system call and network socket as an interface to the user application (
D. SoC Architecture
Fig. 6 shows our proposed SoC block diagram that is implemented on the Xilinx Zynq Programmable SoC. The system consists of a processing system and programmable logic. On the processing system side, we use an SD card for Linux OS, UART for debug terminal, Ethernet for communication, and GPIO for LED indicator. These peripherals are connected to the ARM Cortex-A9 processor via the AHB/APB bus. On the programmable logic side, there is our proposed OFDM baseband processor design. Data transfer to and from DDR memory is carried out using TX and RX DMA via the AXI bus. TX and RX ring control are used to manage data frames from/to DMA. The data frames themselves are stored in FIFO. The TX frame control reads one frame from the FIFO and then writes it to the on-chip RAM. After that, the data bytes are read by the TX OFDM control and sent to the OFDM TX datapath for OFDM modulation. On the RX side, the digitized OFDM signal is demodulated by the RX OFDM datapath. Then, the RX frame control reads one frame from the on-chip RAM and then writes it to the FIFO.
The main TX and RX frame control modules arrange all blocks in the OFDM datapath by providing control signals to the respective modules based on a finite state machine (FSM), as shown in Fig. 7. In the TX FSM, the system will enter the state
In the RX FSM, the system will enter the state
E. Digital-to-Analog Interface
Fig. 8 shows the block diagram of the digital-to-analog interface. On the FPGA chip part, there are upsampling and downsampling blocks with a rate of
The TX OFDM frame from baseband is converted from digital to analog by using a DAC that runs at 50 MHz. After that, it goes to the LED module. On the RX side, the light intensity detected by the avalanche photodiode (APD) module is converted from analog to digital by using an ADC that runs at 50 MHz. The LED module consists of a bias-tee ZFBT-4R2GW+ [43] from Mini-Circuits, a resistor of
F. Architecture Comparison With State-of-the-Art
For the sake of clarity, we summarize our proposed architecture and compare it with state-of-the-art systems. In [12] and [16], the PHY layer is limited by the maximum sampling rate of the BeagleBone Black (BBB), 2.1 MHz. In [18], the PHY layer is limited by the maximum ATmega328p sampling rate, and it is also not an integrated chip with the Qualcomm SoC. These architectures are based on GPP, which is not specialized for high-speed signal processing. The architecture of our proposed design is essentially different from these designs because we proposed our system with an SoC architecture and implemented our signal processing on an FPGA, which is specialized for high-speed signal processing due to parallelism.
Reference [15] uses digital modulation based on IEEE 802.15.7 PHY.II.1 Standard, while reference [17] uses UART protocol. PHY driver and packet processing are done in Linux user space software instead of kernel space, which causes data rate limitations. The modulation of our proposed design is essentially different from these designs because we proposed our system with OFDM modulation. OFDM modulation is used in many wireless systems today. In terms of the PHY driver and packet processing, we implemented ours as a loadable Linux kernel module that runs in kernel space. With this method, our driver is closer to the TCP/IP processing module of the Linux system. As a result, it will reduce the bottleneck of data transfer [45].
In [13] and [14], the RF signal from this unmodified Wi-Fi ASIC chip is down-converted to match the VLC analog front-end (AFE) specifications. For the RX part, the signal from the VLC RX is upconverted so that it can be processed by the unmodified Wi-Fi chip. For the MAC and TCP/IP processing sections, they use the standard Linux operating system (OS) run on an Intel NUC mini PC. The architecture of our proposed design is essentially different from these designs because we proposed our system with an SoC architecture, whereas the previous designs used a COTS Wi-Fi ASIC that is specialized for RF.
Proposed RTL Design
The proposed RTL design includes essential OFDM modules: time synchronization, SFO estimation and compensation, channel estimation and equalization, as well as the DMA interface and Linux driver. These modules critically impact the receiver’s performance and the overall system. However, the state-of-the-art works listed in Table 1 have not explored the RTL design of these modules. Additionally, the SFO estimation design in [40] was tested using an AWGN channel and an oscilloscope, rather than an FPGA. This motivates our investigation into the RTL design of these crucial OFDM modules. Detailed descriptions of these modules are provided below.
A. Time Synchronization Module
Fig. 9 shows the block diagram of the time synchronization module. The input of this module is the free-running signal from the matched filter. The objective of time synchronization is to find the start of the OFDM frame by detecting peaks from the cross-correlation process defined in (3). To save FPGA resources, the input data is first quantized from 16-bit to 2-bit. Then, the input data goes into the shift register, which stores 64 IQ samples. For each IQ sample that arrives, the contents of the shift register are multiplied by the contents of the ROM and then added up. The absolute values of these results are then summed to get the final cross-correlation result.
The peak search control module will compare the cross-correlation results with a threshold value. If it is above the threshold, then it is identified as a peak. The module will count the number of peaks that occur until five peaks are detected. The fifth peak is the start of the OFDM frame. This process is illustrated in Fig. 10.
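The peak-counting behavior described above can be modeled in software. The sketch below is a behavioral approximation, not the RTL: the 2-bit quantizer is assumed to keep only the signs of I and Q, the magnitude is computed exactly rather than with the hardware's absolute-and-sum step, and the 64-sample pattern, noise, and threshold value are synthetic.

```python
import random

Q = 64            # shift-register depth, from the text
PEAKS_NEEDED = 5  # the fifth peak marks the frame start, from the text

def sign2(v):
    """Assumed coarse quantizer: keep only the signs of I and Q."""
    return complex(1 if v.real >= 0 else -1, 1 if v.imag >= 0 else -1)

def correlate(samples, ref):
    """Eq. (3) on quantized input: |sum_q y(n+q) * p(q)| for each offset n."""
    yq = [sign2(s) for s in samples]
    return [
        abs(sum(yq[n + q] * ref[q] for q in range(len(ref))))
        for n in range(len(yq) - len(ref) + 1)
    ]

def find_frame_start(corr, threshold, peaks_needed=PEAKS_NEEDED):
    """Eq. (4): count threshold crossings; return the offset of the last one needed."""
    count = 0
    for n, value in enumerate(corr):
        if value > threshold:
            count += 1
            if count == peaks_needed:
                return n
    return None

# Synthetic input: noise, then a made-up 64-sample pattern repeated five times.
rng = random.Random(1)

def rand_iq():
    return complex(rng.choice((-1, 1)), rng.choice((-1, 1)))

pattern = [rand_iq() for _ in range(Q)]
ref = [p.conjugate() for p in pattern]  # stands in for the ROM contents
lead_in = [rand_iq() for _ in range(20)]
samples = lead_in + pattern * PEAKS_NEEDED + [rand_iq() for _ in range(20)]
corr = correlate(samples, ref)
start = find_frame_start(corr, threshold=100)
```

In this model the aligned correlation value is 128 (64 samples, each contributing 2), so a threshold of 100 separates the peaks from the sidelobes of the random pattern. A real threshold would have to be tuned to the actual preamble and signal level.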
B. SFO Estimation and Compensation Module
SFO estimation is carried out using the pilot method [39], [40]. In this method, the pilots are located on each OFDM symbol. Every OFDM symbol except the short preamble has 8 pilots, which are used for SFO estimation and also for channel estimation. The SFO estimation process begins by calculating the angle of the pilot, which is defined in (6). In our OFDM system, TX pilot \begin{equation*} \hat {\phi }_{k_{i}} = angle(Y_{k_{i}}),\; k_{i}\in Pilot,\; i=[1,2,\ldots ,8] \tag {14}\end{equation*}
\begin{equation*} s_{n} = \sum _{i=1}^{8} k_{i}\hat {\phi }_{k_{i}}.c \tag {15}\end{equation*}
\begin{equation*} c=\frac {1}{\sum _{i=1}^{8} k_{i}^{2}}=3.7509\times 10^{-5} \tag {16}\end{equation*}
Fig. 11 shows the block diagram of hardware design. The process is carried out for every single OFDM symbol, which consists of 256 subcarriers. The 8 pilots are separated from the rest of the OFDM symbol by a pilot extractor block. After that, they are stored in FIFO. CORDIC 0 (translation mode) is used to convert from rectangular form to polar form to get the calculation of (14).
Then, the result is multiplied by the pilot subcarrier index and accumulated in a register. Multiplying the accumulated value by c gives the slope, which is the output of (15). Finally, the compensation process is carried out by multiplying each subcarrier index by the slope. This value becomes the rotation angle of CORDIC 1 (rotation mode), which rotates each subcarrier of the OFDM symbol.
As a remark, in this SFO module implementation, we have optimized the circuit by eliminating two division operations. One division operation is removed, and the other one is replaced with a multiplication operation.
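The replacement of the division in (7) by the constant multiplication in (15) and (16) can be illustrated with integer arithmetic. In the sketch below, the scaling choices (`FRAC_BITS`, `ANGLE_FRAC_BITS`) and the test value are assumptions for illustration and do not describe the word lengths of the actual RTL. The final rescaling is done in floating point only to compare results; in hardware it would be a bit shift.

```python
C = 3.7509e-5          # reciprocal constant from (16)
FRAC_BITS = 30         # assumed scaling so that C fits in a 16-bit word
C_FIXED = round(C * (1 << FRAC_BITS))

def slope_with_division(weighted_sum, sum_k_sq):
    """Reference form of (7): one divide per OFDM symbol."""
    return weighted_sum / sum_k_sq

def slope_fixed_point(weighted_sum_int, angle_frac_bits):
    """Form of (15): integer multiply by C_FIXED, then rescale.

    weighted_sum_int is sum(k_i * phi_i), with phi_i held as an integer
    carrying angle_frac_bits fractional bits.
    """
    product = weighted_sum_int * C_FIXED
    return product / float(1 << (FRAC_BITS + angle_frac_bits))

# Illustrative accumulated value of sum(k_i * phi_i), in radians.
ANGLE_FRAC_BITS = 13
weighted_sum = 106.64
weighted_sum_int = round(weighted_sum * (1 << ANGLE_FRAC_BITS))

reference = slope_with_division(weighted_sum, 1.0 / C)
fixed = slope_fixed_point(weighted_sum_int, ANGLE_FRAC_BITS)
rel_error = abs(fixed - reference) / reference
```

Comparing (6) with (14) also suggests that the other division, Y/X at the pilots, drops out because the TX pilot values are known in advance, so only the angle of the received pilot is needed.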
C. Channel Estimation and Equalization Module
Channel estimation and equalization is a mandatory module for the OFDM system. However, in the state-of-the-art works in Table 1, the design of this module in RTL has not been explored. Therefore, this motivates us to propose this module.
Fig. 12 shows our proposed channel estimator RTL design for estimating the channel response at long preamble
\begin{align*} H(n)& =H(n-1)+\frac {1}{2}(H(n+1)-H(n-1)) \tag {17}\\ H(-1)& =H(-2)+\frac {1}{4}(H(2)-H(-2)) \tag {18}\\ H(0)& =H(-2)+\frac {2}{4}(H(2)-H(-2)) \tag {19}\\ H(1)& =H(-2)+\frac {3}{4}(H(2)-H(-2)) \tag {20}\end{align*}
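Equations (17) to (20) are linear interpolations with weights of 1/2, 1/4, 2/4, and 3/4. A minimal sketch follows, with hypothetical function names and a made-up linear channel used only to check the arithmetic.

```python
def interp_midpoint(h_prev, h_next):
    """Eq. (17): value halfway between the two neighbouring estimates."""
    return h_prev + 0.5 * (h_next - h_prev)

def interp_around_dc(h_m2, h_p2):
    """Eqs. (18)-(20): fill indices -1, 0, and 1 from H(-2) and H(2)."""
    step = h_p2 - h_m2
    return {
        -1: h_m2 + 0.25 * step,
        0: h_m2 + 0.50 * step,
        1: h_m2 + 0.75 * step,
    }

def linear_channel(k):
    """Made-up channel whose response varies linearly with subcarrier index."""
    return (1 + 0.5j) + k * (0.01 - 0.02j)

mid = interp_midpoint(linear_channel(10), linear_channel(12))
around_dc = interp_around_dc(linear_channel(-2), linear_channel(2))
```

Because all weights are sums of powers of two, the interpolation needs only shifts and additions, which keeps it inexpensive in hardware.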
Fig. 13 shows our proposed pilot channel estimator and equalizer. The first part of the circuit is to calculate the difference between long preamble channel response and pilot channel response
Fig. 14 shows our proposed noise power estimator module calculated by (21). The noise power is estimated from guard subcarrier \begin{equation*} \sigma _{n}^{2} = \frac {\sum _{i=1}^{55}(Re(g_{i})^{2}+Im(g_{i})^{2})}{110} \tag {21}\end{equation*}
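Equation (21) averages the power of the 55 guard subcarriers over both real dimensions, which gives the divisor of 110. A short sketch with synthetic guard values (the function name is hypothetical):

```python
def noise_power(guard_bins):
    """Eq. (21): average power per real dimension over the guard subcarriers."""
    total = sum(g.real ** 2 + g.imag ** 2 for g in guard_bins)
    return total / (2 * len(guard_bins))

# Synthetic guard subcarriers: 55 identical values, for arithmetic checking only.
guards = [complex(0.1, -0.1)] * 55
sigma_sq = noise_power(guards)
```

Since the guard subcarriers carry no transmitted energy, whatever is observed there can be attributed to noise, which is why they serve as the measurement points.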
D. DMA Interface and Linux Driver
Efficient data transfer between hardware and software is important for minimizing overhead. This can be achieved using DMA and interrupt methods, which are programmed as a loadable Linux kernel module/driver. The method used is adapted from openwifi [27]. Fig. 15 shows the data flow diagram for the DMA interface and Linux driver.
This is the procedure for how data flows:
When there is a data frame, i.e., Linux’s socket buffer (SKB), that must be sent from the net device, the net device calls the vlc_xmit() function. This function creates a frame descriptor (FD) that contains frame information such as frame size and pointer data to be sent to the FD register and stored in the FD FIFO.
The vlc_xmit() function performs TX DMA setup by setting the data pointer where the DMA should read data. Then, the TX DMA transfer is executed. The DMA reads data from DDR memory and then sends it to the frame FIFO.
The TX frame control requests data from the TX ring control state machine. If a frame is available, then it is sent to the TX OFDM datapath, and the transmission is started. After the process is complete, the done signal is received.
The TX ring state machine waits for the done signal from TX frame control. When the done signal occurs, a TX interrupt signal will be sent. In the tx_isr() interrupt handler function, the FD that contains frame information will be read, and then the memory location that stores the SKB data will be freed. Finally, the TX done signal toggles an on-board LED to indicate TX activity.
In the RX section, the demodulated OFDM bytes are written to the on-chip RAM. A done signal is received when the process is completed. Then, RX frame control sends the frame to the RX ring state machine and to the frame FIFO. The RX ring state machine adds a timestamp to every RX frame.
RX DMA reads the frame from frame FIFO and writes it to DDR memory. After the transfer process is completed, there is a done signal (DMA interrupt complete) to the RX ring state machine. The RX ring state machine sends an RX interrupt to the CPU.
In the rx_isr() interrupt handler function, the data pointer to the received frame is read and sent to the netif_rx() function in the form of SKB data structure. The netif_rx() is a function from Linux’s net device structure that processes the frame to the network layer. Finally, the RX done signal toggles an on-board LED to indicate RX activity.
Performance Results and Discussions
This section reports and discusses the simulation results of the model, the experimental setup, real-time performance, and comparison with other works.
A. Simulation Results
1) SFO Compensation
Our SFO compensation model is evaluated using a MATLAB simulation. The SFO compensator can reduce the error vector magnitude (EVM) to below 10% for SFO values from −100 to 100 ppm. The details of SFO performance are published in [47]. Fig. 16 shows the bit error rate (BER) simulation results under various SFO values with 16-QAM modulation. At SFO values of 20 and 80 ppm without compensation (red and magenta curves), the BER curves reach an error floor. After compensation, the BER curves (blue and black curves) approach the ideal condition, which has an SFO of 0 ppm (green curve).
2) SPO Compensation
Our MMSE equalizer can reduce the EVM to below 10% for normalized SPO values from −0.5 to 0.5. The details of SPO performance are published in [47]. Fig. 17 shows the bit error rate (BER) simulation results under various SPO values with 16-QAM modulation. At normalized SPO values of −0.3 and 0.12 without equalization (red and magenta curves), the BER curves are at an error floor. After equalization, the BER curves (blue and black curves) are close to the ideal condition, which has an SPO of 0 (green curve).
3) Bit Error Rate
The BER simulation for the whole OFDM system is carried out on the AWGN channel. The normalized SPO value is 0.12 because this is the value at which the EVM result is the worst. The SFO value used is 10 ppm because it is the tolerance of the oscillator on our target FPGA board. Based on the simulation result, the minimum SNR required to reach zero BER is 1 dB for BPSK and 17 dB for 16-QAM.
B. FPGA Resource
1) Overall System
Our RTL design is synthesized and implemented for Eclypse Z7 boards that use the Xilinx Zynq 7000 XC7Z020-1CLG484C chip with the Xilinx Vivado version 2019.1. Our design can run at a maximum clock frequency of 50 MHz. Total on-chip power consumption obtained from the Xilinx Vivado tool is 2.421 W. Table 2 shows the FPGA available resource and its utilization.
We cannot compare our FPGA resource utilization with the state-of-the-art because such data are not available. The works in [12], [16], and [18] use GPP-based platforms, not FPGAs. The works in [13] and [14] use a COTS Wi-Fi chip, so resource utilization is not available. The works in [15] and [17] use FPGAs but do not report resource utilization.
2) SFO Optimization
In our SFO module implementation, we have optimized the circuit by eliminating two division operations. One division operation is removed, and the other one is replaced with a multiplication operation. Table 3 shows a comparison of FPGA resource utilization of the required operations before and after optimization. The resource comparison is done between two 16-bit fixed-point dividers compared to one 16-bit fixed-point multiplier.
C. Experimental Setup
Fig. 19 shows a photograph of our experimental setup. We use a collimating lens and a focusing lens from Thorlabs Optics [48]. We also use optical posts and mounting from Thorlabs Optomechanical Components [49]. The LED module is placed at the focal length of the collimating lens so that the light is aligned in a parallel fashion in order to increase the distance. The APD module is placed at the focal length of the focusing lens in order to focus the light onto a 1 mm photodiode. The distance between lenses is 1 m. In this experimental setup, the analog subsystem is currently under development. We are utilizing the digital amplifier within the FPGA, without additional analog amplifiers on the transmission (TX) or reception (RX) sides. Future improvements, including the use of a high-power LED and supplementary analog amplifiers on both the TX and RX sides, are anticipated to enhance the transmission distance. The OFDM bandwidth measured by the spectrum analyzer is 4 MHz.
BER vs. SNR for the overall OFDM system under the following conditions: AWGN, SPO=0.12, SFO=10 ppm, MTU=1600 bytes [47].
Fig. 20 shows our network diagram. Our VLC system has a Linux driver, so it is detected as a standard interface in the Linux system, named
Network diagram of the system. The IP address settings for each interface are shown.
D. Real-Time Performance
1) Bit Error Rate
Fig. 21 shows the real-time IQ constellation plot from our data logging program at a distance of 1 m for QPSK and 16-QAM modulation. BER measurements were carried out on the system using 1000 test frames. Each frame contains 1500 bytes of data, so the total amount of test data is 12,000,000 bits. All modulation types achieve a BER of 0 at a distance of 1 m. This result is consistent with the BER simulation results at SNR > 20 dB.
Real-time IQ constellation plot for (a) QPSK and (b) 16-QAM captured from our data logging program.
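A measured BER of 0 over a finite number of bits bounds the true BER from above without showing that it is exactly zero. The sketch below reproduces the bit count of the BER test and computes that bound, assuming independent bit errors. The 95% confidence level is an illustrative choice and is not a figure from the measurements.

```python
import math

frames = 1000
bytes_per_frame = 1500
total_bits = frames * bytes_per_frame * 8

def ber_upper_bound(n_bits, confidence=0.95):
    """Largest BER consistent with zero observed errors in n_bits bits.

    Solves (1 - p) ** n_bits = 1 - confidence for p, assuming
    independent bit errors.
    """
    return -math.expm1(math.log(1.0 - confidence) / n_bits)

bound = ber_upper_bound(total_bits)
```

Under these assumptions, the bound is on the order of 3 divided by the number of bits tested, so tightening it by a factor of ten requires roughly ten times more test data.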
2) Data Rate
We can calculate the maximum theoretical data rate of the OFDM baseband processor by using (22). Where \begin{equation*} data~rate=\frac {payload~size}{(T_{pre}+T_{head}+T_{data}+T_{space})} \tag {22}\end{equation*}
\begin{equation*} N_{sym}=\left \lceil {{\frac {N_{payload}}{(96*N_{bpsc}*rate)-8}}}\right \rceil \tag {23}\end{equation*}
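Equation (23) can be evaluated directly. In the sketch below, the payload is assumed to be counted in bits, the constant 96 is assumed to be the number of data subcarriers, and the code rates 1/2 and 3/4 are illustrative; none of these interpretations is confirmed by the text above.

```python
import math
from fractions import Fraction

DATA_SUBCARRIERS = 96  # constant appearing in (23); its meaning is assumed

def num_data_symbols(n_payload, n_bpsc, rate):
    """Eq. (23): ceil(N_payload / (96 * N_bpsc * rate - 8))."""
    per_symbol = DATA_SUBCARRIERS * n_bpsc * Fraction(rate) - 8
    if per_symbol <= 0:
        raise ValueError("no payload capacity per symbol")
    return math.ceil(Fraction(n_payload) / per_symbol)

# Illustrative payload of 1500 bytes, expressed in bits (assumed unit).
payload_bits = 1500 * 8
n_16qam = num_data_symbols(payload_bits, 4, Fraction(3, 4))  # 16-QAM, rate 3/4
n_bpsk = num_data_symbols(payload_bits, 1, Fraction(1, 2))   # BPSK, rate 1/2
```

Exact rational arithmetic is used so that the ceiling is not affected by floating-point rounding when the payload is an exact multiple of the per-symbol capacity.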
Table 4 shows the theoretical data rate and measured data rate. The data rate is measured between the VLC host and VLC client at a distance of 1 m. There are two real data rates that are measured using
We evaluate the effect of SFO on data rate in real-time condition. The real-time TCP data rate on the system without SFO compensation is much lower than the system with SFO compensation, as shown in Fig. 22. The system with SFO compensation has an average data rate of 4.35 Mb/s, whereas the system without SFO compensation has an average data rate of 0.47 Mb/s. Our SFO compensation module boosts the TCP data rate by 8.25x on average.
3) Latency, Jitter, and Packet Loss
Table 5 shows measured latency, jitter, and packet loss. These parameters are measured between the VLC host and VLC client at a distance of 1 m. The average latency, jitter, and packet loss for all modulations and rates are 0.905 ms, 0.003 ms, and 0.155%, respectively. The result shows that our system is at an acceptable latency, jitter, and packet loss level.
The packet loss comparison between the system with SFO compensation and the system without SFO compensation is displayed in Fig. 23. This data is obtained from real-time measurements using
E. Comparison of Transmission Quality With Other Works
Our work addresses the limitation of sampling rate by still using a low-cost FPGA, but paired with a DAC/ADC that has a maximum 100 MHz sampling rate. The modulation that we use is OFDM. Table 6 shows a comparison of our prototype data rates and average packet loss with other works. Compared to the state-of-the-art [12] and [15], our system achieves
F. Scalability
To further increase the data rate, we estimate the potential data rate increase that can be achieved if we use a high-cost FPGA that has a higher sampling rate. Our current system uses a baseband clock of 50 MHz and oversampling in the baseband by \begin{equation*} c=\frac {data\_rate}{(bb\_clock/over\_rate/herm\_fact)} \tag {24}\end{equation*}
The baseband clock of our system is limited because the FPGA resource usage is already close to 80%, which makes routing more difficult and lengthens the critical path of our design. Further improvements can be made with high-end FPGAs, which have more resources and higher speed grades, and by optimizing the RTL design to shorten critical paths. With such an FPGA, we hypothesize that the baseband clock can be increased up to 200 MHz. High-end FPGAs with GSPS DAC/ADC usually have an on-chip interpolation filter, so the baseband oversampling factor can be reduced to increase the baseband sample rate, as done in [27]. Finally, Hermitian symmetry can be eliminated, which doubles the data rate at the same baseband clock but requires a carrier signal, as done in [13] and [14]. This method has been standardized as light communication (LC) in the new IEEE 802.11bb standard [51] for VLC.
GSPS RFDACs/RFADCs usually support super-sample rate (SSR) data transfer, for example, in the Xilinx RFSoC FPGA [52]. SSR enables FPGAs with clock frequencies of only hundreds of MHz to process I/Q samples at GSPS rates [53]. This method requires RTL modules that process multiple samples in parallel in a single clock cycle. An FPGA implementation using this SSR method for free-space optical wireless communication is reported in [54]. That work proposes a low-latency system but does not yet address the data rate. The same method can be applied to scale our design.
Table 8 shows our scalable design estimation for the SSR method. This estimation is done based on the RFSoC
Conclusion
In this paper, we have proposed an SoC architecture and an RTL design of OFDM baseband processor for a network-enabled VLC system. We have proposed and optimized the SFO estimation and compensation module by eliminating division operations. We have also proposed the integration with TCP/IP of the Linux kernel and the AFE of VLC. As a result, our designed system is capable of real-time data transmission through the VLC channel. The real-time network performance is measured using a real application (