Reconfigurable Low-Threshold All-Optical Nonlinear Activation Functions Based on an Add-Drop Silicon Microring Resonator

The realization of optical nonlinear activation functions (NAFs) is essential for integrated optical neural networks (ONNs). Here, we propose and experimentally demonstrate a photonic method to implement reconfigurable and low-threshold all-optical NAFs based on a compact and high-Q add-drop microring resonator (MRR) on silicon. In the experiment, four different NAFs including softplus, radial basis, clamped ReLU, and sigmoid functions are realized by exploiting the thermo-optical (TO) effect of the MRR. The threshold to implement NAFs is as low as 0.08 mW. As a demonstration, a handwritten digit classification benchmark task is simulated based on a convolutional neural network (CNN) using the obtained activation functions, where an accuracy of 98% is realized. Thanks to the unique advantages of ultra-compact footprint and ultralow threshold, the proposed nonlinear unit is promising to be widely used in large-scale integrated ONNs.

The NAF unit, which is responsible for nonlinearly mapping the input of the neuron to the output, is capable of enriching the application scenarios of the neural network [18], [19]. In recent years, several photonic approaches have been proposed to realize NAFs [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32]. For example, the optical-electrical-optical (O-E-O) method based on electro-absorption modulators [20], [21] and photodetector-modulator structures [22], [23], [24] has been proposed to achieve electro-optical NAFs. This method has the potential to achieve reconfigurable NAFs, but it suffers from a large O-E-O conversion loss and a high power consumption. Alternatively, NAFs can be implemented in the optical domain using all-optical approaches, which have the unique advantages of low conversion loss and low power consumption [33]. In the last few years, all-optical NAFs have been implemented in free space [25] and using discrete components or semiconductor lasers [26], [27], [28], which can only achieve fixed nonlinear functions. Recently, reconfigurable all-optical NAFs have also been realized based on cavity-loaded MZI structures by exploiting the free carrier dispersion (FCD) effect and the Kerr effect [29], [30]. However, these structures have large footprints of about 10⁴ μm² and require a high power threshold of 3 mW. In 2020, an MZI-mesh structure was also proposed to implement different types of NAFs. It has a low threshold of 0.9 mW, but it needs five power supplies to precisely control the phase shift of each phase shifter, which increases the power consumption and the control complexity of the chip [31]. Reconfigurable all-optical NAFs can also be achieved based on a germanium-silicon hybrid integrated MRR, which has a small footprint but an increased fabrication complexity [32], making mass production difficult.
Therefore, an all-optical nonlinear activation unit with a low threshold, a small footprint, and a low fabrication complexity is highly desirable for large-scale integrated ONNs.
Here, we propose and experimentally demonstrate a novel method to implement various all-optical NAFs with an ultra-compact footprint and a low threshold based on an integrated add-drop silicon MRR. The MRR is fabricated on a standard 220-nm silicon-on-insulator (SOI) platform with a radius of 10 μm and a coupling gap of 0.35 μm. The footprint of the proposed structure is about 10² μm². In the silicon platform, the thermo-optical (TO) effect is relatively strong. Up to now, various integrated devices, such as optical switches [34], optical modulators [35], and temperature sensors [36], have been realized by exploiting the TO effect on silicon. In this work, multiple NAFs including softplus, radial basis, clamped ReLU, and sigmoid functions are implemented based on a TO-tunable MRR, where the wavelength detuning between the optical source and the MRR is properly controlled. For the clamped ReLU function, the nonlinear threshold power is as low as 0.08 mW. Moreover, to verify the applicability of the realized optical NAFs, a handwritten digit classification benchmark task is simulated in a two-layer CNN using the experimentally obtained activation functions, where a test accuracy of 98% is achieved. Benefiting from the unique advantages of ultra-compact footprint and ultralow threshold, the proposed nonlinear unit is promising to be widely used in large-scale integrated ONNs.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

II. OPERATION PRINCIPLE
As shown in Fig. 1(a), a neuron in the ONN mainly consists of a linear MVM unit and a NAF unit. The input data can be loaded to the optical domain using an optical modulator. In this work, all-optical NAFs are realized by exploiting the TO effect in a silicon add-drop MRR, as shown in Fig. 1(b). Fig. 1(c) shows the cross-section view of the MRR. The rib waveguide is designed to have a width of 500 nm and an etch depth of 150 nm. As a demonstration, five MRRs are fabricated.
The realization of all-optical NAFs is based on the nonlinear effects in the silicon add-drop MRR. When light enters the silicon MRR, part of it is absorbed through the two-photon absorption (TPA) effect, which may result in a change in the refractive index of the waveguide. The value of the refractive index change is proportional to the power of the input optical signal (Kerr effect). Through the TPA process, free carriers are also excited, which may lead to free carrier absorption (FCA) and FCD effects. The inter-band and intra-band relaxation of the carriers generated by TPA and FCA heats the structure and leads to a thermal refractive index change. Among these nonlinear effects, the TO effect is dominant, which results in a redshift of the resonant peak [37].
Fig. 2(a) illustrates the transmission spectrum at the through port of the MRR when the input optical power is increased. It can be observed that the resonant peak of the MRR is redshifted as the input power increases. When the input wavelength is set at the initial resonant wavelength (i.e., A), the output power increases linearly with the input power at a low power level, and then increases rapidly when the input power is high enough to induce a resonant peak redshift. As a result, a softplus NAF is realized, as shown in Fig. 2(b) (line A).
When the input wavelength is set on the right (longer-wavelength) side of the initial resonant wavelength (i.e., B), the output power increases linearly with the input power at a low power level. When the input power is high enough to induce a resonant peak redshift, the output power dips and then increases rapidly, resulting in a radial basis NAF, as shown in Fig. 2.
Fig. 2(c) illustrates the transmission spectrum at the drop port of the MRR when the input optical power is increased. First, the input wavelength is set at the initial resonant wavelength (i.e., A). When the input power increases from zero, the output power increases linearly with the input power. When the input power is high enough to induce a resonant peak redshift, the increase of the output power slows down. As a result, a clamped ReLU NAF is realized at the drop port, as shown in Fig. 2(d) (line A). When the input wavelength is set on the right side of the initial resonant wavelength (i.e., B), the output power first increases linearly with the input power at a low power level, followed by a rapid increase when the resonant peak redshift occurs. When the input power keeps increasing, the increase of the output power slows down because the resonant peak shifts away from the input wavelength. Consequently, a sigmoid NAF is realized, as shown in Fig. 2.
The TO coefficient of silicon is relatively large, ∂n/∂T = 1.86 × 10⁻⁴ K⁻¹. Notably, in a silicon MRR, the light wave is confined in a small footprint, which leads to a low nonlinear excitation threshold. Therefore, a NAF unit with an ultra-compact footprint and an ultralow threshold can be realized based on the silicon MRR structure [37], [38], [39], [40].
Fig. 3(c) shows the measured transmission spectrum of this MRR. The free spectral range (FSR) of the MRR is measured to be 9.5 nm. Fig. 3(d) shows the zoom-in view of the resonance mode near 1552.116 nm.
The MRR has a 3-dB bandwidth of 32 pm, a Q-factor of 48000, and an extinction ratio of 11 dB. The insertion loss of the device is measured to be ∼12 dB, which includes the fiber-to-fiber I/O coupling loss.
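The operating principle described in this section can be illustrated with a minimal steady-state sketch. It is not the authors' model: it assumes a Lorentzian resonance whose redshift is proportional to the optical power coupled into the ring, and it solves the resulting self-consistent equation by bisection. The 32-pm bandwidth and 11-dB extinction ratio are taken from the text, whereas `ETA_PM_PER_MW` and `DROP_PEAK` are hypothetical values chosen only to make the curve shapes visible.

```python
FWHM_PM = 32.0         # 3-dB bandwidth quoted in the text
EXTINCTION_DB = 11.0   # through-port extinction ratio quoted in the text
ETA_PM_PER_MW = 100.0  # hypothetical: redshift per mW of fully on-resonance input
DROP_PEAK = 0.5        # hypothetical: peak drop-port power transmission


def lorentzian(offset_pm):
    """Normalized resonance line shape versus offset from the resonance."""
    return 1.0 / (1.0 + (2.0 * offset_pm / FWHM_PM) ** 2)


def thermal_shift(p_in_mw, detuning_pm):
    """Solve s = eta * P * L(detuning - s) for the redshift s (in pm).

    Bisection on [0, eta * P]; intended for small detunings, where the
    equation has a single solution (no bistability).
    """
    lo, hi = 0.0, ETA_PM_PER_MW * p_in_mw
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        residual = mid - ETA_PM_PER_MW * p_in_mw * lorentzian(detuning_pm - mid)
        if residual < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def outputs(p_in_mw, detuning_pm):
    """Return (through, drop) output power in mW for a given input power."""
    shift = thermal_shift(p_in_mw, detuning_pm)
    line = lorentzian(detuning_pm - shift)
    dip = 1.0 - 10.0 ** (-EXTINCTION_DB / 10.0)  # depth of the through-port notch
    through = p_in_mw * (1.0 - dip * line)
    drop = p_in_mw * DROP_PEAK * line
    return through, drop


if __name__ == "__main__":
    for p_mw in (0.01, 0.06, 0.12, 0.5, 2.0):
        print(p_mw, outputs(p_mw, 0.0), outputs(p_mw, 12.0))
```

In this simplified picture, zero detuning gives a through-port output that grows faster than linearly (softplus-like) and a drop-port output that saturates (clamped-ReLU-like), while a small red detuning produces a dip at the through port and an S-shaped curve at the drop port. For larger detunings the self-consistent equation can have several solutions, which is the bistable regime, and the bisection then returns only one of them.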

III. EXPERIMENTAL RESULTS
The parameters and performance of these five MRRs, including the radius, the coupling gap, the 3-dB bandwidth, the extinction ratio, and the Q factor, are shown in Table I. As can be seen, the Q factor increases linearly with the coupling gap. In this work, the MRR with a 3-dB bandwidth of 32 pm, a Q factor of 48000, and an extinction ratio of 11 dB is used to implement NAFs.
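As a quick consistency check on these figures, the Q factor can be estimated from the standard relation Q = λ₀/Δλ, using the resonance wavelength near 1552.116 nm and the 32-pm bandwidth. The finesse (FSR divided by bandwidth) is not quoted in the paper and is computed here only as an additional illustration; the function names are hypothetical.

```python
def q_factor(resonance_nm, fwhm_pm):
    """Loaded Q factor from resonance wavelength and 3-dB bandwidth."""
    return resonance_nm / (fwhm_pm * 1e-3)


def finesse(fsr_nm, fwhm_pm):
    """Finesse as the ratio of free spectral range to 3-dB bandwidth."""
    return fsr_nm / (fwhm_pm * 1e-3)


q = q_factor(1552.116, 32.0)  # about 4.85e4, consistent with the quoted 48000
f = finesse(9.5, 32.0)        # about 297
print(round(q), round(f))
```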
The transmission spectra at the through port and the drop port of the MRR are measured using a tunable laser source (TLS). In the experiment, when the wavelength of the laser source is set at the initial resonant wavelength, the transmission spectrum of the MRR is redshifted due to the TO effect. Fig. 4(a) and (b) show the measured transmission spectra at the through port and the drop port of the MRR when the optical power is increased from −10 dBm to 5 dBm.
When the input light wavelength is set at 1552.144 nm, which is far from the initial resonant wavelength (i.e., 1552.116 nm), optical bistability can be observed. As shown in Fig. 4(c), the hysteresis loop presents the relationship between the output power and the input power at a 28-pm wavelength detuning when the input power is swept in two opposite directions. From the positions of the sharp jumps, we extract the bistability boundary of the input power, i.e., the input power below which optical bistability is avoided. Fig. 4(d) shows the relationship between the wavelength detuning and the power threshold of bistability. As shown in Fig. 4(d), when the wavelength detuning is smaller than 16 pm, the increasing threshold matches well with the decreasing threshold, and the curve of the output power remains the same whether the input power is increased or decreased. In this case, the effect of optical bistability on the reliability of the NAFs is negligible.
Fig. 5(a) and (b) show the experimentally measured NAFs at the through port of the MRR. At the initial resonant wavelength, when the input power is at a low level, the output power first increases linearly with the input power since there is almost no redshift. As the input power continues to increase, the resonant peak starts to redshift, resulting in a rapid increase of the output power. Fig. 5(a) shows that a softplus NAF is realized.
When the output power increases nonlinearly with the input power, the nonlinear power threshold can be extracted. As shown in the inset of Fig. 5(a), the nonlinear power threshold is measured to be 0.13 mW. When the input wavelength is set to be a bit redshifted from the initial resonant wavelength, the output power first increases with input power at a low power level. Then the output power goes through a drop followed by a rapid increase, which is caused by the redshift of the resonant peak. As a result, a radial basis NAF is realized, as shown in Fig. 5(b).
The experimentally measured NAFs at the drop port of the MRR are shown in Fig. 5(c) and (d). When the input wavelength is set at the initial resonant wavelength, the output power first increases linearly with the input power at a low power level. As the input power keeps increasing, the increase of the output power slows down due to the redshift of the resonant peak. A clamped ReLU NAF is realized, as shown in Fig. 5(c). As given in the inset of Fig. 5(c), the nonlinear power threshold is measured to be 0.08 mW. When the input wavelength is set to be slightly redshifted from the initial resonant wavelength, the output power increases linearly with the input power at a low power level. Then the output power goes through a rapid increase due to the redshift of the resonant peak. When the input power keeps increasing, the increase of the output power slows down because the resonant peak shifts away from the input wavelength. As a result, a sigmoid NAF is realized, as shown in Fig. 5(d).
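The text does not state the numerical criterion used to read the nonlinear power threshold from the measured curves. One plausible approach, sketched below purely as an assumption, is to fit a line through the origin to the low-power points and report the first input power at which the output deviates from that line by more than a chosen tolerance. The helper names, the 5% tolerance, and the synthetic curve are all hypothetical; no measured data are used.

```python
def linear_slope(powers, outputs, n_fit):
    """Least-squares slope of a line through the origin over the first n_fit points."""
    num = sum(p * o for p, o in zip(powers[:n_fit], outputs[:n_fit]))
    den = sum(p * p for p in powers[:n_fit])
    return num / den


def nonlinear_threshold(powers, outputs, n_fit=5, tolerance=0.05):
    """First input power whose output departs from the linear fit by > tolerance."""
    slope = linear_slope(powers, outputs, n_fit)
    for p, o in zip(powers, outputs):
        expected = slope * p
        if p > 0 and abs(o - expected) > tolerance * expected:
            return p
    return None  # the curve stays linear over the whole sweep


# Synthetic clamped-ReLU-like curve (placeholder, not measured data):
# slope 0.5 up to a knee at 0.10 mW, slope 0.2 afterwards.
powers = [k / 100 for k in range(1, 41)]
outputs = [0.5 * p if k <= 10 else 0.05 + 0.2 * (p - 0.10)
           for k, p in zip(range(1, 41), powers)]

threshold = nonlinear_threshold(powers, outputs)
print(threshold)  # 0.11 for this synthetic curve
```

The extracted value depends on the tolerance and on how many points are used for the linear fit, so any comparison between devices should keep both fixed.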
To evaluate the effect of the Q factor on the implementation of NAFs, the five fabricated MRRs with different Q factors are used to perform NAFs. As shown in Fig. 6, when the input power increases, the output power at the drop port of the MRR increases nonlinearly. The nonlinear thresholds of these five MRRs are marked in Fig. 6; their values are 2.95 mW, 1.25 mW, 0.28 mW, 0.18 mW, and 0.08 mW, respectively. As can be seen, the nonlinear power threshold decreases with increasing Q factor of the MRR.
The dynamic response of the all-optical NAFs is measured based on the experimental setup shown in Fig. 7(a). A continuous-wave (CW) optical signal generated by a TLS is launched into an intensity modulator (IM). Electrical signals generated by an arbitrary waveform generator (AWG) are injected into the IM. The modulated optical signal is fed into the MRR and then detected by a photodetector (PD). The electrical signal generated by the PD is measured by an oscilloscope (OSC). First, a sawtooth signal is sent into the IM and the output signal of the MRR is measured at its through port. The output signal of the modulator is also measured as a reference signal. The output signals, shown in Fig. 7(b), indicate that the NAF measured in a dynamic state agrees well with that measured in a static state, as given in Fig. 5(a). Furthermore, to measure the response time accurately, a square-wave signal is sent into the device, and the output signals are shown in Fig. 7(c). The rise and fall times are measured to be ∼12 μs, which is the time that the output signal takes to become stable when the input optical power is changed. When the input wavelength is set 16 pm redshifted from the initial resonant wavelength and a sawtooth signal is applied to the MRR, the nonlinear response at the through port is shown in Fig. 7(d). As can be seen, the response curve is in good agreement with the result given in Fig. 5(b). As shown in Fig. 7(e), the measured nonlinear response at the drop port also agrees well with the measured NAF given in Fig. 5(c).
Finally, an MNIST handwritten digit classification task is simulated to verify the applicability of the realized NAFs. The four measured optical NAFs are used as the NAFs in a convolutional neural network (CNN) and are termed OAF1, OAF2, OAF3, and OAF4, respectively, as shown in Fig. 8. For OAF1 and OAF3, the input optical wavelength is set at the initial resonant wavelength, while for OAF2 and OAF4, the input optical wavelength is detuned from the initial resonant wavelength by 12 pm. Images of handwritten digits with 28 × 28 pixels are fed into the CNN with two convolutional layers and one fully-connected layer, as shown in Fig. 8(a). A convolutional layer includes a convolution operation, a nonlinear activation operation, and a max-pooling operation. The output elements of the fully-connected layer are normalized to represent the probabilities of digits 0 to 9. As shown in Fig. 8(b), a linear function is adopted for comparison with the OAFs. It should be noted that the weight and bias of each neuron in the CNN are set to be non-negative, since optical intensity cannot take negative values. We perform the simulation on the PyTorch platform, and the Adam optimizer is used to train the model with a learning rate of 0.001.

IV. DISCUSSION
Table II shows a comprehensive comparison between different all-optical NAF generation approaches based on integrated platforms. Compared with the MZI-based structures and the Ge/Si-based structures, the proposed add-drop MRR has a smaller footprint, a lower nonlinear threshold, and a lower fabrication complexity. In order to tune the parameters of the NAFs, a planar directional coupler structure [41] or a TO-tuned MZI [42], [43], [44], [45] can be integrated in the MRR, so that the coupling coefficient of the MRR can be tuned flexibly. By tuning the coupling coefficient, the Q factor of the MRR can be changed, and the threshold of the NAFs can be changed accordingly.
Moreover, the NAFs will also have a much steeper shape when the Q factor of the MRR is increased.
In the experiment, the TO effect (mainly induced by FCA and TPA) of a silicon photonic MRR is used to realize the optical NAFs. The all-optical nonlinear response time of the MRR is measured to be 12 μs, which corresponds to a data rate of over 80 kbps. In the future, a much higher data rate of up to a few Gbps can be realized by improving the response speed of the MRR with the help of other nonlinear effects, such as the FCD and Kerr effects [29], [30]. Meanwhile, self-pulsation is observed in the experiment when the input optical power is high. The NAFs realized in the experiment are all cut off by self-pulsation. Self-pulsation could be further suppressed in the MRR by free-carrier depletion [46].
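The data rate quoted above follows from simple arithmetic if one assumes, as a rough rule of thumb that the paper does not spell out, that one symbol can be processed per response time. The function name below is hypothetical.

```python
def symbol_rate_hz(response_time_s):
    """Upper bound on the symbol rate, assuming one symbol per response time."""
    return 1.0 / response_time_s


rate = symbol_rate_hz(12e-6)  # roughly 83 kHz, i.e. just over 80 kbps at 1 bit/symbol
print(f"{rate / 1e3:.1f} kHz")
```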

V. CONCLUSION
In this work, we propose and experimentally demonstrate a photonic method to implement reconfigurable and low-threshold all-optical NAFs based on an ultra-compact silicon add-drop MRR. Thanks to the strong TO effect of the MRR, four different types of NAFs are realized by controlling the input wavelength detuning, including a softplus function and a radial basis function at the through port, and a clamped ReLU function and a sigmoid function at the drop port. The nonlinear power threshold is as low as 0.08 mW. Moreover, a handwritten digit classification task is simulated in a CNN with the experimentally measured optical NAFs, and an accuracy of 98% is obtained. As a purely optical operation unit, the proposed add-drop MRR is promising to be widely used in large-scale integrated ONNs as a reconfigurable nonlinear unit.