Scalable and Fast Optical Circuit Switch Based on Colorless Coherent Detection: Design Principle and Experimental Demonstration

We investigate a large-scale and fast optical circuit switch system that uses digital coherent technologies. The achievable port count is expanded by introducing colorless detection, which eliminates channel-selecting filters at receivers. Wavelength selection is performed using a shared local oscillator (LO) bank, which is configured by combining a multi-wavelength optical source with Silicon-photonic tunable filters (TFs). Microsecond switching times are realized with the thermo-optic effect available with the Silicon-photonic TF. Design optimization of the proposed system must handle many degrees of freedom, i.e., available wavelength number, space switch port count, optical amplifier location and gain, and device loss. In addition, colorless coherent detection induces penalties, which further complicates switch scale assessment. In this paper, we develop a simulator to clarify how those parameters affect switch performance; the switch throughput is maximized with consideration of the degradation stemming from colorless detection. The design performance accurately matches results measured in various system scenarios. A dual-carrier 256-Gb/s DP-QPSK experiment successfully demonstrates a 1,856 $ \times $ 1,856 optical switch system with switching time under 3.52 μs. The resulting total throughput is 441 Tb/s assuming 7%-overhead HD-FEC. To the best of our knowledge, this is the first demonstration of 400-Tbps class large bandwidth optical switches that use TFs as wavelength selective devices for coherent detection.


I. INTRODUCTION
EMERGING applications such as cloud computing and big-data analysis are driving the explosive growth of datacenter traffic. The annual global data center related traffic is expected to reach 20.6 zettabytes by the end of 2021 given the compound annual growth rate (CAGR) of 25% from 2016 to 2021 [1]. More than 75% of the intra-data center traffic is associated with the east-west traffic inside data centers [2], which necessitates large-bandwidth intra-data center networks. The continued explosion in intra-data center traffic will be driven by the increase in machine learning and artificial intelligence related applications. Modern data centers rely on hierarchical electrical packet switching (EPS) networks which interconnect servers across top-of-rack (ToR), aggregation (Leaf), and core (Spine) switches [3], [4]. Advantages of multistage switching include fault tolerance and network scalability. However, present intra-data center networks face the bandwidth crunch and power consumption limitation of electrical switches. One solution is the introduction of optical circuit switches in combination with electrical packet switches [5]-[9]. The bandwidth and power barriers can be broken by offloading large flows from electrical switches to optical switches. In creating a hybrid switching system, scalable and fast optical switches are critical since they realize flat networks with minimal latency and fewer optical interconnections. Compared to the current multi-tier EPS networks, introducing large-port-count optical switches can substantially reduce (∼75%) transponders/fiber-links and switch power consumption [10].
Many kinds of optical switches have been demonstrated so far based on technologies such as microelectromechanical system (MEMS) [5], [11] and semiconductor optical amplifiers (SOAs) [12], [13]. However, their port counts and/or switching speeds are insufficient to support hyperscale data centers. To resolve this, we have developed an optical switch architecture that combines the two independent dimensions of space and wavelength for realizing high-port-count and fast connectivity. The total switch port count is yielded by the product of the available number of wavelengths (∼100) and space switch port count, and hence even using small (∼16) space switches creates large port counts (1600). The switching time is constrained by the tuning speed of the wavelength-selective devices. Two distinct approaches to implementing wavelength routing have been identified: wavelength tuning at the transmitter side [14], [18] or the receiver side [15]-[17]. Wavelength-tunable transmitters need fast wavelength tunable laser diodes (TLDs), but the availability and reliability of such lasers remains problematic at present. Fortunately, wavelength-tunable receivers can be implemented with tunable filters (TFs) which are simpler and more reliable devices than TLDs. We therefore focus on TF-based optical switches in this paper.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Coherent technologies will play a key role in increasing the port count and bandwidth of optical switching networks. They are now being deployed in inter-data center networks and substantial year-on-year cost reductions are available [19], which warrants their eventual application to intra-data center networks. The hardware complexity of digital signal processing (DSP) can be significantly reduced from long-haul applications due to minimal chromatic/polarization-mode dispersion compensation needed, and coherent technology will soon meet the datacenter power and density requirements [20]. The technical difficulty of using digital coherent transceivers for optical switching is the long convergence time taken to complete receiver equalization. The convergence time can be shortened by the use of burst-mode DSP based on a pilot-aided method. Recent works have demonstrated 816-ns burst-mode reception in a 100-Gb/s/λ coherent time division multiplexing passive optical network (TDM-PON) [21] and convergence of a few nanoseconds for high-order modulation (DP-128/256/512-QAM) bursts [22]. Burst-mode reception techniques enable the application of coherent technologies to fast optical switching networks for intra-data centers. An important attribute of coherent switch systems is their colorless detection capability which removes the TF set in front of the receiver in non-coherent systems. The elimination of TF loss improves receiver sensitivity, resulting in enhanced achievable switch port count. However, the configuration requires a fast-tunable local oscillator (LO) at the receiver for wavelength selection. We propose here to apply cost-effective LOs by replacing the TLDs with a wavelength bank and Silicon-photonic TFs.
Optical switch performance is determined by the complex interaction of various parameters, such as available wavelength number, space switch port count, optical amplifier gain, and splitter port count. Erbium-doped fiber amplifiers (EDFAs) are relatively expensive devices and should be shared by as many ports as possible. The gain (cost), sharing number, and available switch port count are mutually related. The parameter values need to be optimized considering the tradeoffs between system scalability and cost.
In this paper, we propose a large-throughput and fast optical circuit switch that uses a Silicon-photonic TF-based LO bank to realize colorless coherent detection. In Section II, we describe the principle of our proposed switch and its configuration. Section III derives an analytical model to quantify receiver performance for both colored (TF in front of the receiver) and colorless configurations. Using the developed model, the effectiveness of colorless detection is quantitatively evaluated. Based on the model, in Section IV, a simulator is developed for evaluating the port count of the colorless optical switch under reasonable scenarios. The achievable port counts are quantified via extensive numerical simulations that explore the interplay between modulation formats, EDFA gain, and EDFA sharing number. In Section V, we conduct transmission experiments to verify the analytical results; the feasibility of 1856 × 1856 optical switching with switching times under 3.52 μs and total throughput of 475.1 Tbps is verified in a 256-Gbps dual-carrier dual-polarization quadrature phase shift keying (DP-QPSK) experiment using cost effective EDFAs (saturation power of 14.2 dBm). Finally, Section VI concludes this paper. A part of the experimental results has been presented at an international conference [23].

II. LARGE-SCALE OPTICAL CIRCUIT SWITCH
The optical switch offers future-proof large-bandwidth switching capability thanks to its inherent transparency as regards bit rate and modulation format. Fig. 1 shows a generic configuration of an electrical and optical hybrid switching network using large-scale optical switches. Since each high-port-count switch can interconnect all the ToRs or PODs (points of delivery) in a data center building/floor, the optical switch network creates a single-tier network that eliminates most (e.g., 80-90%) of the multi-stage electrical switches [9]. Here, the number of parallel optical switches usually matches the number of ToR (POD) uplinks connected to the optical switch network [9], [24]. Optical switches and ToRs are connected to a controller via a control channel (in-band or out of band). How optical switches can simplify the overall data center network is elaborated in Ref. [10]. The flattened network produces various advantages in addition to the aforementioned substantial reduction in the number of optical transponders and interconnection links needed. The latency and control burden are reduced by the single-hop connection between any communicating node pairs without queuing delay. It is also worth noting that high reliability and utilization are realized without network-wide congestion control by making the best use of optical switch parallelism and field-programmable gate array (FPGA)-based simple connection set-up control [24]. Thus, developing a large-scale and high-speed optical switch is critical if optical switching technologies are to be cost-effectively deployed in data centers. In this section, we explain the concept and architectures of our developed high-port-count and fast optical switches. The key parameters that determine switch performance or available port counts are also explained.

A. Optical Switch Architectures
Optical switching technologies have already been widely deployed in the present core and metro networks as reconfigurable optical add-drop multiplexers/optical cross-connects (ROADMs/OXCs). The basic configuration was first demonstrated in the early 1990s [25] by creating an OXC (the so-called route-&-combine configuration; WSSs at incoming fiber side and optical couplers at outgoing fiber side in the express switch part). This has been widely adopted for commercial ROADM/OXC systems as depicted in Fig. 2(a), most of which utilize the broadcast-&-select configuration (optical splitters and WSSs are placed in the reverse order to the route-&-combine configuration). The wavelength-routing mechanism is based on TFs [see add part of Fig. 2(a)] or TLDs [see drop part of Fig. 2(a)], which is reflected in our two "mirrored-twin like" types of optical switch architectures for data center applications [9] as depicted in Figs. 2(b) and 2(c). One of the major performance differences between ROADMs and data center optical switches is the switching latency requirement. Switching can be as slow as the order of seconds for ROADMs (to support optical path protection mechanisms it should be about 10 ms), while our present target is less than 10 μs for data center application. The wavelength selective switch (WSS) is a very sophisticated device, and relies on 3D-MEMS or liquid crystal on silicon (LCoS) technologies. Its switching speed including careful beam-steering control during transition period to avoid crosstalk is rather slow (∼10 seconds), which prevents us from using it for intra-data center networks. In addition to this, cost efficiency and scalability are important goals to be met. Accordingly, we rely on Silicon-photonic technologies for switch and TF components as detailed in Section II-B.
In addition to the switch port count, the modulation scheme is a significant determiner of total switch throughput. With regard to Figs. 2(b) and 2(c), Table I summarizes previous studies on different wavelength tunability and modulation formats. It is demonstrated that with the architectures shown in Figs. 2(b) and 2(c), the number of switch inputs and outputs can be scaled up to over 1000 ports. Intensity modulation and direct detection (IM/DD) makes it difficult to attain throughputs above 100 Tbps, although its simplicity eases implementation concerns. Increased port count and throughput are provided by coherent technologies as they allow multi-level modulation and high receiver sensitivity. A recent TLD-based coherent switch demonstrated 182-Tbps throughput bandwidth with the switching time of 185 μs [18]. The drawback of the architecture shown in Fig. 2(c) is the insufficient switching speed of the TLDs. They are also immature in terms of reliability and cost, so we focus on the TF type architecture [Fig. 2(b)] hereafter. Silicon-photonic TFs provide fast wavelength tuning in a simple and reliable manner, and future cost-effectiveness is expected [26], [27]. However, a TF-based switch using coherent detection needs tunable LOs at the receiver side, which negates the major benefit of non-reliance on TLDs. Self-coherent technology such as differential QPSK can eliminate LOs [17], but its receiver sensitivity, and hence available port count, is lower than that of coherent detection. Considering this, we introduce here a TF-based optical switch with a shared LO bank that is formed by combining a fixed wavelength bank and TFs. In coherent detection, the system needs careful design to take full advantage of colorless detection (see Section III). This work aims at expanding the throughput and port count simultaneously compared with previous configurations [14]-[18]. The available switch throughput is analyzed in Section IV.

B. Proposed Colorless Optical Switch
[...] it into a coherent receiver to which a wavelength tuned LO is added. LO wavelength tuning is done by selecting one of the N wavelengths in the group. The shared LO bank provides the LO light by extracting one wavelength using a Silicon-photonic TF from an N-wavelength LD bank (LDB). With this configuration, N LDs are shared by MN ports, and thus cost-effective implementation will be possible. Finally, the received signal is demodulated through DSP. In this way, fast-tunable coherent detection is realized for creating large-scale optical switches without relying on TLDs. Please note that transceivers (Tx#k and Rx#k) in Fig. 3 can be located in ToRs. With this implementation, one-hop optical connection is available between each source ToR and destination ToR through an optical switch. The transceivers can be set at an optical switch, which increases the number of optical hops, but enables the use of conventional [non-coherent vertical cavity surface emitting laser (VCSEL) type] transceivers for ToRs. Implementation details are out of the scope of this paper.
Regarding LO bank implementation, different configurations are possible. Another example is to configure one LO bank per cluster of ToRs, where an LO bank is shared by multiple (P) ToR switches each hosting, for example, several dozen (Q) Tx/Rx. With this implementation, N LDs in the LO bank can be shared by P × Q (more than several hundred) Rxs, although short reach connections between an LO bank and ToRs are needed. The optimum configuration can be determined after considering the costs of LO bank, cabling, etc. The LO bank control will then be done by the optical network controller via one of the ToR switches, since each ToR hosts a control channel (in-band or out of band anyway) from/to the optical switch controller. Some details of the control network configuration, which is out of the scope of this paper, are presented in Ref. [24].
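The port-count and LO-sharing arithmetic described above can be sketched in a few lines. This is a minimal illustration only: the function names are hypothetical, and the example values (N = 100, M = 16, and N = 58, M = 32) are the ones quoted in the text.

```python
def total_port_count(num_wavelengths: int, space_switch_ports: int) -> int:
    """Total switch ports = wavelengths per group (N) x space-switch ports (M)."""
    if num_wavelengths < 1 or space_switch_ports < 1:
        raise ValueError("N and M must be positive integers")
    return num_wavelengths * space_switch_ports


def receivers_per_lo_laser(num_receivers: int, num_lo_lasers: int) -> float:
    """Average number of receivers served by each LD of a shared LO bank."""
    if num_lo_lasers < 1:
        raise ValueError("the LO bank needs at least one LD")
    return num_receivers / num_lo_lasers


if __name__ == "__main__":
    # Introductory example: about 100 wavelengths and a 16-port space switch.
    print(total_port_count(100, 16))
    # Experimental configuration: 58 wavelengths, M = 32.
    ports = total_port_count(58, 32)
    print(ports, receivers_per_lo_laser(ports, 58))
```

With one LO bank per switch, each of the N LDs serves M receivers on average; in the per-cluster variant the same helper applies with P × Q receivers in place of MN.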
To evaluate switch scalability, systematic dimensioning is necessary because various parameters such as the available wavelength number $N$, MCS port count $M$, EDFA saturation power $P_E$, EDFA sharing number $S$, and each device loss must be considered, and several have complex trade-off relationships. The available wavelength number ($N$) directly determines the maximum port count, and depends on the wavelength band used (such as the O, S, C, and L bands) and the channel spacing. The total port count can be expanded by making use of multiple bands (e.g., C + L band). However, this work assumes the use of just the C band since i) it is the resource most commonly utilized for wavelength division multiplexing (WDM) networks so cost-effective EDFAs are available, ii) increasing $N$ necessitates receiver-side TFs to prevent receiver power saturation, and iii) even the single band can achieve thousands of ports at 256 Gbps (see Section IV). When we assume an 18-dB receiver dynamic range according to the OIF specification [28], $N \le 60$ will be possible, as discussed later. Of course, if we apply port-by-port implementation of the $M \times M$ MCS function, virtually no limit will be imposed on $M$. Based on the above discussion, the following analyses focus on $M$ values of 64 or less. Our proposed switch architecture is functionally similar to the present ROADM as explained in Section II-A. One critical point to be noted is the stringent cost-effectiveness required for the intra-data center application. As a result, EDFA saturation power and the number of ports sharing each EDFA ($S$ in Fig. 3) are important parameters, since an optical amplifier is a relatively expensive component to add to a switch. On top of this, EDFAs need thermo-electric coolers (TECs) to attain high saturation power (>18 dBm). When the power is low enough, EDFAs with a shared pump LD can be applied.
We therefore develop a simulator that can clarify the best combination of parameters by assessing suitable arrangements for a variety of future data center requirements such as link speed, number of racks, scalability, and so on. The analysis details are given in the next section.

III. COLORLESS RECEIVER
Direct detection of a WDM signal requires an optical demultiplexer/filter in front of the receiver. A filterless receiver can be based on coherent detection by tuning LO wavelength to the desired signal. Colorless coherent detection can be instrumental in improving receiver sensitivity thanks to the removal of the filter loss and lower costs/size [30], [31]; however, the system performance can be degraded by receiver saturation at WDM reception. For current colorless/directionless/contentionless (CDC) ROADMs using coherent transmission, TFs such as MEMS-based ones are commonly used for maximizing repeater spacing (transparent transmission distance) and the number of WDM channels. For data center optical switching applications, a stringent requirement is the tuning speed (<10 μs) of optical filters, and hence we cannot rely on present slow tuning (>10 ms) devices. A Silicon-photonic fast-tunable filter is a promising candidate, but the loss tends to be larger than that with MEMS technology [32]. Much shorter distance than core/metro networks is targeted for data center application (e.g., ≤2 km), but the loss before the receiver significantly affects the available port count due to the adopted switch configuration. For this reason, colorless detection (no TF in front of the receiver) is expected to expand the optical switch scale. So far, no comprehensive quantitative comparisons between filtered and colorless receivers have been conducted, and the effectiveness of colorless detection in terms of switch port count has not been clarified. To this end, in this section we derive an analytical model to quantify receiver performance for both configurations.

A. Receiver Description
Fig. 4 presents the preamplifier receiver model assumed in the analytical comparison of conventional (i.e., filtered) and colorless coherent detection. The receiver preamplifier amplifies the incoming signal. In the conventional receiver, an optical TF is inserted in front of the optical amplifier for wavelength selection. The amplified signal is split into two polarization components with polarization beam splitters (PBSs). The two orthogonal polarizations are mixed with an LO light in two 90° optical hybrids. The beat between the LO light and signal in each polarization state is detected by balanced photodiodes (BPDs). Frequency offset between the reference light and the channel of interest is assumed to be much less than the baseband bandwidth for intradyne detection. From the receiver output we can obtain the photocurrents $I_{XI}(t)$, $I_{XQ}(t)$, $I_{YI}(t)$, and $I_{YQ}(t)$ with respect to time $t$ from the four BPDs. By using the four tributaries, the complex amplitudes of the two polarization components are reconstructed as $I_X(t) = I_{XI}(t) + jI_{XQ}(t)$ and $I_Y(t) = I_{YI}(t) + jI_{YQ}(t)$, where $j = \sqrt{-1}$ is the imaginary unit.
The received signal is degraded by the amplitude and phase fluctuations induced by amplified spontaneous emission (ASE), thermal, shot, and LO relative intensity noise (RIN). Additional noise is created by self-beat interferences from out-of-band (OOB) channels in the colorless receiver. This originates from power imbalance and skew mismatch between the two inputs in each of the four BPDs. If these noise contributions are assumed to be Gaussian random processes with zero mean, the signal quality can be analyzed according to a known receiver design guideline [30]. The signal-to-noise ratio (SNR) at the output of the colorless receiver is expressed using the signal photocurrent $I(t)$ and five noise variances, where $\sigma_{\mathrm{ASE}}^2$, $\sigma_{\mathrm{th}}^2$, and $\sigma_{\mathrm{sh}}^2$ are the ASE, thermal, and shot noise variances, respectively, $\sigma_{\mathrm{RIN}}^2$ is the beat noise variance between the LO light and RIN from an LO laser, and $\sigma_{\mathrm{oob}}^2$ is the self-beat noise variance from OOB channels. The mean square of the detected total photocurrent is approximated in terms of $P_{s,d}$, the signal power on the desired (target) channel after preamplification; $P_L$, the LO power; the responsivity of the photodiodes; and $L_E$, the excess loss of the 90° optical hybrid. When the dominant ASE noise is associated with the preamplifier, the noise variance related to the LO power is written in terms of $P_{\mathrm{ASE},d}$, the ASE noise power loaded on the desired channel by preamplification. The residual RIN comes from imperfect balanced detection, and the noise beating with the LO power is determined by CMRR, which stands for the power ratio of the residual common mode to differential common mode in balanced detection; $\mathrm{RIN}_{\mathrm{LO}}$, which denotes the magnitude of intensity noise from the LO laser; and $B$, the receiver bandwidth. The ability of BPDs to suppress the signals common to the two inputs is characterized by the common mode rejection ratio (CMRR).
The self-beat noise effect falling from OOB channels onto the baseband depends on $N$, the total number of WDM channels; $\gamma$, the scaling factor; and $P_{s,i}$ and $P_{\mathrm{ASE},i}$, the signal and ASE noise power of channel $i$ at the preamplifier output, respectively. The thermal and shot noise variances are determined by $i_{\mathrm{th}}$, the equivalent current density to thermal noise, and $q$, the electron charge. The conventional receiver needs extra preamplifier gain to compensate the TF loss ($L_{\mathrm{TF}}$); that is, additional ASE noise accumulates in proportion to the increased gain. By taking account of the ASE noise enhancement, we can calculate the SNR for filtered coherent detection. Note that the self-beat noise in the baseband is negligible because of the removal of OOB channels by the optical filter.
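As a reading aid, the overall structure implied by these definitions can be sketched as follows. This is an assumed generic form (mean-square signal photocurrent divided by the sum of independent zero-mean Gaussian noise variances), not necessarily the exact expressions used in the analysis. The symbol $\sigma_{\mathrm{ASE,TF}}^2$ is notation introduced here for the ASE variance after the extra gain that compensates $L_{\mathrm{TF}}$, and the signal term is assumed equal in both cases because the preamplifier restores the same output power.

```latex
\begin{align}
\mathrm{SNR}_{\mathrm{CL}} &=
  \frac{\langle |I(t)|^{2} \rangle}
       {\sigma_{\mathrm{ASE}}^{2} + \sigma_{\mathrm{th}}^{2} + \sigma_{\mathrm{sh}}^{2}
        + \sigma_{\mathrm{RIN}}^{2} + \sigma_{\mathrm{oob}}^{2}}, \\
\mathrm{SNR}_{\mathrm{TF}} &=
  \frac{\langle |I(t)|^{2} \rangle}
       {\sigma_{\mathrm{ASE,TF}}^{2} + \sigma_{\mathrm{th}}^{2} + \sigma_{\mathrm{sh}}^{2}
        + \sigma_{\mathrm{RIN}}^{2}},
\qquad \sigma_{\mathrm{ASE,TF}}^{2} \ge \sigma_{\mathrm{ASE}}^{2}.
\end{align}
```

In this reading, the colorless receiver trades the ASE enhancement caused by the TF loss for the additional self-beat term $\sigma_{\mathrm{oob}}^2$.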

B. Analytical Evaluation
To prove the capability of colorless detection, we examined the receiver performance through the analytical formulas derived in Section III-A. Table II summarizes the parameters used in the analytical calculation; they are based on published reports [30], [31]. To simulate the colorless degradation, we assumed CMRR = −19 dB and $\gamma$ = 0.1 as the front-end specifications. Another point to be noted is the performance metric used in comparing the receivers. We define the SNR gain as the ratio of $\mathrm{SNR}_{\mathrm{CL}}$ to $\mathrm{SNR}_{\mathrm{TF}}$, to elucidate the benefits of the colorless receiver. Fig. 5 shows the TF loss ($L_{\mathrm{TF}}$) dependence of the SNR gain for different received optical powers on the desired channel ($P_{s,d}$). At power levels beyond −10 dBm, the SNR gain over the conventional receiver is proportional to TF loss. This is because $\mathrm{SNR}_{\mathrm{TF}}$ linearly decreases with the increased TF loss in the ASE-noise-limited region. Even when the receiver input power becomes smaller ($P_{s,d}$ < −10 dBm), the colorless receiver performs better than the equivalent system with optical filtering. Fig. 6 plots the calculated SNR gain against the total number of wavelengths ($N$) when $L_{\mathrm{TF}}$ = 10 dB. The resultant SNR gain shrinks within the range of 3 dB as the channel number increases. The major reason for the gain reduction is the performance restriction imposed by the self-beat noise from increased OOB channels. The tradeoff between ASE and self-beat noises is governed by the preamplifier gain and the number of wavelength channels. Regardless of receiving conditions, the colorless implementation always outperforms the conventional receiver by at least 6.5 dB. As a consequence, the adoption of the colorless receiver is confirmed to realize a substantial SNR improvement, and thus switch port count expansion.

IV. SIMULATIONS
In this section, achievable optical switch port count is analyzed with the aim of optimally designing the proposed optical switch based on colorless coherent detection. A simulator is newly developed for evaluating the port count of colorless optical switches under realistic scenarios.
We begin with the numerical assessment that clarifies the impact of the SNR gain on optical switch scale. Port-count scalability of conventional and colorless receivers is compared via Monte-Carlo simulations. The tested signal was 256-Gbps DP-QPSK shaped by a root-raised-cosine (RRC) filter with a roll-off factor of 0.05. Fifty-eight wavelength channels modulated with the signal were aligned on a 75-GHz frequency grid in the full C band (4.4 THz). The optical power from the transmitter was set to 0 dBm/channel. The WDM signals were processed in the proposed optical switch where the losses of MUX, splitter, MCS chip, and MCS packaging were set at 5.0, $3.5\log_2(S)$, $3.65\log_2(M)$, and 5.1 dB, respectively. Each MCS loss was calculated based on actual values: the MZ switch loss of 0.13 dB/MZ [29] and fiber coupling loss of 1.4 dB/facet [33]. Optical noise was loaded by the EDFA (noise figure was 6 dB). The switched signal was detected with the LO light in the conventional or colorless coherent receiver. The conventional receiver is assumed to have the TF loss (i.e., $L_{\mathrm{TF}}$ = 10 dB) measured in the previously reported Silicon-photonic TF [15]. The preamplifier gain was adjusted to give an output optical power of 0 dBm for the target channel. The receiver parameters except for the LO power were the same as those listed in Table II. The output power from the LO laser was fixed at 16 dBm. System parameters relevant to the numerical simulation are summarized in Table III. The performance was evaluated in terms of the achievable port count needed to satisfy BER = $1 \times 10^{-3}$ on the central channel (1546.116 nm; ch29). The simulation result at $S$ = 1 is shown in Fig. 7, where the achievable port count is compared as a function of the EDFA saturation power ($P_E$) for different receiver configurations. The colorless system can attain a four-times larger port count than the conventional configuration with optical TF.
This is in good agreement with the SNR gain (≥ 6.5 dB) of the analytical prediction shown in Section III-B. Thus, the port-count of our optical switch system can be substantially expanded by adopting colorless coherent detection.
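The channel count and the passive loss terms quoted in the simulation setup can be tabulated with a short script. This is a sketch under stated assumptions: the function names are hypothetical, the four listed loss terms are assumed to add in dB along the signal path, and EDFA gain, preamplifier gain, and any other splitting loss are deliberately left out.

```python
import math


def channel_count(band_ghz: float, grid_ghz: float) -> int:
    """Number of whole grid slots that fit in the usable band."""
    return math.floor(band_ghz / grid_ghz)


def passive_loss_db(sharing_number: int, mcs_ports: int,
                    mux_db: float = 5.0,
                    splitter_db_per_stage: float = 3.5,
                    mcs_chip_db_per_stage: float = 3.65,
                    mcs_packaging_db: float = 5.1) -> float:
    """Sum of the loss terms listed in the text (assumed additive in dB):
    MUX + 3.5*log2(S) + 3.65*log2(M) + MCS packaging."""
    return (mux_db
            + splitter_db_per_stage * math.log2(sharing_number)
            + mcs_chip_db_per_stage * math.log2(mcs_ports)
            + mcs_packaging_db)


if __name__ == "__main__":
    # 4.4-THz C band on 75-GHz and 135-GHz grids.
    print(channel_count(4400.0, 75.0), channel_count(4400.0, 135.0))
    # Passive loss for a few (S, M) combinations.
    for s, m in [(1, 16), (1, 32), (2, 32)]:
        print(f"S={s}, M={m}: {passive_loss_db(s, m):.2f} dB")
```

Each doubling of $M$ adds 3.65 dB and each doubling of $S$ adds 3.5 dB in this budget, which is why receiver-side SNR headroom translates directly into port count.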
Various modulation formats were tested to clarify the maximum switch port counts for the optical switch systems using colorless receivers. We simulated the colorless optical switches encompassing 75/135-GHz-spaced 58/32-channel WDM signals modulated with 256-Gbps DP-QPSK, 256/512-Gbps dual-carrier DP-QPSK, and 512-Gbps DP-16QAM. The numerical simulations were carried out under the same conditions as those used for calculating Fig. 7. The target BER on the center channel was set at $1 \times 10^{-2}$ for the 16QAM signal. Fig. 8 illustrates the port-count variation calculated by changing $P_E$ and $S$ for the different signal formats. A target port count is achieved in different combinations of the EDFA saturation power and sharing number. Of course, the larger the EDFA saturation power ($P_E$) or the smaller the EDFA sharing number ($S$), the larger is the available port count. Figs. 9(a) and 9(b) show contour plots of the achievable port count for 256-Gbps and 512-Gbps link speeds, respectively. For instance, a port count of 1856 is attained with EDFA saturation powers of 18 and 24 dBm for 256-Gbps dual-carrier DP-QPSK and 512-Gbps single-carrier DP-16QAM signals, respectively, at $S$ = 2. Using practical TEC-controlled (cooled pump) EDFAs with the saturation power of 23 dBm, the maximum achievable port count for 256-Gbps link speed is 7424 for DP-QPSK signals, which yields the huge bisection signal bandwidth of 1.48 Pbps (= 200 Gbps × 7424) per single optical switch. To create a data center network, the optical switches are implemented in parallel (e.g., 64 in a single tier [24]) and the resultant total bandwidth is at least one order of magnitude larger than that of the present largest data center. We observe a reasonable penalty of 3 dB for all combinations between dual-carrier 32-Gbaud and 64-Gbaud DP-QPSK signals. These results prove that port counts of more than 1000 can be achieved by means of cost-effective EDFAs ($P_E$ ≤ 18 dBm) with uncooled-LD pumping.
These port-count analyses validate the proposed colorless switch and its design guideline.
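The throughput figures in this section follow from multiplying the port count by the per-port rate. A minimal sketch is given below; the function name is hypothetical, and the choice between the 256-Gbps line rate and the 200-Gbps net rate per port follows the values quoted in the text.

```python
def total_throughput_tbps(port_count: int, per_port_gbps: float) -> float:
    """Aggregate switch throughput in Tbps for a given per-port rate in Gbps."""
    if port_count < 0 or per_port_gbps < 0:
        raise ValueError("port count and rate must be non-negative")
    return port_count * per_port_gbps / 1000.0


if __name__ == "__main__":
    # 1856 ports at the 256-Gbps line rate.
    print(total_throughput_tbps(1856, 256.0))
    # 7424 ports at the 200-Gbps net rate used for the bisection bandwidth.
    print(total_throughput_tbps(7424, 200.0))
```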

V. EXPERIMENTS
The previous section derived achievable port counts by extensive calculations using the developed simulator. In this section, we conduct proof-of-concept experiments for validating both theoretical predictions and proposed optical switch. 1856 × 1856 optical switching experiments evaluated system performance relating to EDFA saturation power and switching time. A recently fabricated Silicon-photonic switch [34] and TF [16], [35] are embedded in the MCS and LO bank.

A. Experimental Setup
Fig. 10 illustrates the experimental setup used for demonstrating a 58-channel dual-carrier 32-Gbaud DP-QPSK signal optical switch. At the transmitter, four dual-carrier 32-Gbaud DP-QPSK signals were generated by the bulk modulation of eight continuous waves (CWs) spaced at 37.5 GHz from 1545.222 nm (ch1) to 1547.316 nm (ch8). A dual-polarization IQ modulator (IQM) was driven by an arbitrary waveform generator (AWG) operating at the sampling rate of 64 GSa/s. At the AWG, an RRC filter with roll-off factor of 0.1 was used for Nyquist pulse shaping.
The modulated signal was presented to the input of channel and polarization decorrelation stages. A 1 × 4 WSS separated the input signal into four paths with different fiber delays. The path difference between any two of the four paths was at least 100 symbols. The outputs were recombined by a 4 × 1 optical coupler before entering a variable optical attenuator (VOA) for power adjustment. The recombined signal was randomly polarized by a polarization scrambler (PS) driven at the repetition rate of 2 krad/ms. The optical spectrum of the uncorrelated sub-channels after the channel and polarization decorrelators is plotted in Fig. 11(a). The decorrelated signal was coupled with spectrally shaped ASE noise to emulate full C-band WDM channels. The noise was produced by passing an ASE light through an optical notch filter. According to the simulation conditions, the optical power from the WDM transmitter was fixed at 12.6 dBm. The transmitted WDM signal was split by a 1 × (58/$S$) splitter and amplified by an EDFA with saturation power of $P_E$. The amplified signal was further distributed by a 1 × $S$ splitter and delivered to an $M \times M$ MCS. In the experiment, the MCS was implemented by an optical splitter and the Silicon-photonic switch shown in Fig. 10(a). The 1 × $M$ selector is composed of Silicon-photonic multi-stage MZIs in a tree configuration. After the MCS, the signal output by an optical preamplifier was incident on the colorless coherent receiver. The saturation power of 17 dBm was set by assuming the use of a compact and low-cost preamplifier with uncooled-LD pump. An optical filter was placed in front of the receiver to eliminate ASE noise outside the C band. Fig. 11(b) shows an example of the received WDM spectrum covering the wavelength range from 1530 nm to 1565 nm.
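One way to read the 12.6-dBm transmitter power is as the aggregate of 58 channels at 0 dBm each after the 5.0-dB MUX loss used in the simulations. This interpretation is an assumption made here for illustration and is not stated explicitly in the text; the function names are hypothetical.

```python
import math


def aggregate_power_dbm(per_channel_dbm: float, num_channels: int) -> float:
    """Total power of equal-power WDM channels, in dBm."""
    if num_channels < 1:
        raise ValueError("at least one channel is required")
    return per_channel_dbm + 10.0 * math.log10(num_channels)


def after_loss_dbm(power_dbm: float, loss_db: float) -> float:
    """Power remaining after a passive element with the given loss."""
    return power_dbm - loss_db


if __name__ == "__main__":
    total = aggregate_power_dbm(0.0, 58)   # 58 channels at 0 dBm/channel
    launched = after_loss_dbm(total, 5.0)  # assumed 5.0-dB MUX loss
    print(f"{total:.1f} dBm before MUX, {launched:.1f} dBm after MUX")
```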
At the receiver, the incoming signal was mixed with LO light in an optical front-end. Two LDs with 16-dBm output power were used as the LO, and one of them (ch4 or ch5) was selected by the Silicon-photonic TF shown in Fig. 10(b). Wavelength tuning was performed by cascaded thermo-optically controlled asymmetric MZIs. Regarding the LO light, system performance is mostly determined by the output power of the shared LO bank, and only a marginal power penalty is expected when it is kept high (≥13 dBm) [36]. In the experiment, the LO power was set to 16 dBm, as in the numerical simulations. This LO power is achievable with the shared bank configuration, as discussed in Ref. [36]. The signal was captured by an 80-GSa/s DSO and processed offline. In the offline DSP, matched filtering was performed using an RRC filter with a roll-off factor of 0.1. The filter output was resampled to 64 GSa/s, i.e., two samples per symbol. The resampled signal was input to 15-tap butterfly-structured finite-impulse-response (FIR) filters for channel equalization and polarization demultiplexing. The filter coefficients were recursively computed by the constant-modulus algorithm (CMA), wherein the minimum estimation error was identified. Finally, the carrier frequency offset and phase noise were compensated by a Kalman filter algorithm [37]. We examined system performance by means of the BER and Q-factor calculated from the restored signals.
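The butterfly equalizer with CMA tap updates can be sketched as follows. This is a simplified illustration, not the DSP code used in the experiment. It assumes symbol-spaced (one sample per symbol) input rather than the two samples per symbol used above, a noiseless channel that only rotates the polarization state by a fixed angle, center-spike initialization, and an arbitrarily chosen step size. All names and parameter values are illustrative.

```python
import math
import random


def cma_butterfly(rx_x, rx_y, taps=15, mu=5e-3):
    """2x2 butterfly FIR equalizer adapted by CMA (target modulus 1)."""
    center = taps // 2
    h_xx = [0j] * taps
    h_xy = [0j] * taps
    h_yx = [0j] * taps
    h_yy = [0j] * taps
    h_xx[center] = h_yy[center] = 1 + 0j  # center-spike initialization
    out_x, out_y = [], []
    for n in range(taps - 1, len(rx_x)):
        # Most recent sample first, so tap k sees the sample delayed by k.
        win_x = rx_x[n - taps + 1:n + 1][::-1]
        win_y = rx_y[n - taps + 1:n + 1][::-1]
        o_x = (sum(h * v for h, v in zip(h_xx, win_x))
               + sum(h * v for h, v in zip(h_xy, win_y)))
        o_y = (sum(h * v for h, v in zip(h_yx, win_x))
               + sum(h * v for h, v in zip(h_yy, win_y)))
        # CMA error: deviation of the output power from the unit modulus.
        err_x = 1.0 - abs(o_x) ** 2
        err_y = 1.0 - abs(o_y) ** 2
        # Stochastic-gradient update of all four FIR filters.
        for k in range(taps):
            conj_x = win_x[k].conjugate()
            conj_y = win_y[k].conjugate()
            h_xx[k] += mu * err_x * o_x * conj_x
            h_xy[k] += mu * err_x * o_x * conj_y
            h_yx[k] += mu * err_y * o_y * conj_x
            h_yy[k] += mu * err_y * o_y * conj_y
        out_x.append(o_x)
        out_y.append(o_y)
    return out_x, out_y


def cma_cost(samples):
    """Mean squared deviation from the unit modulus."""
    return sum((1.0 - abs(v) ** 2) ** 2 for v in samples) / len(samples)


# Synthetic test signal: two unit-power QPSK streams mixed by a
# polarization rotation of 0.5 rad (assumed value, no noise).
random.seed(1)
qpsk = [complex(i, q) / math.sqrt(2) for i in (-1, 1) for q in (-1, 1)]
n_sym = 4000
tx_a = [random.choice(qpsk) for _ in range(n_sym)]
tx_b = [random.choice(qpsk) for _ in range(n_sym)]
theta = 0.5
rx_x = [math.cos(theta) * a + math.sin(theta) * b for a, b in zip(tx_a, tx_b)]
rx_y = [-math.sin(theta) * a + math.cos(theta) * b for a, b in zip(tx_a, tx_b)]

eq_x, eq_y = cma_butterfly(rx_x, rx_y)
head_cost = cma_cost(eq_x[:500])
tail_cost = cma_cost(eq_x[-500:])
print(f"CMA cost, first 500 symbols: {head_cost:.4f}")
print(f"CMA cost, last 500 symbols:  {tail_cost:.6f}")
```

CMA is blind to the carrier phase, so the equalized symbols still carry a phase offset. In the receiver chain above this is handled by the subsequent carrier recovery stage.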

B. Switching Performances
Figs. 12(a) and 12(b) plot the measured BER versus EDFA saturation power ($P_E$) for different sharing numbers (S) in the 928 × 928 (M = 16) and 1856 × 1856 (M = 32) optical switches, respectively. The test signal was set on the fourth wavelength (1546.119 nm; ch4). In the 928 × 928 (M = 16) optical switch, the attainable saturation powers were found to 2, and 24.6 dBm, respectively. As can be seen in Fig. 9(a), the resulting saturation powers agree well with the simulated ones. The slight difference (≤1 dB) is mainly due to deviation from the assumed penalty at the receiver front-end. The experiments confirm that colorless coherent detection is effective for port-count expansion, as predicted by the numerical simulations.
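When reading BER curves such as these, it can help to convert a BER value into the equivalent Q-factor. The sketch below uses the standard Gaussian-noise relation BER = 0.5 erfc(Q/√2). It is only a reading aid: this section does not state how the Q-factor was derived from the restored signals, and the value 3.8e-3 is a commonly quoted pre-FEC BER limit for 7%-overhead HD-FEC, assumed here for illustration rather than taken from the experiment.

```python
import math
from statistics import NormalDist


def q_factor_db(ber):
    """Q-factor in dB equivalent to `ber` under the Gaussian-noise relation.

    BER = 0.5 * erfc(Q / sqrt(2)) is the standard normal tail probability
    at Q, so Q = -inv_cdf(BER).
    """
    if not 0.0 < ber < 0.5:
        raise ValueError("BER must lie strictly between 0 and 0.5")
    q_linear = -NormalDist().inv_cdf(ber)
    return 20.0 * math.log10(q_linear)


# Assumed HD-FEC limit (not stated in this section).
assumed_fec_limit = 3.8e-3
print(f"BER {assumed_fec_limit:.1e} -> Q = {q_factor_db(assumed_fec_limit):.2f} dB")
```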

C. Switching Latency
We investigated switching performance when the LO wavelength was switched from 1546.119 nm (ch4) to 1546.418 nm (ch5). Turbo-pulse driving of the MZI heaters was employed to accelerate the TF tuning [35]. Fig. 13 plots the measured Q-factor transitions without and with the turbo-pulse driving method. We evaluated the switching time in the absence of the MCS since the MCS switches faster (< 3 μs) than the TF [38]. In the adaptive equalizer, the learned tap coefficients were preset so that the measurement isolates the tuning speed of the TF. Without turbo-pulse operation, we observed a switching time of 8.0 μs, measured between ninety percent of the first and final Q-factors. With turbo-pulse driving, the switching time was shortened to 3.52 μs, which is more than twice as fast as without it. We conclude from these results that large-scale and fast optical switches with switching times of a few microseconds can be created using the Silicon-photonic TF-based LO bank for coherent detection. Further reductions in switching time are expected from utilizing the electro-optic effect instead of the thermo-optic one [39].
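The switching-time definition used above can be made concrete with a short sketch. It represents one possible reading of "between ninety percent of the first and final Q-factors": the interval from the first sample at which the Q-factor falls below 90% of its initial level to the first sample from which it stays at or above 90% of its final level. The function name, the reference-window length, and the synthetic trace are illustrative assumptions, not measured data.

```python
def switching_time(times, q_trace, fraction=0.9, ref_samples=5):
    """Switching time from a Q-factor transition trace.

    The initial and final Q levels are estimated by averaging the first
    and last `ref_samples` samples. Returns 0.0 if the trace never drops
    below `fraction` of the initial level.
    """
    q_first = sum(q_trace[:ref_samples]) / ref_samples
    q_final = sum(q_trace[-ref_samples:]) / ref_samples
    # First sample at which the Q-factor leaves the initial level.
    leave = next((i for i, q in enumerate(q_trace)
                  if q < fraction * q_first), None)
    if leave is None:
        return 0.0
    # Walk back from the end to the first sample of the settled region.
    settle = len(q_trace) - 1
    while settle > leave and q_trace[settle - 1] >= fraction * q_final:
        settle -= 1
    return times[settle] - times[leave]


# Synthetic trace (assumed values): 0.25-us sampling, the Q-factor
# collapses for 14 samples while the filter retunes, then recovers.
dt_us = 0.25
q_trace = [10.0] * 20 + [2.0] * 14 + [10.0] * 20
times_us = [i * dt_us for i in range(len(q_trace))]
print(f"switching time: {switching_time(times_us, q_trace):.2f} us")
```

Whether the 90% levels are taken on a linear or a dB scale is not specified in the text. The sketch simply applies the fraction to whatever scale the trace is given in.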

VI. CONCLUSION
This paper presented a scalable and fast optical circuit switch, a key component in implementing an electrical-packet/optical-circuit hybrid switching network for intra-data center networks. The resulting large-scale optical switches enable flat and unified interconnects among ToRs/PoDs in a data center. Such a high-port-count optical switch is effectively created by combining two dimensions (wavelength and space), as in present ROADM/OXC systems, although the specification requirements are very different. Silicon-photonics technologies are harnessed to meet the datacenter-oriented requirements of scalability, fast connectivity, and cost-effectiveness. To make the best use of the Silicon-photonic TFs, we devised colorless coherent detection, which eliminates the TF (and thus its loss) from the front of a receiver, and a shared LO bank configuration, which does not need TLDs. The results of the analyses and simulations allow optical switches tailored to different data center requirements to be designed while considering various parameters, many of which are interrelated. For example, at the rates of 256 Gbps and 512 Gbps, port counts of 7424 and 3712, respectively, are possible using practical and tractable devices. To confirm the validity of the switch architecture and the analytical results, we conducted switching experiments on an emulated 1856 × 1856 optical switch system; the measured results matched the simulation results well, with switching times under 3.52 μs. To the best of our knowledge, this is the first demonstration of a large-port-count optical switch with 400-Tbps class throughput realized without relying on TLDs. We expect that the technologies discussed here will pave the way to the next generation of intra-data center networks.
An important related development is the optical switch control technology, which is outside the scope of this paper. We will utilize a decentralized and distributed control scheme for the optical switch network, which requires neither network-wide synchronization nor routing control. As a result, low blocking probability (less than $10^{-6}$) and low latencies (less than a few tens of microseconds) are possible [24]. Control network design issues, including control latency and overall blocking performance, are elaborated in our recent paper [24]. In this paper, we have not detailed the shared LO bank design, but its cost-effective implementation is described in a recent publication [36].