Optical-Mechanical Configuration of Imaging Operation for Endoscopic Scanner: A Review

Miniaturized endoscopic scanners have had a significant impact on high-resolution optical imaging. Technological advancements in micro-electromechanical systems (MEMS) and optical fiber technology have resulted in various optical-mechanical configurations designed to fulfill specific requirements. However, it is still challenging to obtain comprehensive, undistorted, high-resolution images of target samples. This paper reviews the optical imaging techniques utilized in cantilever-based endoscopic scanners by analyzing and comparing their key performance characteristics, pros and cons, and the optical components needed to develop the system. The concept of multimodal imaging is then highlighted by discussing its principle and current status in endoscopic scanners. We also review the scanning configurations in terms of their mechanical components, general structures, and drive signals for different scanning patterns. The feedback control aspect of endoscopic scanners is then highlighted: we discuss its role in mitigating undesired nonlinear vibration effects and survey the current implementations. Finally, we discuss current and potential applications of endoscopic scanners and artificial intelligence techniques in image reconstruction and disease detection, and we provide recommendations on endoscopic scanner system design for future reference.


I. INTRODUCTION
Biomedical imaging involves chains of obtaining, processing, and visualizing structural or functional images of target samples. Apart from traditional methods, such as X-ray, positron emission tomography (PET), and magnetic resonance imaging (MRI), technological developments have produced various methods for interrogating target samples non-invasively. For instance, microwave imaging has been developed as a low-risk and cost-effective alternative for early breast cancer detection [1]. Optical fiber technology, known for its applications in telecommunications, networking, and sensors [2], [3], has also been utilized for optical imaging.
Micro-optics such as gradient-index (GRIN) lenses and compound lenses are utilized for light beam shaping to scan the target sample. The two choices of scanning the target sample are (1) MEMS-based scanning mirrors and (2) mechanically driven scanning fiber cantilevers. The complementarity between optical and mechanical components is crucial in designing the endoscopic scanner.
Among the many optical imaging techniques in this field, confocal microscopy, fluorescence microscopy, two-photon fluorescence (TPF), and optical coherence tomography (OCT) are popular for imaging applications. Each technique has its unique traits in extracting useful information from the target samples to provide a specific contrast mechanism to images. For instance, fluorescence microscopy reveals the functional and molecular contrast in tissues. While these techniques offer helpful insight into biological functions and structures, they also come with their respective drawbacks, such as limited penetration depth, low contrast, and restricted functional information. Multimodal concepts combine the complementary strengths of different imaging techniques to provide more comprehensive information on the target sample.
Aside from imaging techniques, the mechanical configuration of the scanning unit can also be categorized by several aspects, namely scanning patterns, actuation types, viewing directions, actuation methods, and driving signals. Reviews of these scanning systems have appeared in the literature in recent years, each with its own focus areas. For instance, Kaur et al. review many aspects of a fiber cantilever-based endoscopic scanner, including imaging technologies, general principles of scanning patterns, potential applications [4], the dynamics of the fiber cantilever system, and the working physics of actuators [5]. Qiu and Piyawattanametha [6] focus on MEMS technology and describe the types of MEMS scanning mirrors and actuators used for optical imaging. However, regardless of the chosen mechanical configuration, undesired nonlinear effects are inevitable in endoscopic scanners. These effects may be caused by micro-misalignments during assembly or imperfections during fabrication, and they can lead to image distortion. A feedback control mechanism is highly desirable to overcome these problems. Therefore, the present paper reviews the multimodal imaging concept and the feedback control mechanism, highlighting their current implementation and identifying the main technical challenges in cantilever-based endoscopic scanners. The remainder of this paper is organized as follows. In Section II, we review the optical imaging techniques by analyzing and comparing their specific performances, limitations, and the optical components required to construct the system. Next, the multimodal concept is highlighted by examining its principle, current status, and implementation challenges in endoscopic scanners. In Section III, to demonstrate the importance of the feedback control mechanism, we first review the mechanical configurations in terms of their mechanical components, general structures, and drive signals for different scanning patterns.
This is followed by a review of the current implementation of the feedback control mechanism in endoscopic scanners. In Section IV, we provide further discussion on selecting optical components for imaging, potential applications of endoscopic scanners, artificial intelligence techniques in image reconstruction and disease detection, and recommendations on mechanical configurations for different scanning patterns. The last section is the conclusion.

II. OPTICAL IMAGING TECHNIQUES
We begin with the endoscopic scanner classification, divided into two main systems: optical and mechanical systems. Fig. 1 shows an overview of different categories that can be considered during the design of an endoscopic scanner. The optical system is mainly responsible for imaging techniques. In comparison, the mechanical system is primarily responsible for the dynamics of the scanning system.
Many optical components such as optical fibers, light sources, detectors, lenses, and filters are involved in optical imaging. Still, the two crucial components that need to be decided first are the light sources and detectors. Light sources come in many types, such as continuous, pulsed, and low coherence, with wavelengths ranging from ultraviolet (UV) region to infrared (IR) region. The standard detectors used are the PIN photodiode, avalanche photodiode (APD), a photomultiplier tube (PMT), charge-coupled device (CCD), and complementary metal-oxide-semiconductor (CMOS) cameras. The choice of light source and detector is highly dependent on the desired imaging technique. Different optical imaging modalities in endoscopic scanners are briefly described in this section, including the commonly used light source and detector for each optical imaging technique. Various combinations of imaging techniques for multimodal imaging are also reviewed.

A. SINGLE-MODAL IMAGING TECHNIQUES
1) CONFOCAL MICROSCOPY
In confocal microscopy, light is focused on a target sample using a point illumination system. The backscattered light is refocused and passed through a pinhole aperture to be captured by the detection system. The most common wavelengths used in confocal microscopy are in the visible window. Wavelengths extending into the near-infrared (NIR) window are possible depending on the target sample. For instance, wavelengths of 625 nm [7], [8] and 785 nm [9]-[12] are generally used to image cellular tissues. NIR wavelengths such as 1310 nm and 1460 nm [14] have also been used to image dental caries, as the tooth has different optical windows. A continuous-wave light source such as a laser diode is sufficient for confocal microscopy. Various detectors can be used depending on the light source's wavelength, the power of the backscattered light, and desired features such as high-speed detection.
The PIN photodiode is very compact and does not require a high operating voltage. Its output voltage is linearly proportional to the input signal, but it generally has a relatively low sensitivity and a small active area. The APD offers higher sensitivity with built-in gain and a fast response time, but it requires a high reverse bias voltage. The PMT, on the other hand, offers high bandwidth and high gain, which makes it ideal for low-light or short-pulsed light detection. The PMT is a popular choice of detector in many endoscopic scanners; however, it has limited detection efficiency at longer wavelengths [13]. CCD and CMOS detectors offer high quantum yield, low dark signal, and multichannel capability. Both convert photons into electrons at each pixel. If high-speed detection is the primary requirement of the scanning system, a CCD or CMOS detector can deliver it.

2) FLUORESCENCE MICROSCOPY
Fluorescence imaging is used to study the molecular activities of the target sample. Unlike confocal microscopy, which generally focuses on the absorption of the illumination wavelength, fluorescence microscopy focuses on the excitation and emission wavelengths. The excitation wavelength usually depends on the type of dye used rather than on the tissue sample, and it is typically within the UV-visible region (330 nm-700 nm). Filtering matched to the selected dye is often required to prevent the excitation wavelength from being captured by the detection system. The light emitted from the excited sample is generally weak; thus, a high-sensitivity detector such as a PMT or CMOS camera is needed.
Many recent studies focus on TPF and second harmonic generation (SHG) rather than on traditional fluorescence techniques. These nonlinear techniques are label-free and offer high spatial resolution, deeper penetration, and less phototoxicity [4]. The light source is usually of the pulsed (femtosecond) type, and the wavelength is chosen in the NIR region due to the nature of these techniques. Wavelengths of 780 nm [14], 800 nm [15], 810 nm [16], 840 nm, 880 nm, and 900 nm [17] have been used for two-photon fluorescence applications in the literature. A PMT is often used together with the pulsed light source for these techniques.

3) OPTICAL COHERENCE TOMOGRAPHY (OCT)
OCT has attracted much interest due to its high sensitivity, high-resolution imaging, and deep tissue penetration, and it has been applied in endoscopic scanners. This technique relies on the interference pattern obtained from either a Michelson or a Mach-Zehnder interferometer. Various schemes such as time-domain OCT and frequency-domain OCT [18] can be employed. Under frequency-domain OCT, the swept-source OCT (SS-OCT) system is the standard configuration used for OCT imaging in fiber cantilever-based endoscopic scanners [19]-[21]. Spectral-domain OCT (SD-OCT) may be used as well [22]. A low coherence light source with a wavelength that allows deep penetration of tissues is desired. Hence, most OCT scanning systems utilize the NIR to infrared region of the optical spectrum. Wavelengths of 1300 nm [19], [20] and 1320 nm [21] are used in SS-OCT scanning systems. Different types of low coherence light sources may be used, such as superluminescent diodes (SLED), multiple quantum well semiconductor optical amplifiers, doped fiber-based amplified spontaneous emission (ASE) sources, Kerr-lens mode-locked (KLM) lasers, supercontinuum light sources, swept-source lasers [18], and, more recently, vertical-cavity surface-emitting lasers (VCSEL).
Fig. 2. Tissue images: (a) in vivo reflectance image of several pagetoid cells (yellow stars) at the epidermal level and atypical cells at the dermo-epidermal junction (yellow arrows) from confocal microscopy, original publication in [23]; (b) ex vivo image of green fluorescent protein bacteria in a mouse spleen model from fluorescence microscopy, original publication in [24]; (c) TPF images from a mouse kidney section stained with Alexa Fluor 488 wheat germ agglutinin (F-24630, Invitrogen), original publication in [17]; and (d) OCT images of the hard palate of sublingual mucosa (EP: epithelium, BM: basement membrane, LP: lamina propria), original publication in [25].
The common light sources used in endoscopic OCT scanners are SLED [19]-[22] and swept-source lasers [21]. A balanced detector (BD) is used in the OCT system, as the technique relies on interference signals.
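To make the role of the interference signal in frequency-domain OCT more concrete, the following is a minimal sketch, not taken from any of the cited systems. It assumes a single reflector, a flat source spectrum sampled linearly in wavenumber, and a refractive index of 1; the function name, the 100 nm bandwidth, and the sample count are illustrative assumptions. The depth profile (A-scan) is recovered by Fourier transforming the spectral interferogram.

```python
import numpy as np

def simulate_fd_oct_ascan(depth_m, n_samples=2048,
                          center_wavelength_m=1300e-9, bandwidth_m=100e-9):
    """Simulate one frequency-domain OCT A-scan for a single reflector.

    All parameter values are illustrative assumptions, not values from the review
    (apart from the 1300 nm center wavelength quoted for SS-OCT systems).
    """
    # Wavenumber axis, sampled linearly in k over the assumed source bandwidth.
    k_min = 2.0 * np.pi / (center_wavelength_m + bandwidth_m / 2.0)
    k_max = 2.0 * np.pi / (center_wavelength_m - bandwidth_m / 2.0)
    k = np.linspace(k_min, k_max, n_samples)

    # Reference beam plus one reflector: the round-trip path difference is 2 * depth.
    interferogram = 1.0 + np.cos(2.0 * k * depth_m)

    # Remove the DC term and apply a Hann window to limit spectral leakage.
    interferogram = interferogram - interferogram.mean()
    windowed = interferogram * np.hanning(n_samples)

    # The Fourier transform over k gives reflectivity versus depth.
    profile = np.abs(np.fft.rfft(windowed))
    dk = k[1] - k[0]
    # cos(2 k z) oscillates at z / pi cycles per unit k, so z = pi * frequency.
    depths_m = np.pi * np.fft.rfftfreq(n_samples, d=dk)
    return depths_m, profile

depths, profile = simulate_fd_oct_ascan(0.5e-3)
estimated_depth = depths[np.argmax(profile)]
```

In this toy model the peak of `profile` falls in the depth bin closest to the simulated reflector; a real system would additionally need k-linearization, dispersion compensation, and balanced detection to suppress common-mode noise.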

4) DISCUSSION
Fig. 2 summarizes tissue images reconstructed using information obtained from confocal microscopy [23], fluorescence microscopy [17], [24], and OCT [25]. Confocal microscopy eliminates out-of-focus glare by using a pinhole aperture to produce a high-resolution image, but at a shallow penetration depth (<100 µm) [26]. With a similar setup, traditional fluorescence microscopy depends on specific fluorescent dyes introduced to target samples. Once the fluorescent dyes are excited, the light emitted at longer wavelengths can be filtered and collected for image reconstruction. It is a powerful tool to study the molecular activities of the target sample, but long exposure to the excitation wavelength may cause photobleaching. TPF and SHG minimize photobleaching and, unlike traditional fluorescence microscopy, are label-free imaging technologies. However, the light source equipment is costly. Among the imaging techniques, OCT offers high penetration depth (1-3 mm) [18] to map out a cross-section of target samples. However, there is a trade-off between image resolution, imaging speed, and penetration depth depending on the deployed OCT scheme.
Each imaging modality provides useful tissue information in its applications, but this may not be sufficient, depending on the information required for a specific scenario. This limitation arises because focusing on a particular characteristic does not provide a complete picture of the anatomical complexity of the tissue. Combining the complementary strengths of various imaging techniques can result in more comprehensive tissue information. The concept of multimodal imaging and its current implementation in endoscopic scanners are discussed in the following section.

B. MULTIMODAL OPTICAL IMAGING TECHNIQUES
The multi-modality concept combines the complementary strengths of different imaging techniques to access more information on the target sample. This approach has gained much interest, and various multimodal imaging systems have been developed for simultaneous imaging.
Among the imaging techniques, OCT has been the core technology in many multimodal imaging systems. OCT is a label-free imaging technique that offers deep penetration to produce high-resolution structural information of the target sample based on backscattered light. However, contrast based on backscattered light limits precise quasi-histologic assessment of the target sample [55]. Implementing other imaging techniques alongside OCT would mitigate this drawback. More information on multimodal optical imaging based on OCT can be found in [27], [28]. Hence, this section briefly reviews recent multimodal combinations of optical imaging techniques and their current implementation in endoscopic scanners.

1) OCT AND FLUORESCENCE
Fluorescence imaging offers high sensitivity for detecting biochemical and molecular activities in the target sample. Thus, OCT-fluorescence dual-modal imaging combines structural imaging (OCT) and molecular imaging (fluorescence) to determine the position of labeled structures and their molecular activities in the target sample. Tang et al. [29] combine OCT and high-sensitivity fluorescence laminar optical tomography (FLOT) for sub-surface cancer detection and diagnosis. Thouvenin et al. [30] use a combination of full-field OCT and fluorescence structured illumination microscopy (SIM) to study the biochemical and structural changes in zebrafish larva, macaque retina, and rat heart.

2) OCT AND PHOTOACOUSTIC
Photoacoustic (PA) imaging relies on the ultrasound emitted by tissues as a result of optical absorption. The technique produces images with high spatial resolution and high contrast, and it is notable for its use in tissue vasculature visualization due to absorption by hemoglobin. The combination of absorption contrast (photoacoustic) and scattering contrast (OCT) can provide comprehensive information on skin perfusion [28]. The combined OCT-PA imaging technique has been used for metabolic rate measurement [31] and human skin condition investigation [32]. Liu and Drexler provide a comprehensive review of the OCT-PA imaging technique in [33].

3) OCT AND TWO-PHOTON FLUORESCENCE (TPF)
The TPF imaging technique generally yields higher spatial resolution, deeper penetration depth, and less phototoxicity than traditional fluorescence techniques. Thus, the combination of OCT and TPF gives access to improved molecular and multi-scale structural contrast in the reconstructed image. For example, Gagnon et al. [34] developed an imaging system combining OCT and TPF for imaging microvascular flow circulation and distribution.

4) DISCUSSION
The multi-modality concept in imaging systems is undoubtedly powerful. It enables a comprehensive study of the target sample's structural, functional, and molecular properties. Moreover, the combination is not limited to two imaging techniques. For instance, Song et al. [35] performed in vivo multimodal imaging of mouse ears using three imaging techniques, namely TPF, SHG, and PA. Fig. 3 shows the color-encoded microstructures in the images obtained from the different imaging techniques.
Briefly, the cell profiles can be described using the TPF technique, the collagen fiber clusters using the SHG technique, and the distribution of the microvasculature network using the PA technique. Finally, by combining the data collected from TPF, SHG, and PA, comprehensive images such as Fig. 3(c) can be produced.
Although multimodal imaging provides comprehensive information on target samples, its implementation in endoscopic scanners is challenging due to the additional equipment and components required for the experimental setup. This requirement raises the setup's cost and complexity to ensure that transmission and collection of various wavelengths of light are performed efficiently. The implementation issues are discussed in the following section.

C. IMPLEMENTATION CHALLENGES
Different imaging techniques may be employed sequentially or simultaneously. In sequential imaging, each technique is conducted independently, one after another, whereas in simultaneous imaging, all techniques are performed at the same time. Thus, each imaging technique may require independent optical components to operate, although some components, such as the optical fiber and objective lens, are bound to be shared to perform multimodal imaging. A sequential multimodal imaging system design is comparatively straightforward, as it has independent scan paths for each imaging technique and optical components can be shared. For instance, a multimodal scanning fiber endoscope (SFE) was developed that combines confocal and fluorescence microscopy for imaging the structural, chemical, and biological activities of atherosclerosis [36]. The spiral scanning SFE contains a single-mode fiber cantilever that acts as a light guide and a ring of optical fibers that collects the backscattered light. Red, green, and blue excitation lasers and a separate red laser were coupled into the fiber cantilever sequentially for fluorescence and confocal microscopy.
As for a simultaneous multimodal imaging system, it is possible to physically position two separate probes together, but this would increase the overall probe size and cause unwanted nonlinear mechanical effects, which are covered in the following section. Ideally, the different light sources are coupled into a single optical fiber for imaging, which leads to other challenges, such as effective delivery of the light sources, effective collection and separation of the backscattered light, and reducing the overall footprint of the endoscopic scanner. In recent years, only one approach has been explored, namely the use of double-clad fiber (DCF). Different light sources can be coupled into the core and inner cladding of the DCF. Mavadia et al. [37] use this approach to design an endoscopic scanner for simultaneous OCT and fluorescence imaging. A specialized optical fiber wavelength division multiplexer (WDM) was used to couple both the NIR (1305 nm) and visible excitation (488 nm) sources into the DCF. The backscattered light was separated by a custom DCF coupler. The scanner is side-viewing, and scanning was done by rotating the reflector lens. Further development can certainly be made to produce a forward-viewing endoscopic scanner with a mechanically actuated scanning fiber cantilever.

1) DISCUSSION
Although a sequential multimodal imaging system is often more straightforward than a simultaneous one, its acquisition speed might be limited. This issue is due to the additional time needed to collect sufficient data from each imaging technique to reconstruct an image. Combining several imaging techniques that require entirely different setups is a challenging task in itself, which needs to be carefully planned out scientifically and experimentally. Additional electrical and optical components are necessary to segregate the backscattered signals in their optical regime, increasing overall complexity. Unfortunately, the required additional components often involve a trade-off between cost, complexity, implementation challenges, and the information acquired. While multimodal imaging improves detectability on a target sample, the interaction between the target object and the multimodal technique should also be considered, as not all applications necessitate multimodal imaging.
VOLUME 9, 2021
In another aspect, the image reconstruction algorithms that process all the collected information are also crucial. For instance, if multimodal OCT and confocal imaging are employed, one needs to consider the different properties of the reflected light after its interaction with the tissue. The confocal system relies on the weakly coupled backscattered signal that impinges on the photodetector. The OCT signal, on the other hand, relies on the phase difference of the light between the reference and sample arms. These light properties are manageable and well understood in optics, but the implementation requires extra care in alignment and optical-electrical signal conditioning. In addition, the transmission and separation of numerous light sources, the optical-electrical conversion, and the image reconstruction algorithms must work together seamlessly to produce a highly informative image. Despite all these challenges, multimodal imaging is the most powerful imaging approach if it can be employed in the experiment.

III. MECHANICAL CONFIGURATIONS
In the scanning unit of endoscopic scanners, light scans along two perpendicular axes to reconstruct the target sample's two-dimensional (2D) image. The most common scanning patterns are raster, spiral, and Lissajous. Other scanning patterns, such as the propeller pattern, have also been explored. Readers may refer to [4] for an overview of each scanning pattern and its advantages and disadvantages.

A. SCANNING SYSTEM
Assembly configuration and actuation method are the key factors in producing the desired scanning pattern. A comprehensive review of the working principles of the various actuators used in cantilevered scanning systems was reported in [5]; hence, it is not covered in this paper. Instead, this section focuses on the design and assembly of components in the cantilever-based endoscopic scanner and the driving signal applied for each scanning pattern.

1) RASTER SCANNING SYSTEM
A raster scan is generally performed by actuating the two principal bending modes simultaneously with mechanical vibration from an actuator. The light scans the target sample one line at a time, from top to bottom, producing a rectangular scan. A raster scan can provide a more uniform pixel dwell time than other scanning patterns. The horizontal axis is usually scanned rapidly, while the vertical axis is scanned slowly. The ratio between these two frequencies is usually very high, at least 20. This requirement usually calls for two actuators or two rigid beams to drive the high frequency and low frequency separately in the scanning system. For instance, Rivera et al. [15] utilized two piezo bimorph actuators aligned so that their bending axes are perpendicular to each other, achieving two-axis scanning with a field-of-view (FOV) of 110 µm × 110 µm. However, the voltages required to drive the actuators are very high (>200 Vp-p). A similar design was used in [38], and the system can produce a scan area of 100 µm × 100 µm with a driving voltage of 74 Vp-p. Aside from piezoelectric actuation, electromagnetic actuation based on two nickel foil cantilever beams with permanent neodymium magnets can also be utilized [9].
The inclusion of two actuators or two rigid beams undoubtedly increases the overall endoscopic scanner size. Thus, Do et al. [14] developed an endoscopic scanner using a tubular piezoelectric actuator. The tubular actuator has four quadrants, where opposing quadrants are controlled separately in pairs. Applying input voltages to the different axes produces the fast and slow scanning frequencies, although high voltages (>100 Vp-p) are required. The developed endoscopic scanner did not operate at resonance because the deflection range of the fast resonance axis is limited by the scanner enclosure. This results in a lower scanning speed (<160 Hz) and lower resolution. Besides piezoelectric actuation, electrothermal actuation is also possible. Park et al. [39] utilized an electrothermal bimorph MEMS actuator with an attached fiber to produce a high frame rate, fast resonant scanning, and step-wise linear raster beam scanning at a very low operating voltage of 3 Vp-p. The developed endoscopic scanner also offers versatility in other scanning patterns. However, the MEMS actuator has a large footprint (3 mm), which results in a large probe diameter (5.5 mm). High operating temperature is to be expected, and cooling of the scanner can be a concern.
Regarding the driving signal, a triangular signal is generally used in a raster scan to drive the fast, resonant axis, while the scan position shifts in steps or continuously along the slow, non-resonant axis. The triangular signal may be replaced with other waveforms such as sinusoidal and sawtooth [40], but this would affect the resolution, the sampling rate, and the image quality. Comparisons of these driving signals are provided in [40], [41].
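The raster drive described above can be sketched numerically. The snippet below is a minimal illustration under stated assumptions, not the drive scheme of any cited scanner: the function name `raster_drive`, the normalized amplitudes, and the choice of one fast-axis sweep (half a triangle period) per image line are all hypothetical.

```python
import numpy as np

def raster_drive(fast_hz, lines, samples_per_line=200, amplitude=1.0):
    """Return (t, x, y) drive waveforms for a step-wise raster scan.

    x: triangular wave on the fast axis, one sweep (half period) per line.
    y: staircase on the slow axis, one step per line, from -amplitude to +amplitude.
    Requires lines >= 2. All values are normalized, illustrative quantities.
    """
    n = lines * samples_per_line
    k = np.arange(n)
    # Time axis: one line lasts half a period of the fast-axis triangle.
    t = k / (2.0 * fast_hz * samples_per_line)

    # Triangle wave in [-amplitude, amplitude], built from the phase within a period.
    period = 2 * samples_per_line
    phase = (k % period) / period
    x = amplitude * (1.0 - 4.0 * np.abs(phase - 0.5))

    # Slow axis: hold one level per line (step-wise shift).
    line_index = k // samples_per_line
    y = amplitude * (2.0 * line_index / (lines - 1) - 1.0)
    return t, x, y

t, x, y = raster_drive(fast_hz=1000.0, lines=20)
```

Replacing the staircase in `y` with a linear ramp gives the continuous slow-axis variant mentioned in the text; substituting a sinusoid for the triangle in `x` makes the pixel dwell time non-uniform along each line.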

2) SPIRAL SCANNING SYSTEM
For spiral scanning, the light scans the target object in an expanding spiral pattern. The drive signals used to produce a spiral scan are a pair of sinusoidal signals with varying amplitude and a phase shift of 90° between them, as shown in the following equations.
$$X_s(t) = f(t)\,\sin(2\pi f_x t)$$
$$Y_s(t) = f(t)\,\cos(2\pi f_y t)$$
where $f_x$ and $f_y$ are the scanning frequencies for the x and y axes, respectively, $f(t)$ is the modulating amplitude function, $t$ is the scanning time, and $X_s$ and $Y_s$ indicate the position of the scan point over $t$. For spiral scanning systems, only one fiber is required to perform the scan. Since the fiber has a circular cross-section, $f_x = f_y$. Thus, the frequency ratio $f_{\mathrm{ratio}}$ between the two perpendicular axes is unity, as indicated in the following equation.
$$f_{\mathrm{ratio}} = \frac{f_x}{f_y} = 1$$
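As an illustration of the spiral drive signals described above, the following minimal sketch generates two sinusoids at the same frequency, 90° apart, with a shared amplitude envelope. The linear ramp used for the modulating function, the function name `spiral_scan`, and the numerical values are assumptions made for the example, not parameters from the cited scanners.

```python
import numpy as np

def spiral_scan(freq_hz, duration_s, sample_rate_hz, max_amplitude=1.0):
    """Return (t, x, y) for an outward spiral with unity frequency ratio."""
    n = int(round(duration_s * sample_rate_hz))
    t = np.arange(n) / sample_rate_hz

    # Assumed modulating amplitude: a linear ramp from 0 towards max_amplitude.
    envelope = max_amplitude * t / duration_s

    # Same frequency on both axes; sine and cosine are 90 degrees apart.
    x = envelope * np.sin(2.0 * np.pi * freq_hz * t)
    y = envelope * np.cos(2.0 * np.pi * freq_hz * t)
    return t, x, y

t, x, y = spiral_scan(freq_hz=5000.0, duration_s=0.05, sample_rate_hz=1.0e6)
radius = np.hypot(x, y)  # grows with the envelope, tracing an expanding spiral
```

Because both axes share one frequency, the instantaneous radius equals the envelope, so the number of spiral turns per frame is simply the drive frequency multiplied by the frame duration.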
Unlike the raster scan, the spiral scan needs only a single actuator that can drive both the x and y axes. This simplicity in the design of the endoscopic scanner makes it one of the more popular scanning patterns.
E. Seibel is one of the leading researchers in the development of fiber cantilever-based endoscopic scanners. He and his team from the University of Washington have developed the SFE [42]. The SFE, as shown in Fig. 4, consists of a single optical fiber and a piezoelectric tube as the actuator [43]. This single optical fiber is used primarily for target illumination, whereas ten multimode fibers (MMF) surrounding the scanner housing are used to capture the target sample's backscattered light [43]. X. Li also leads a team developing a spiral endoscopic scanner using similar piezoelectric tube actuation [44]. The fiber used is a custom-made DCF, as the spiral scanning system is mainly used for TPF applications. A custom-made air-silica double-clad photonic crystal fiber (DC-PCF) was developed for a similar piezoelectric tube actuated endoscopic scanner to optimize the delivery of the pulsed excitation beam and the collection of backscattered light, thus improving the imaging resolution [16].
Besides piezoelectric actuation, other actuation methods may be used as well. Zhang et al. [45] attached a single-mode fiber (SMF) onto an electrothermal large-vertical displacement (LVD) bimorph actuator for a cantilever-based endoscopic scanner. The MEMS actuator is small in size. It produces a large scanning range of 1.65 mm for a fixed-end length of 20 mm at a low operating voltage of ∼3.75 V. The drawbacks are low resonant frequency scanning (108 Hz) and high operating temperature. Concerns such as cooling, thermal expansion of the fiber, and glue degradation between the optical fiber and actuator base need to be considered. Electromagnetic actuation is also possible for potential spiral scanner applications. Generally, a magnet is attached to the fiber, and an electric coil surrounds the magnet-attached fiber. The interaction between the magnet and electric coil produces vibrations for a quick scanning response. However, this often enlarges the endoscopic scanner. Cylindrical projection lithography [46] was proposed for manufacturing microstructures on surfaces as micro components. The manufacturing process is complex, but a smaller scanning probe is possible.
Typically, an assembly of lenses is positioned in front of the fiber scanner for precise scanning. Vilches et al. [19] and Wurster et al. [20] took a different approach to the architecture and attached the lens at the tip of the scanning fiber, as shown in Fig. 5. Vilches et al. fuse the scanning fiber with a GRIN lens, while Wurster et al. attached a 3D printed holder that accommodates a plano-convex lens to the scanning fiber. The imaging would be telecentric, eliminating vignetting, field curvature aberration, and non-uniform optical resolution on the target sample [19]. The increased weight at the tip of the fiber would also reduce the resonant frequency. However, this architecture does have some drawbacks. The added lens or lens holder at the tip of the fiber limits the scan movement, dependent on the scanning probe's diameter. The cantilever length of the optical fiber is made short to prevent it from bending down due to the added weight at the tip. Installation of lens or lens holder on the fiber tip is also difficult as precision alignment is crucial to prevent nonlinear effects. Aside from mechanical configuration, the drive signal applied to the spiral endoscopic scanner is vital as the amplitude and phase shift variation would affect the pattern's circularity, distort reconstructed images, and reduce the frame rate. Referring to the SFE in Fig. 4, the driving signal for the SFE is unconventional [47]. Each frame consists of 3 distinct phases: imaging, active braking, and free decay. The amplitude of the driving signal gradually increased during the imaging phase to scan the target in an outwardly growing spiral pattern. Once it reaches its maximum extent of the scan, the piezoelectric actuator applies a high-amplitude drive signal that lags the fiber motion by 90 • to drive the fiber back to its initial state actively. Finally, the actuation is halted, allowing the residual motion to decay naturally until the following frame scan [47]. 
A similar approach was also taken in [16]. After adding the braking command, the time taken for the fiber to return to its initial state was reduced from several hundred milliseconds to 12 ms, contributing to higher frame rates.
In another approach, a sinusoidal amplitude envelope may be used instead of a linear ramp to eliminate the discontinuity in the derivative of the ramping amplitude, which causes the probe to ring and distorts the image [48]. This modulation has been applied in [19], [20].
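As an illustration of the two amplitude envelopes discussed above, the following minimal sketch generates the quadrature drive signals for the imaging phase of a spiral scan. The function name `spiral_drive`, the quarter-sine form of the smooth envelope, and all numeric values in the usage note are assumptions made for illustration and are not taken from [47] or [48]; the braking and free-decay phases are not modeled.

```python
import numpy as np

def spiral_drive(f_res, t_image, fs, envelope="triangle"):
    """Return (t, x, y) quadrature drive signals for one imaging phase.

    f_res    : drive frequency in Hz (assumed equal on both axes)
    t_image  : duration of the imaging phase in seconds
    fs       : sample rate of the waveform generator in Hz
    envelope : "triangle" for a linear ramp, "sine" for a quarter-sine ramp
    """
    t = np.arange(int(round(t_image * fs))) / fs
    u = t / t_image                       # normalized time, 0 <= u < 1
    if envelope == "triangle":
        amp = u                           # linear ramp with constant slope
    elif envelope == "sine":
        amp = np.sin(0.5 * np.pi * u)     # slope tends to zero at full amplitude
    else:
        raise ValueError("envelope must be 'triangle' or 'sine'")
    x = amp * np.cos(2 * np.pi * f_res * t)
    y = amp * np.sin(2 * np.pi * f_res * t)   # y lags x by 90 degrees
    return t, x, y
```

For example, `spiral_drive(5000.0, 0.02, 1e6, "sine")` returns 20 000 samples whose radius `np.hypot(x, y)` follows the chosen envelope. With the quarter-sine envelope the radial velocity falls to zero as the scan reaches its periphery, whereas the linear ramp changes slope abruptly when the ramp ends.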

3) LISSAJOUS SCANNING SYSTEM
In contrast to the spiral scan, the Lissajous scan relies on two separated resonant frequencies in the x and y axes. It is a two-dimensional scanning pattern that is governed by the following equations:
$$X(t) = A \sin(2\pi f_x t + \phi_x)$$
$$Y(t) = B \sin(2\pi f_y t + \phi_y)$$
where $f_x$ and $f_y$ are the scanning frequencies for the x and y axes respectively, $\phi_x$ and $\phi_y$ are the corresponding phases for each axis, A and B are the corresponding amplitudes, t is the scanning time, and X(t) and Y(t) indicate the position of the scan point over t. The ratio between these two resonant frequencies is smaller than that of the raster scan. Theoretically, an ideal cylindrically symmetric fiber is not suitable for generating a Lissajous pattern. For a cylindrical fiber of radius r, the area moments of inertia, $I_x$ and $I_y$, about the x and y axes are given as
$$I_x = I_y = \frac{\pi r^4}{4}$$
The resonant frequency, $f_r$, of the lowest-order bending mode of a cantilever beam is defined as
$$f_r = \frac{\alpha}{2\pi} \sqrt{\frac{E I}{\rho A L^4}}$$
where L, A, and I are the length, cross-sectional area, and area moment of inertia of the beam, ρ and E are the density and Young's modulus of the cantilever beam material, and α = 3.52 is a constant found by numerical solution of the beam-bending equation [49]. Since $I_x = I_y$, the resonant frequencies for the x and y directions are the same, resulting in a unity frequency ratio between the two orthogonal axes. In practice, any attempt to resonate the cantilevered optical fiber in a single axis will result in an elliptical scan. This phenomenon occurs because the resonance frequencies of the orthogonal bending modes are not sufficiently separated to produce the non-degenerate frequencies needed for the Lissajous pattern.
A comparison study between SMF and PMF was conducted [50], and it was found that a warped Lissajous shape was obtained using SMF and that the scan pattern only covers 16% of the original FOV area of 343 × 333 pixels. In this study, the PMF was a panda-type fiber, with dual stress rods in the fiber's cladding. The strength member in the PMF results in different stress along its orthogonal axes. The additional strength contributes to a mechanical asymmetry that separates the resonant frequencies (28 Hz mode separation) and results in a clean, unwarped Lissajous pattern.
Hence, mechanical asymmetry is vital in the Lissajous scanning system to create a frequency separation between $f_x$ and $f_y$.
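The degeneracy argument above can be checked numerically. The sketch below assumes the standard Euler-Bernoulli clamped-free expression $f_r = (\alpha / 2\pi)\sqrt{EI / (\rho A L^4)}$ with α = 3.52, which is taken here to be the expression the text refers to. The fiber radius, free length, Young's modulus, and density are illustrative values for a bare silica fiber and are not taken from the reviewed papers.

```python
import math

ALPHA = 3.52  # first-mode constant quoted in the text

def area_moment_circle(r):
    """Area moment of inertia of a solid circular section about a diameter."""
    return math.pi * r**4 / 4.0

def resonant_frequency(length, area, inertia, youngs_modulus, density):
    """Lowest bending-mode frequency of a clamped-free beam, in Hz."""
    return (ALPHA / (2.0 * math.pi)) * math.sqrt(
        youngs_modulus * inertia / (density * area * length**4)
    )

# Assumed illustrative values (not from the source)
r = 62.5e-6      # fiber radius in m (125 um diameter)
L = 10e-3        # free cantilever length in m
E = 72e9         # Young's modulus in Pa
rho = 2200.0     # density in kg/m^3

A = math.pi * r**2
I_x = I_y = area_moment_circle(r)
f_x = resonant_frequency(L, A, I_x, E, rho)
f_y = resonant_frequency(L, A, I_y, E, rho)
print(f"f_x = {f_x:.1f} Hz, f_y = {f_y:.1f} Hz, ratio = {f_x / f_y:.3f}")
```

Under these assumptions both axes resonate at the same frequency, on the order of 1 kHz, so the frequency ratio is unity. Any of the asymmetry techniques described in this section can be viewed as making the effective stiffness, and hence the resonant frequency, differ between the two axes.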
Kwang et al. reported a piezoelectric actuated Lissajous fiber scanner with a micro-tethered-silicon-oscillator (MTSO) attached to a DCF, as shown in Fig. 6(a) [11]. The additional micro-fabricated attachment creates extra stiffness in the fiber, contributing to the separation of resonant frequencies. Two optical fibers can also be attached together to create the mechanical asymmetry, as described in [51]. Wu et al. [21] took a slightly different approach in which an additional protruding rod is fixed on the actuator to provide a mechanical asymmetry to the optical fiber during scans. A piezoelectric stripe actuated Lissajous fiber scanner has also been reported, as shown in Fig. 6(b) [52]. The DCF used in the setup was modified to create a flattened cladding on two sides.
Modification to the actuator is also possible to create mechanical asymmetry. Fig. 6(c) shows a fiber Lissajous scanner schematic with an electrothermal actuation [53]. The actuator includes a double hot arm and a cold arm structure with a linking bridge for the fiber attachment, contributing to the separation of frequencies between orthogonal axes. It is also noted that the driving signal for this particular system is unconventional since rectangular 16 Vp-p pulse trains at resonant frequencies were used instead. This modification was done to allow sufficient cooling during electrothermal operation for maximum scanning length [53], limiting the range of resonance frequencies in the system.
[Fig. 6: panels (a) to (c) are from [11], adapted with permission from [52] © 2015 Optical Society of America, and used with permission from [53] © 2016 Optical Society of America, respectively.]
Among the scan patterns, the drive signal for the Lissajous scan is complex, with several critical factors that contribute to the scanner's performance. One of the vital factors is the fill factor. Conventionally, the scanning frequencies are the system's resonant frequencies, which provide the maximum FOV but may not yield a high scan density. A frequency selection rule was then developed for the high definition and high frame rate (HDHF) Lissajous scan [54]. To achieve a higher fill factor (FF) and frame rate, the selection rule can be expressed by two conditions: $N \geq N_{\min}(FF)$ and $\max[\mathrm{GCD}(f_x, f_y)]$, where GCD is the greatest common divisor of $f_x$ and $f_y$, and N is the total lobe number, whose defining expression is given in [54]. Conventional and HDHF Lissajous scanning were compared. The results showed that conventional Lissajous scanning produced a fill factor of 80% at scanning frequencies of 1319 Hz and 1410 Hz. On the other hand, HDHF Lissajous scanning produced a fill factor of up to 96% at the slightly different, near-resonant scanning frequencies of 1310 Hz and 1410 Hz. For more information, readers may refer to [54].
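The role of the greatest common divisor in the selection rule can be illustrated with the two frequency pairs quoted above. For integer-valued frequencies in hertz, the combined pattern repeats at a rate equal to their GCD. The sketch below also includes a toy fill-factor estimate; the function names, the 64 × 64 grid, the sample rate, the phases, and the frame time are all assumptions for illustration, so the numbers it produces are not expected to reproduce the values reported in [54].

```python
import math
import numpy as np

def repeat_rate(fx, fy):
    """Pattern repeat rate in Hz for integer scanning frequencies in Hz."""
    return math.gcd(fx, fy)

def fill_factor(fx, fy, frame_time, grid=64, fs=1_000_000,
                phi_x=0.0, phi_y=np.pi / 2):
    """Fraction of grid x grid pixels visited within frame_time seconds.

    grid, fs, phi_x and phi_y are illustrative assumptions.
    """
    t = np.arange(int(round(frame_time * fs))) / fs
    x = np.sin(2 * np.pi * fx * t + phi_x)
    y = np.sin(2 * np.pi * fy * t + phi_y)
    # Map [-1, 1] onto pixel indices 0 .. grid - 1
    ix = np.minimum(((x + 1) / 2 * grid).astype(int), grid - 1)
    iy = np.minimum(((y + 1) / 2 * grid).astype(int), grid - 1)
    visited = np.zeros((grid, grid), dtype=bool)
    visited[iy, ix] = True
    return visited.mean()

for fx, fy in [(1319, 1410), (1310, 1410)]:
    print(fx, fy, "repeat rate:", repeat_rate(fx, fy), "Hz")
```

The pair (1319, 1410) has a GCD of 1, so its pattern only closes after one second, whereas (1310, 1410) has a GCD of 10 and closes ten times per second. Calling `fill_factor(1310, 1410, frame_time=0.1)` gives the pixel coverage of one closed pattern on the assumed grid.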
Following that, Wang et al. [55] redefined the design parameters and fill factor for the Lissajous scan and introduced three design rules to achieve dense and rapid Lissajous scanning. A value, k, was introduced; it is defined in [55] in terms of $n_x$ and $n_y$, the smallest integer divisors of $f_x$ and $f_y$. Depending on the k value and the ratio of $n_x$ and $n_y$, different Lissajous scan curves and scan densities can be achieved, and one of the design rules requires k = 2. For more information, readers may refer to [55]. It is worth noting that the non-repeating Lissajous scan was discussed in [54], while the repeating Lissajous scan was discussed in [55]. A non-repeating Lissajous scan is generated when the ratio of $f_x$ to $f_y$ is irrational, while a repeating Lissajous scan is generated when the ratio is rational [56]. Conventional frequency selection methods usually produce a non-repeating Lissajous scan, subject to constraints such as precise control of the phase difference between the x and y axes at 90° and temporal flickering during real-time imaging. Using the frequency selection rule in [54], higher scan density and higher resolution can be achieved. If a repeating Lissajous scan is desired, the driving frequency and phase difference need to be tuned, and the design rules in [55] can be applied. A repeating Lissajous scan contributes to a higher frame rate but with a trade-off in scan density.

4) OTHER SCANNING PATTERN SYSTEM
Aside from raster, spiral, and Lissajous scans, other setups use different scanning patterns for optical imaging. Fig. 7 shows a schematic of an electrothermal actuated fiber scanner. The electrothermal actuator is made from a single piece of laser-cut thin brass foil, and the bridge is manually lifted to 90°, perpendicular to the brass foil base [57]. The fiber cantilever was modified by etching the distal 2 mm of an SMF to reduce its diameter from 125 µm to 12 µm [58]. The fiber cantilever was then attached to the top of the bridge. This electrothermal actuator can only perform a single-axis scan with a sinusoidal drive signal at its resonance frequency. To create a 2D scan, the target sample is rotated, thus creating a propeller scan pattern. Ideally, a linear scan is produced if the optical fiber is actuated linearly in a single axis. Still, an elliptical scan was observed instead, primarily due to micro-imperfections introduced during fabrication and assembly. Tracing a quarter of the elliptical scan in time yields a linear scan line, but at low efficiency, as the other parts of the scan are not utilized. Microcracks in the adhesive were also observed after a long operation period.
Acemoglu et al. [8] designed an electromagnetic-driven endoscopic scanner to provide laser position control and scanning capabilities at the tip of a robotic arm during a surgical operation. The setup includes a permanent magnet integrated into the fiber and four electromagnetic coils to control the fiber's movement. No particular scan pattern was used, and the trajectory was controlled by a user using a touchpad. A data acquisition (DAQ) card is often used to provide the appropriate drive signal to a fiber scanner. In this work, an Arduino Due was used instead to provide the electromagnetic coils' driving signals. The movement control is accurate and precise, but some effort is needed to become familiar with the instrument.
Table 1 summarizes some reviewed configurations for each scanning pattern. Among the scanning patterns, more interest has been shown in spiral and Lissajous scans in recent years than in the raster scan. This trend arises because the frequency ratio of the orthogonal bending modes needed to excite the cantilevered optical fiber is inherently available due to the geometric shape of the optical fibers. On the other hand, to excite the cylindrical optical fiber into a raster scan, the fast-axis resonant frequency must be at least ten to twenty times higher than the slow-axis frequency. This condition is the minimum requirement for producing a meaningful raster scan. Another constraint is the overall system size, considering that two actuators are typically required. The spiral scanning system can be built with a small number of components. At a minimum, it needs a tubular piezoelectric actuator with quadrature piezo elements, a high-voltage driver, and a single optical fiber. Nevertheless, it is particularly susceptible to micro-changes in the system, which lead to image distortion. This issue arises because the cantilevered spiral scanner still has a unity frequency ratio between the orthogonal bending axes. Therefore, the driving trajectory is easily perturbed if the fiber length is not sufficiently short or no feedback control mechanism is in place. With this type of scanning pattern, the fiber must also start vibrating from the center of the FOV. It then undergoes an expanding amplitude of vibration until it reaches the periphery of the FOV. Then, the fiber must undergo an 'active braking' operation and be allowed to return to the origin of the FOV. This driving sequence is then repeated until an image is obtained.

5) DISCUSSION
For the Lissajous scanning system, the scan trajectory is highly dependent on the driving signals and response of the fiber cantilever. Inaccurate characterization of the system would result in a severely distorted image. The inaccuracy might come from the discrepancies between the driving signal and the assumed trajectory of the image coordinate during the image reconstruction stage. The phase response of the cantilevered optical fiber must also be considered. The phase of the drive will have a direct impact on the actual Lissajous scan pattern. A feedback control feature would be highly recommended to mitigate any nonlinear mechanical effects in endoscopic scanners. The feedback control aspect will be explained further in the next section.

B. CONTROL ASPECT OF THE SCANNING SYSTEM
The design of an endoscopic scanner determines its resonant frequencies, driving voltage and power, scanner dimensions, FOV, frame rate, and image resolution. However, image distortion tends to occur in every scanning system. This issue is generally caused by the nonlinear dynamic response of the optical fiber, an anisotropic mechanical system, and micro-misalignment of the scanning system components. Therefore, resolution test targets, such as the Thorlabs combined resolution and distortion test target (R1L3S5P, Thorlabs), the Siemens star target, and the 1951 USAF resolution test target, are commonly used to check the resolution of the developed scanning system and to identify any distorted scan lines.
In the raster scanner system, distortion occurs at the periphery of the scanning area, where images appear wavy, as shown in Fig. 8(a). In the spiral scan, whirling at the center is the most common distortion in the reconstructed images, as shown in Fig. 8(b). Distortion in the Lissajous scan often produces interlaced scan lines or makes the image look tilted, as shown in Fig. 8(c).
As mentioned previously, an ideal cylindrically symmetric fiber theoretically should have the same mechanical resonance for both x and y directions. If the symmetric fiber is vibrated in a single axis, a linear scan line should appear. In practice, manufactured optical fibers have imperfections and are not symmetric. If an asymmetric fiber were vibrated in a single axis, an elliptical scan would appear instead, indicating a cross-coupling between the mechanical resonance from both x and y axes. An anisotropic mechanical system may also contribute to image distortion. These concerns are inevitable due to manufacturing variability. Thus, feedback controls have been implemented to overcome these concerns.

1) OPEN-LOOP FEEDBACK CONTROL
Typical endoscopic scanners have minimal to no automatic control or feedback capabilities to regulate the scanning process variables and maintain the intended output scan. The phase and amplitude responses of the fiber cantilever depend on the endoscopic scanner construction and the driving signals. Firstly, hardware-based solutions such as rigid beams [9] or additional stiffening rods [15] may reduce the undesired mechanical coupling between the x and y axes. In Lissajous scanning systems, cross-coupling between the mechanical resonances of the x and y axes is minimal because additional effort is made to separate the mechanical resonance frequencies of the two axes. This method is unfavorable for spiral scanning systems, which require the mechanical resonances of both axes to be the same.
Secondly, modification to the driving signal can help to reduce image distortion. In a conventional spiral scan, the sinusoidal drive signals have ramping amplitudes, which form a triangle envelope. Some studies [19], [20], [48] would use a smooth sinusoidal envelope instead to reduce the distortion. A change in phase angle difference can also reduce the distortion [60], [61]. A slight adjustment to the driving frequency can counteract the anisotropic mechanical system. Lastly, the most common approach is tracking and obtaining the scanning trajectory position using a position-sensitive detector (PSD) and storing the data as a calibration reference during image construction. With the obtained position information, remapping and correction can be made via software to produce a clear and undistorted image. This method is usually done last after reducing the image distortion using the previous methods.
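The position-based remapping described above can be sketched as a simple binning step: each detector sample is placed into the pixel that corresponds to the measured fiber-tip position rather than the position assumed from the drive signal. The function below is a minimal illustration under the assumption that the calibration data are available as normalized coordinates in [-1, 1]; the function name and the averaging of multiple samples per pixel are illustrative choices, not the procedure of any specific reviewed work.

```python
import numpy as np

def reconstruct(x_pos, y_pos, intensity, size):
    """Bin detector samples into a size x size image using measured positions.

    x_pos, y_pos : measured fiber-tip coordinates, normalized to [-1, 1]
    intensity    : detector sample acquired at each position
    """
    ix = np.clip(((x_pos + 1) / 2 * size).astype(int), 0, size - 1)
    iy = np.clip(((y_pos + 1) / 2 * size).astype(int), 0, size - 1)
    total = np.zeros((size, size))
    count = np.zeros((size, size))
    np.add.at(total, (iy, ix), intensity)   # sum of samples landing in each pixel
    np.add.at(count, (iy, ix), 1)           # number of samples per pixel
    image = np.zeros((size, size))
    np.divide(total, count, out=image, where=count > 0)  # mean; unvisited stay 0
    return image, count
```

The returned `count` array shows which pixels were never visited, which is useful when deciding whether interpolation is needed before display.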
However, these solutions are considered open-loop feedback and would only be conducted once after fabricating the system or before performing any scan using a calibration chamber [62]. It is also worth mentioning that the endoscopic scanners reported in the literature are primarily stage-mounted. Variations in the scanner system due to environmental effects and external stresses are not considered. Even though the endoscopic scanners are stage-mounted, there may be unknown sources that cause a phase shift between the drive signals and the response of the fiber cantilever. Consequently, there is a need for a sensing mechanism to provide feedback capabilities.

2) CLOSED-LOOP FEEDBACK CONTROL
With a sensing mechanism, the response of the fiber cantilever is continuously compared with the desired result, and the control output to the scanning mechanism is adjusted to reduce the deviation, forcing the response to follow the reference. Thus, external and internal effects are automatically compensated. An ideal sensing mechanism would precisely track the trajectory of the tip of the scanning fiber cantilever. For instance, a PSD can be added to the scanner system [52] to continuously track the scan trajectory, providing a closed-loop feedback system as shown in Fig. 9. However, the PSD has limitations such as transparency to infrared wavelengths only and a limited active area, and the devices available on the current market are still considered bulky for endomicroscopic applications. For example, the Hamamatsu S5990-01 PSD measures 10.6 mm × 8.8 mm with an active area of 4 mm × 4 mm. Theoretical analysis and simulations of various closed-loop control schemes for fiber cantilever-based endoscopic scanners are presented in [63]. Among the various schemes, phase-locked loops (PLLs) are the most prevalent control method, utilized not just in fiber cantilever-based endoscopic scanners but also in other similar scanning systems such as fast steering mirror (FSM) systems [64], MEMS devices [65], and atomic force microscopy (AFM) [66]. PLL controllers are commonly used for tracking and controlling amplitude-fixed periodic signals, such as sinusoidal signals.
In another study, closed-loop feedback control was implemented, but mainly for z-axis alignment or focus scanning [67]. The feedback control system closely monitors the hysteresis-prone shape memory alloy (SMA) wire used for depth scanning. The contraction distance of the SMA wire is measured using a Hall effect sensor, and the measured distance is input into a PID algorithm running on a microcontroller unit. The algorithm then compares the measured and desired positions and adjusts the electrical current in the SMA wire to further deform the SMA and bring the scanner to the desired position.
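The measure, compare, and adjust cycle described above follows the usual structure of a discrete PID loop. The sketch below shows a textbook PID controller driving a hypothetical first-order plant. The plant model, gains, time step, and the interpretation of the controller output as a drive current are assumptions for illustration only; they do not represent the SMA actuator or the tuning used in [67], and hysteresis is not modeled.

```python
class PID:
    """Discrete PID controller (textbook positional form)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(setpoint=1.0, steps=5000, dt=0.001, tau=0.05, gain=1.0):
    """Drive a hypothetical first-order plant toward the setpoint."""
    pid = PID(kp=2.0, ki=40.0, kd=0.0, dt=dt)
    position = 0.0
    for _ in range(steps):
        current = pid.update(setpoint, position)   # controller output
        # Toy plant: tau * d(position)/dt = gain * current - position
        position += dt * (gain * current - position) / tau
    return position
```

With these assumed values the simulated position settles at the setpoint, because the integral term removes the steady-state error that a proportional-only controller would leave on this plant.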
In other work, Yeoh et al. [62], [68] developed an adaptive feedforward controller that uses a piezoelectric self-sensing approach, enabling self-contained recalibration of the scanner system. The piezoelectric tube in the scanning system serves as a sensor, and a low-footprint self-sensing circuit is attached at the proximal end to measure the piezoelectric tube's bending displacement. The first-mode resonant dynamics are recorded along each eigendirection, and the model parameters are extracted based on the derived full electromechanical model of the scanning system for one eigendirection. The extracted model is then applied to control the fiber tip trajectory. The proposed feedback system was tested in both stage-mounted and unmounted scenarios to account for external stresses on the scanner system. As a result, the scan trajectory error is reduced, although its accuracy can still be improved with further optimization, which remains under research.
Mokhtar and Syms [69] implemented an additional apertured mirror mounted before the imaging lens. The intermittent optical feedback provides a signal for closed-loop phase and voltage control of the Lissajous scanning system. With the intermittent optical reflection from the apertured mirror, the feedback light pulses can be measured to accurately determine the scanning system's response. For maximum response, the scanning system's response should vary with the sinusoidal drive signal but lag it in phase by 90°. A closed-loop controller was then developed to actively lock the drive frequency and voltage so that the scanning system operates at resonance. Further research has been conducted to demonstrate the effect of different fiber, aperture, and target sample placement conditions on the intermittent optical reflection [70].
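The 90° phase lag at resonance is a general property of a driven, damped second-order resonator and can be checked with a short calculation. The sketch below assumes that a single bending mode of the fiber can be approximated by such a resonator; the resonant frequency and damping ratio are illustrative values and are not taken from [69].

```python
import numpy as np

def frequency_response(f, f0, zeta):
    """Complex displacement response of a driven, damped second-order resonator."""
    w = 2 * np.pi * np.asarray(f, dtype=float)
    w0 = 2 * np.pi * f0
    return w0**2 / (w0**2 - w**2 + 2j * zeta * w0 * w)

f0, zeta = 1000.0, 0.005            # assumed resonance (Hz) and damping ratio
freqs = np.array([0.9 * f0, f0, 1.1 * f0])
phase_deg = np.degrees(np.angle(frequency_response(freqs, f0, zeta)))
print(phase_deg)
```

In this model the response lags the drive by less than 90° below the resonant frequency, by 90° at the resonant frequency, and by more than 90° above it. For a lightly damped resonator the amplitude peak lies very close to the resonant frequency, which is why holding the measured phase lag at 90° is a practical way to keep the drive on resonance.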
Loewke et al. [56] proposed a software-based feedback system for phase control, frame rate improvement, and mosaicking for a Lissajous scanning system. The feedback control algorithm tests the image data over a wide range of phase values to determine the system's response. This step is crucial because if the phase response used during image reconstruction is not "in phase" with the phase response of the scanning system, interlaced copies of the image will shift along the orthogonal axes, resulting in a distorted image. The algorithm first maps a set of image vectors based on different phase lag values. It then interpolates the coordinates vertically and normalizes the image vectors. The algorithm then performs a Fourier transform on the image vectors and sums the center portion of the absolute value of each image's spectrum. The maximum value calculated corresponds to the phase response of the system. After the phase lag is determined, image mosaicking and post-processing steps are conducted to enhance the image. This method does not require any additional components or sensors. However, it needs real-time imaging as part of the feedback loop, resulting in a long reconstruction time from the sampled data (360 ms per image during phase control). The selected image processing techniques are also sensitive to signal-to-noise ratio, optical resolution, and sample rate.

3) DISCUSSION
From this review, we can categorize closed-loop feedback control into two types: hardware- and software-based control. A physical sensor would be ideal for hardware-based feedback control since it allows for real-time sensing, precise fiber tip tracking, and quick response time. However, a physical sensor increases the scanner's size, electronic complexity, and cost, making it especially challenging to implement in small endoscopic scanners. Therefore, adding a physical sensor may not be a viable solution until sensor miniaturization progresses. This, in turn, fosters innovative approaches to address the issue. Currently, the fiber tip sensing mechanism is based on piezo disk sensing [62], [68] and backscattered intermittent light [69], [70]. Only electrode wires are required in endoscopic scanners for piezo disk sensing, but it is not a direct measurement of the fiber tip trajectory, and errors are still possible. An additional apertured mirror mounted before the imaging lens can provide a close approximation of the trajectory based on the backscattered intermittent light from the mirror. However, its FOV is expected to be limited by the aperture.
In addition, a technique similar to that used in AFM [71] could be adapted for an endoscopic scanner. This method requires an external guiding laser directed at the top surface of the cantilevered optical fiber. However, it also requires a D-shaped optical fiber with reflective metal deposited on the flat side of the fiber. The reflective surface could be formed by sputtering or evaporating a thin metal layer onto the flat surface of the D-shaped optical fiber. When the D-shaped optical fiber vibrates, the guiding laser is reflected from the reflective surface. The reflected light can then be detected using a suitable photodetector. Hence, information on the fiber dynamics can be obtained, and this signal can be utilized as part of the feedback control loop. However, if a user wishes to mount the D-shaped fiber at 45° with respect to the vertical axis, the setup requires a special mounting bracket to align and adjust the bending modes to obtain the Lissajous scan. This task is not impossible but requires a rotational maneuver to position the fiber at a precise tilt angle.
On the other hand, software-based feedback control does not introduce any additional hardware component into the endoscopic scanner [56]. The algorithm compares the collected data to a simulated data set with varying test phase lag values and determines which phase corresponds to the collected data. The image is then reconstructed at the determined phase lag that matches the scanning system. This approach does not actively adjust the driving signal to reduce the deviation and requires significant processing time. Software-based feedback control is easy to implement as it does not require any external hardware integration. However, it requires a mathematical formulation that must be integrated into the feedback control terms.

IV. DISCUSSION
In endoscopic scanners, it is crucial to understand each component's functionality and its contributions to the optical and mechanical system. This section covers further considerations in the selection of optical components, discusses the current and future applications of endoscopic scanners, and provides guidelines on the mechanical configuration for raster, spiral, and Lissajous scanning patterns.

A. OPTICAL PROPERTIES OF TISSUES
Under optical configuration consideration, the choice of light source and detectors depends on the desired imaging technique, as each technique operates differently. Another factor that needs to be considered is the optical properties of the target sample. This information is vital for interpreting diagnostic measurements. The optical properties of tissues can be categorized into different diagnostic/therapeutic windows such as the UV window (350–400 nm), visible window (625–975 nm), and NIR windows (1100–1350 nm, 1600–1870 nm, and 2100–2300 nm) [72]. Factors such as light absorption, scattering, and penetration depth in tissue need to be considered when selecting the appropriate wavelength and imaging technique. For example, it is known that NIR wavelengths can penetrate deep into the dermis of human skin for diagnosis [73]. Different target samples, such as dental enamel, may have different optical windows.
Table 2 provides a summary of some applications of fiber cantilever-based endoscopic scanners in recent years. The target objects and samples used in the literature for their intended applications, the light source and detector used, and the mechanical configuration of the endoscopic scanners and their features are also provided.
Referring to Table 2, studies on atherosclerosis have been conducted, including cell morphology and capillary perfusion. Since deep penetration is not crucial for surface-level tissue diagnosis, imaging techniques such as confocal and traditional fluorescence microscopy are used. Recently, there has been increasing interest in early caries detection in dentistry, and studies that explore dental imaging have emerged. Fundamental studies on the optical properties of dental enamel for early caries have been carried out by a research team led by D. Fried. The variation of lesion contrast in the tooth has been studied using multiple wavelengths ranging from visible to NIR [74]. However, its implementation in endoscopic scanners was not explored. This lack of development is perhaps because the scanning system was still a new, emerging imaging technology. Lee et al. [43] have recently looked into the application of the fiber cantilever-based endoscopic scanner to dental caries detection. Based on the fundamental study of early caries detection, confocal microscopy utilizing NIR wavelengths of 1310 nm and 1460 nm was used to detect artificial enamel caries lesions.
For deep penetration imaging, surveillance colonoscopy, human finger/nail structures, gastrointestinal disease analysis, and cerebellum activities have been studied. OCT and nonlinear imaging are the primary techniques used to study structural and molecular activities in tissues. During the coronavirus disease 2019 (COVID-19) pandemic, pneumonia caused by the novel coronavirus occasionally became severe and required endotracheal intubation. The examination is usually performed using a laryngoscope, but it places the operator in close proximity to the patient.
Hence, a commercial ultrathin flexible gastrointestinal endoscope was proposed to be used instead [75]. Since an ultrathin and flexible endoscope is desired, the fiber cantilever-based endoscopic scanner could be applied here. It has also been demonstrated that the endoscopic scanner can be made small enough (outer diameter: 2.6 mm) to fit as an accessory for a conventional laparoscope [11]. However, precautions are required to prevent further infection and occupational hazards [76]. Although the fiber cantilever-based endoscopic scanner is predominantly used for biomedical microscopy applications, the scanning system has shown potential in other fields. For example, applications in surgical operations and Laser Detection And Ranging (LADAR) systems have also been studied.

C. ARTIFICIAL INTELLIGENCE TECHNIQUE
Artificial intelligence (AI) techniques have influenced modern society by transforming many industries, including clinical medicine and biomedical microscopy. Advances in AI technology, such as increasing computer capacity and the adoption of deep learning, have resulted in greater availability of meaningful applications in clinical work. Some examples of AI in medicine are telemedicine [77], digital healthcare [78], and image classification and characterization [79]. Generally, the AI technique can be employed in endoscopic scanners in the software operation of the system. The software operation includes the image reconstruction algorithm after image scanning, image processing, and pattern recognition for disease diagnosis.
As an example, in the image reconstruction algorithm, a deep convolutional neural network (CNN) is used to reconstruct the image from the speckle reflection at the surface of a multi-mode fiber (MMF) [80], [81]. The training set is built by pairing the original target objects with the corresponding speckle reflection images obtained when an MMF is used in an optical fiber imaging system. This dataset is fed into a neural network, and the predicted image is obtained by analyzing only the front speckle reflection of the MMF. For a more straightforward approach, Zhu et al. [82] used a single hidden layer dense neural network (SHL-DNN) to reconstruct images from speckle patterns at the surface of an MMF. The SHL-DNN generates results similar to those of U-Net, a complex CNN originally developed for biomedical imaging, with substantially less training time and network complexity. Deep learning can also aid in reconstructing 3D images from a set of 2D images of the target sample [83], [84].
Aside from image reconstruction, AI may also assist with image processing for image enhancement [85]. For example, Zhang et al. [86] combined three methods, namely principal component analysis (PCA), deep learning-based speckle classification (DLSC), and deep learning-based image enhancement (DLIE), to produce high-definition images from an MMF. The proposed DLSC-PCA approach was utilized in conjunction with simulated speckles for image reconstruction. Then, the resulting image was enhanced using the DLIE method to produce a high-definition image.
In disease diagnosis, AI has been actively explored and debated for its ability to provide computer-aided diagnosis (CAD) in gastrointestinal endoscopy [87], [88]. AI techniques, such as deep learning and machine learning, can be used to perform segmentation, lesion detection, and disease classification based on reconstructed images [89]. In ophthalmology, pterygium is detected automatically by analyzing eye images obtained from OCT imaging. The Deep Neural Network (DNN) method is used to assist the ophthalmologist in verifying the diagnosis [90].
Casalegno et al. [91] used a CNN for automated detection and localization of dental caries based on NIR transillumination (TI) imaging. A collection of grayscale images of teeth, acquired with the 'DIAGNOcam' system, was fed into the proposed CNN for pixel-wise segmentation (dentin, enamel, and caries) and binary labeling of the area of interest. Common skin conditions can also be predicted using a deep learning system, with secondary prediction spanning 419 skin conditions [92]. The system uses a DNN to process a variable number of input images and a shallow module to process metadata, such as demographic information and medical history, to give a differential diagnosis of skin disorders.
From these examples, AI techniques clearly contribute their strengths to endoscopic imaging, presenting an excellent opportunity for implementation in endoscopic scanners. AI can aid in image reconstruction, recovery, and enhancement, resulting in clear, high-resolution, undistorted images. It may also be utilized for disease identification and classification using images acquired from endoscopic scanners.

D. RECOMMENDATIONS OF MECHANICAL CONFIGURATIONS FOR SCANNING PATTERNS
It is equally important to understand how different mechanical configurations contribute to different scanning patterns. For instance, optical fiber selection can affect both the optical and the mechanical configuration of an endoscopic scanner. A different actuator type or architecture may require modifications to other components, such as the applied drive signal. Different scanning configurations can be used as long as the basic working principles of the scanning patterns are followed. A raster scanner generally needs two actuators: one to drive the fast axis at a high resonant frequency and one to drive the slow axis at a low driving frequency. A raster scan provides a uniform scan pattern and pixel dwell time throughout the sample. Although the power consumption is low, the driving voltage is generally high (200 Vp-p), especially for the non-resonant axis. According to the IEC 60601-1 standard, a medical device's operational voltage must be lower than 40 VDC [93]. The additional actuator needed to drive the second axis also limits how small the overall scanner can be. Actuator types that use a low driving voltage, such as electrothermal and electromagnetic actuators, should be used to overcome the high driving voltage of piezoelectric actuation. Further miniaturization of the actuator is also required to reduce the overall size of the scanner.
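The two-axis raster drive described above can be illustrated with a minimal sketch. It assumes normalized amplitudes, a sinusoidal fast axis, and a sawtooth slow axis. The function names, waveform shapes, and example frequencies are illustrative assumptions and are not taken from any reviewed scanner.

```python
import math

def raster_drive(t, f_fast, f_slow):
    """Return normalized (x, y) drive values at time t in seconds.

    x: sinusoid at the fast (resonant) frequency f_fast in Hz.
    y: sawtooth ramp from -1 to +1 repeating at the slow frequency f_slow in Hz.
    Both waveform shapes are assumptions made for illustration.
    """
    x = math.sin(2.0 * math.pi * f_fast * t)
    frame_phase = (t * f_slow) % 1.0   # fraction of the current frame, 0..1
    y = 2.0 * frame_phase - 1.0
    return x, y

def lines_per_frame(f_fast, f_slow, bidirectional=True):
    """Number of fast-axis sweeps completed during one slow-axis frame."""
    sweeps_per_cycle = 2 if bidirectional else 1
    return int(sweeps_per_cycle * f_fast / f_slow)
```

With these hypothetical values, a 1 kHz fast axis and a 5 Hz slow axis give 400 bidirectional lines per frame, which shows how the ratio of the two frequencies sets the line density of the raster.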
The minimum requirement for designing a spiral scanning system is a tubular piezoelectric actuator with quadrature piezo elements, a high-voltage driver, and a single optical fiber. The spiral scanner requires neither additional actuators to drive two different frequencies nor additional components to break the fiber's cylindrical symmetry. However, it suffers from non-uniform illumination over the FOV: the intensity is higher at the center, which may cause photodamage or photobleaching. The sampling density is also much lower in the peripheral regions of the FOV. In practice, asymmetry in the fiber always causes mechanical coupling between the two perpendicular axes, creating image distortion. Corrections, such as modifying the drive signal and using calibration data, are required to produce a clear and undistorted image.
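The non-uniform sampling of a spiral scan can be checked numerically with a short sketch. It assumes an idealized spiral: two drive signals 90 degrees apart at a single resonant frequency, a linear amplitude ramp, and a constant sampling rate. All names and parameter values are hypothetical.

```python
import math

def spiral_drive(t, f_res, t_frame):
    """Normalized (x, y) drive for an idealized outward spiral.

    Both axes share the resonant frequency f_res (Hz) with a 90 degree phase
    offset; the amplitude ramps linearly from 0 to 1 over t_frame seconds.
    """
    r = (t % t_frame) / t_frame
    phase = 2.0 * math.pi * f_res * t
    return r * math.cos(phase), r * math.sin(phase)

def inner_half_area_fraction(f_res, t_frame, sample_rate):
    """Fraction of samples landing in the central disc that holds half the FOV area."""
    n = round(sample_rate * t_frame)
    r_split = math.sqrt(0.5)           # disc of radius 1/sqrt(2) covers half the unit-disc area
    inner = 0
    for i in range(n):
        x, y = spiral_drive(i / sample_rate, f_res, t_frame)
        if math.hypot(x, y) < r_split:
            inner += 1
    return inner / n
```

Under these assumptions, samples are spread evenly in radius rather than in area, so the central half of the FOV area receives roughly 71% of the samples. This is consistent with the higher central intensity and sparser peripheral sampling noted above. A non-linear amplitude ramp is one way a drive signal could be modified to even this out.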
For a Lissajous scanning system, particular attention is required to separate the mechanical resonance frequencies of the x and y axes. Modifications can be made to the mechanical actuator or the fiber cantilever, but the overall size must be kept in mind if an additional part is added. The driving signal is also more complex than for the raster and spiral scanning patterns, as it involves many aspects, such as amplitude and phase control, repeating or non-repeating Lissajous scans, and scan density. Each of these aspects contributes to the resolution and frame rate. Specific rules also need to be followed to generate a high-resolution Lissajous scan.
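The trade-off between scan density and frame rate in a repeating Lissajous scan can be sketched as follows. The sketch assumes integer drive frequencies in Hz and unit amplitudes; the function names and the example frequencies are illustrative assumptions only. For integer frequencies, the pattern repeats at the greatest common divisor of the two frequencies.

```python
import math

def lissajous_drive(t, fx, fy, phase=math.pi / 2):
    """Normalized (x, y) drive at time t for axis frequencies fx and fy in Hz."""
    x = math.sin(2.0 * math.pi * fx * t)
    y = math.sin(2.0 * math.pi * fy * t + phase)
    return x, y

def repeat_rate_hz(fx, fy):
    """Rate at which the pattern repeats, for integer frequencies in Hz."""
    return math.gcd(fx, fy)

def cycles_per_pattern(fx, fy):
    """Oscillations completed on each axis before the pattern repeats."""
    g = math.gcd(fx, fy)
    return fx // g, fy // g
```

For example, hypothetical drive frequencies of 1000 Hz and 1010 Hz repeat 10 times per second, with 100 and 101 oscillations per pattern on the two axes. Choosing frequencies with a smaller common divisor gives a denser pattern at the cost of a lower pattern repeat rate.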
Regardless of the scanning pattern, unavoidable factors such as manufacturing variability, environmental conditions, and component wear and tear can cause a nonlinear response in the scanning system, leading to image distortion. The standard solutions, such as adjusting the driving signal and recording the position of the scan trajectory as a calibration reference, are ineffective against these factors. A closed-loop feedback system would be an effective solution, but implementing it in the scanning system is challenging because its primary restriction is size. A different approach that senses the motion of the fiber, such as using a Fiber Bragg Grating (FBG) sensor [94], may help predict the scan trajectory.
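The idea of closed-loop correction can be illustrated with a deliberately simplified sketch. It assumes a toy linear plant in which the sensed scan amplitude is a fixed fraction of the drive amplitude, and an integral-only controller. The plant model, the gain values, and the function name are all hypothetical; a real scanner response is nonlinear, and this does not represent any implementation surveyed here.

```python
def run_amplitude_loop(target, plant_gain, k_i=0.5, steps=50):
    """Integral-only control of drive amplitude against a toy linear plant.

    target:     desired scan amplitude (arbitrary units)
    plant_gain: assumed ratio of sensed amplitude to drive amplitude;
                drift in the scanner is modeled as a change in this value
    k_i:        integral gain (assumed; converges when 0 < k_i * plant_gain < 2)
    Returns the final drive amplitude and the final sensed amplitude.
    """
    drive = target                         # open-loop starting guess
    for _ in range(steps):
        sensed = plant_gain * drive        # stand-in for a sensor reading
        drive += k_i * (target - sensed)   # accumulate the amplitude error
    return drive, plant_gain * drive
```

In this toy model, if the response drops to 80% of its nominal value, the loop raises the drive amplitude until the sensed amplitude returns to the target, whereas a fixed calibration would leave the scan 20% too small.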

V. CONCLUSION
This paper first reviewed individual imaging techniques and provided recommendations on the choice of optical components for each technique. We then highlighted the concept of multimodal imaging and discussed how different imaging techniques provide complementary benefits. Implementing multimodal imaging in endoscopic scanners is challenging because various light sources must be coupled into a single optical fiber. If sequential multimodal imaging is desired, an SMF capable of guiding multiple wavelengths of light is sufficient. If simultaneous multimodal imaging is selected, a DCF can be used together with DCF couplers to separate the different backscattered light profiles.
Mechanical configurations and drive signals for different scanning patterns were then reviewed, and the advantages and disadvantages of the various configurations were compared. The undesired nonlinear effect was discussed to highlight the importance of the feedback control mechanism in endoscopic scanners. A closed-loop feedback control mechanism would provide continuous calibration of the endoscopic scanner to produce a clear, undistorted image, but its implementation has proved challenging. The focus must be placed on the actuator's response or the backscattered light, without sacrificing the overall enclosure size or field of view.
We also discussed the current and potential applications of endoscopic scanners. Cell morphology and capillary perfusion, gastrointestinal disease analysis, and cerebellum activity are the current trends in applying endoscopic scanners. However, their application can also be extended to other fields, such as dentistry and bronchoscopy. Finally, we provided further discussion of the optical properties of tissues, the implementation of AI, and recommendations on selecting scanning patterns. We hope this review serves as a comprehensive technical reference for practitioners developing endoscopic scanners and provides valuable insights into designing a suitable endoscopic scanner to solve specific engineering problems.
and the Centre for Research and Instrumentation Management (CRIM), Universiti Kebangsaan Malaysia for all amenities provided.