Coherent Imaging Through Multicore Fibres With Applications in Endoscopy

Imaging through optical fibres has recently emerged as a promising method of micro-scale optical imaging within a hair-thin form factor. This has significant applications in endoscopy and may enable minimally invasive imaging deep within live tissue for improved diagnosis of disease. Multi-mode fibres (MMF) are the most common choice because of their high resolution but multicore fibres (MCF) offer a number of advantages such as widespread clinical use, ability to form approximate images without correction and an inherently sparse transmission matrix (TM) enabling simple and fast characterisation. We present a novel experimental investigation into properties of MCF important for imaging, specifically: a new method to upsample and downsample measured TMs with minimal information loss, the first experimental measurement of MCF spatial eigenmodes, a novel statistical treatment of behaviour under bending based on a wireless fading model, and an experimental observation of TM drift due to self-heating effects and discussion of how to compensate this. We next present practical techniques for imaging through MCFs, including alignment, how to parallelise TM characterisation measurements to improve speed and how to use non-interferometric phase and polarisation recovery for improved stability. Finally, we present two recent applications of MCF imaging: polarimetric imaging using a robust Bayesian inference approach, and entropic imaging for imaging early-stage tumours.


I. INTRODUCTION
O VER the past decade, optical fibre imaging has developed to the point where it now enables microscale optical imaging in hard-to-reach environments, such as fluorescence imaging of neuronal activity in live animal brains [1]- [3]. Many different types of optical fibre imaging have been demonstrated including confocal [4], two-photon [5], [6], brightfield, darkfield and fluorescence [7], quantitative phase and polarimetric [8], [9], speckle [10] and structured illumination [11]. The key technical advance that has made this possible is the ability to characterise the complex but deterministic linear function that describes how light propagates down the fibre, which when discretised is termed the transmission matrix (TM) [12].
The majority of these methods use multimode fibre (MMF) [13]- [16] with a circularly symmetric graded-or step-index refractive index profile. The main alternative to MMF is multicore fibre (MCF) (or imaging fibre bundle), which comprises up to 100,000 light-guiding elements (termed cores or fibrelets) fused together into a single solid 'super' fibre. The positions and sizes of the cores is typically randomised so as to minimise core-to-core coupling while maximising core density [17]. They therefore lack any strong symmetry despite being quasi-periodic in appearance.
MCFs with single-moded cores have a lower mode density (and hence imaging resolution) than the equivalent size MMFs [18], but many commerical MCFs have closely spaced cores (e.g. < 4 μm [19]) and support multimodal propagation within cores [20], closing this gap at the expense of increased core-to-core coupling [21]. The light-confining properties of MCF mean it has a sparse TM (see Sections III-D, IV-B and [8], [9]) enabling approximate amplitude images to be formed through it with no compensation, particularly at shorter visible wavelengths where core-to-core coupling is less. For this reason MCF is already widely used in commercial medical endoscopes [22], which has the advantage of lowering barriers to clinical approval for new devices.
Though MCF allows approximate uncorrected amplitude imaging, it introduces significant distortion in phase and polarisation with coherent light [8], [23]. This can be minimised using bespoke MCF designs [24], [25] but these require large core-to-core spacing and hence have very low mode-density. While suitable for scanning confocal imaging, this results in This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ Fig. 1. Experimental set-up used for characterisation and imaging through multicore fibre (adapted from [8]). The input (X) and output (Y) planes are indicated for reference. SLM = spatial light modulator, M = mirror, HWP = half-waveplate, PBS = polarising beam splitter, L = lens.
poor resolution for wide-field imaging. For phase and polarisation control it is therefore necessary to measure the MCF TM. However, the sparse nature of MCF, even with relatively large core-to-core coupling enables parallelised TM characterisation measurements (see Section IV-B and [8]). Further, the ability to form approximate amplitude images without correction and the lack of radial symmetry make alignment significantly easier (see Section IV-A).
Given these advantages of MCF, they remain a popular choice for imaging both in clinical settings [22] and in research (see [18] for a review of endoscopic imaging with MCF). In this paper we first present empirically derived properties of MCFs important for coherent imaging, namely: choice of representation basis, including a novel method of performing up-and down-sampling of measured TMs and, for the first time, experimental measurement of eigenmodes; a novel statistical treatment of the effects of bending on the TM; and the effect of time-dependent self-heating on the TM. Next, we present important practical strategies that enable imaging through MCF, specifically: dual-polarisation alignment, parallelisation of measurements for increased TM characterisation speed, and use of non-interferometric phase and polarisation reconstruction for improved stability. Finally, we discuss two novel practical applications of MCF fibres: polarimetric imaging via a robust Bayesian inference approach, and phase entropy imaging.
II. EXPERIMENTAL SET-UP Figure 1 shows the experimental set-up used to produce data here. The dual-polarisation design has been presented previously [8], [26], though other dual-polarisation designs are possible [15], [27]. Notably, the imaging is non-interferometric, which has experimental advantages (discussed in Section IV-C). The wavelength chosen is 852 nm, which falls within the 'optical window' [28] where tissue fluorescence minimal. Light is therefore predominantly elastically scattered, enabling accurate imaging of structural features. The laser diode (DBR852S, Thorlabs) has a coherence length of ∼1 m and power output of 35 mW. The 2 m long MCF (Fujikura FIGH-06-350G) has 6000 cores, core diameter ∼2.9 μm, core spacing 4.4 μm, and outer diameter 350 ± 20 μm. In order to reduce computational load and experimental time, only around 75% of the available facet area is used and this is sampled with a period of approximately double the expected core spacing (see Figure 3a). The TM is then characterised using the process presented in [8] at 824 spatial points in the input plane, X, taking 50.8 minutes. Despite under-sampling, we can still determine a great deal about the fibre TM.

A. Basis Representation
Using SLM1 of Figure 1 we can project optical fields with arbitrary amplitude, phase and polarisation profiles onto the distal fibre facet (plane X). With the system aligned these fields can be accurately simulated via Fraunhofer diffraction from the hologram displayed on the SLM surface to plane X. Computer simulations require discretisation so some sampling scheme for the fields must be chosen. Similarly, at the proximal facet of the fibre (plane Y ) amplitude, phase and polarisation are measured via multiple measurements on the camera (see Section IV-C). The sampling here is performed by the camera pixels (resolution: 1200 × 1200, pixel pitch 5.5 μm).
We first consider sampling the input field's horizontally polarised component on a regular M × M grid and then stacking the rows in column-major order (or alternative orderings such as Z-ordering or space-filling curves). The result is an M 2 × 1 vector. Repeating for the vertically polarised component produces a second M 2 × 1 vector. Interleaving the two polarisations (to preserve spatial locality) gives a 2M 2 × 1 vector, termed x, made up of complex elements representing coherent light (i.e. x ∈ C 2M 2 ).
Next, we consider the pixels of the camera sampling an M × M area at plane Y . Considering the two polarisations this can be converted to a 2M 2 × 1 vector y ∈ C 2M 2 . Treating the fibre as a linear scattering medium, the input and output fields, x and y are related by linear propagation integrals [12], which manifest as a TM, A, when discretised so that: where A is a 2M 2 × 2M 2 complex matrix. Imaging through fibres usually requires the recovery of x based on measured y. This in turn requires the recovery of A either directly [8] or indirectly [9], both of which require measuring pairs of known input-output relations (x, y).
Sampling at points on a regular grid (often called the canonical basis [29]) is not the only way of representing input and output fields. If vectors x and y are 2M 2 dimensional, we first consider arbitrary coordinate transformations represented by square 2M 2 × 2M 2 matrices, T. Equation (1) then becomes: where A is the TM expressed in the transformed coordinates (i.e. T −1 A T = A). If the rows of T are linearly independent they form a new basis for expressing input/output vectors and this transformation is termed a change of basis. This basis need not be orthonormal, but orthonormality is experimentally preferrable as it minimises redundancy and ensures numerically stable inversion.

B. Upsampling and Downsampling
The coordinate transformation matrix, T, need not be square: it can be 2N 2 × 2M 2 (with N < M), representing a linear projection or downsampling of the input. Equation (1) then becomes: where T D is the forward downsampling matrix and T U is the forward upsampling matrix, and we require that T D T U = I, where I is the 2N 2 × 2N 2 identity matrix. The transformed TM, A , is of size 2N 2 × 2N 2 which has significant computational benefits. A 1000 × 1000 resolution camera (i.e. M = 1000) might require a 2 × 10 6 × 2 × 10 6 TM, consuming 58.2 TB of memory with double-precision floating point complex numbers. Downsampling to N = 100 shrinks the memory requirement to 6.0 GB while still accounting for 20,000 propagating modes. The minimum value for 2N 2 , the dimension of the downsampled TM, without loss of information can be determined by considering A as a multiple-input multiple-output information carrying channel [30]: 2N 2 should be ≥ Q, the number of non-zero singular values of A. This cutoff can be computed at a particular wavelength for well-defined waveguides (e.g. MMF [31]) or determined empirically using a very large number of measurements [32], [33]. If the number of experimental measurements, P , is known to be less than Q then minimal information loss is achieved with 2N 2 = P .
There are multiple methods of downsampling (i.e. determining T D of Equation (3)). A simple approach is to select 2N 2 pixels from the 2M 2 available pixels. For example, if a scanned spot basis is used, the pixel nearest the centre of each spot position could be selected. T D would then resemble a permutation matrix with each row containing a single 1 and no other non-zero elements. For MMF a suitable downsampling is achieved with a basis of eigenmodes computed for circularly symmetric refractive index profiles (e.g. Laguerre-Gauss or Hermite-Gauss). This is achieved experimentally by displaying a sequence of holograms on the SLM that act as complex spatial filters for this basis and then measuring the complex correlation coefficient [31]. In both these examples, downsampling enables fewer experimental measurements to be recorded, significantly reducing memory usage as discussed above. However, the actual measured TM now available for image reconstruction is the matrix product T U A , determined by measuring pairs of vectors (T D x, y).
When recovering a distal image, x, from a measured field, y, we require the inverse TM so rewrite Equation (3) as: where (..) + represents a general inverse such that T D T + D = I and T + U T U = I. T + U is now termed the backward downsampling matrix and T + D is termed the backward upsampling matrix and it follows from Equation 3 that T + U T + D = I. T + U has the implicit role of determining which pixels in y carry the most relevant information required for reconstructing x. T + D has the role of interpolating recovered points to form an image, x, of dimension larger than the fundamental TM, A. In the simplest case, this can be defined to implement a linear interpolation between points.
Next, we consider the inverse of the recorded matrix product, denoted B: Given some appropriate interpolation, T + D , we can directly reconstruct x from a recorded y using B. However, since T + U is dependent on the exact imaging system and basis used, we wish to decouple its effect from that of A −1 which is considered a more fundamental fibre property that can be used to examine, for example, the fibre eigenmodes.
There are many possible candidates for T + U given a known T D . For the MMF case, one can be constructed using the conjugate transpose of the Laguerre-Gauss or Hermite-Gauss basis, which effectively uses prior knowledge of the ideal waveguide modes (and optical reciprocity) to optimally utilise all available power [34], [35].
For the MCF case we consider an example using data from [8] where T D is a downsampling permutation matrix that selects 1648 rows (the number of measurements, chose for experimental practicality) of the 2.88 million (1200 × 1200 in 2 polarisations to match camera) available at plane X. The requirement T + U T + D = I could be satisfied trivially by setting T U = T D but this implies that T + U is a permutation matrix like T D and that there is no useful power between sampling points, which is physically unlikely. To find a more realistic T + U , we define some required properties of the factorisation of the measured matrix product, B, using its singular value decomposition, This ensures that the upsampling and downsampling bases invert one another.
2) The singular value decomposition of B will have the same left singular vectors, U B , as the singular value decomposition of (A ) −1 . This is because (A ) −1 is the leftmost term in the factorisation of B.

3) T +
U should ideally be an orthogonal basis, i.e. it should not discard information during downsampling. Therefore, the singular values of (A ) −1 should be the same as the singular values of B. 4) Ideally, (A ) −1 should be symmetric so as to ensure its eigenvectors are orthogonal. However, optical losses or improper sampling can create the appearance of asymmetry so this requirement is not strict. Again, selecting T + U = T D , a permutation matrix in this case, tends to produce a poorly-conditioned (A ) −1 (violating the third requirement) because only a small fraction of power is coupled to the specific pixels sampled.
An improved approach is to sum pixels in the neighbourhood of each sample point (e.g. all pixels that are closest to that point than any other, the Voronoi cell). This approach utilises the expected light confining and wave-guiding properties of the MCF structure. We generate an estimated sampling matrix, termedT + U , by setting the appropriate columns in the Voronoi region to 1 for each of the 2N 2 rows. This is still suboptimal, as the elements of the optimalT + U might take any complex value. To proceed, we approximate the expected amplitude profile of (A ) −1 , termed A amp : where | · | is the element-wise modulus and we use the transpose ofT + U as an approximate inverse because each row (T + U ) comprises an approximately equal number of non-overlapping 1 s. Next, we determine the left singular vectors and the singular values of (A ) −1 by finding the left singular vectors and singular values of B, using a singular value decomposition such that these are both 2N 2 × 2N 2 matrices.
Finally, we estimate the full complex (A ) −1 using a novel error reduction (or alternating projection) iterative algorithm [36]. The algorithm developed here alternates between constraining the estimated right-hand singular vectors of (A ) −1 to form a unitary matrix and constraining the amplitude of (A ) −1 . The projection operator is a multiplication by U B S B (or its inverse). This process is depicted in the flowchart of Figure 2. The appropriate backward downsampling matrix, T + U can be determined from (A ) −1 and the measured B. The final step is to set the appropriate elements of T + U to 1 to satisfy T D T U = I N . The resultant basis satisfies the first 3 requirements and is comprised of spots translated across the fibre facet with randomised phase profiles ( Figure 3a shows an example element). The average amplitude envelope, with a full-width half-maximum of 40 μm, is shown in Figure 3b. An example estimated (A ) −1 is shown in Figure 3c and it can be seen that the matrix is approximately symmetric.

C. Approximating Eigenmodes of MCF
The upsampling and downsampling matrices can be used to express the TM as a square matrix and therefore compute the eigenmodes (or eigenvectors) and eigenvalues. The magnitudes of the eigenvalues are all close to 1 (Figure 4b), showing that there is near-minimal loss (equivalently, near-maximal information transfer) through (A ) −1 . The eigenmodes can be plotted in the original 1200 × 1200 pixel frame of plane y using the upsampling matrix and are seen to have power uniformly spread across the fibre facet with randomised phase (example shown in Figure 4a). This agrees with theoretical work predicting that the eigenmodes of MCF are supermodes filling the entire facet [21]. Only a subsection of the full fibre is characterised to reduce computational load (indicated by the coloured or black areas) and the shape is due to the rectangular array of spots used to parallelise measurements (see Figure 11). Other basis elements appear similar but translated with randomised phase. b) By averaging the amplitudes of many basis elements and taking a cross section, an approximately Gaussian (or perhaps hyperbolic secant) amplitude envelope is observed. c) Visualisation of the reduced inverse matrix, (A ) −1 . It is observed that it is broadly, though not exactly, symmetrical suggesting that an approximately orthogonal eigenbasis can be found.

D. Other Useful Bases for MCF
In MMF the theoretical eigenmodes (e.g. Laguerre-Gauss basis) typically produce a sparse TM in which most of the elements are zero [31]. The sparsity enables parallelised characterisation (see Section IV-B) and the theoretical model aids physical insight.
By contrast, the MCF eigenmodes are highly complex with very heterogenous phase profiles and so any slight pertubation (e.g bending) may result in a very different set of eigenmodes. These eigenmodes are not then a robust choice for a sparse basis. A more robust sparse basis can be constructed with inspiration from the upsampling basis of Section III-B each basis element is a spot with Gaussian amplitude profile and flat phase, translated to different positions. New basis elements are easily created on the fly by tilting a mirror or displaying a blazed grating on an SLM, instead of needing to store large libraries of bespoke holograms as with the MMF sparse bases. This basis is a Fourier conjugate of the 'angled plane wave' basis [14], [37]. Physically, the sparsity arises from the lateral confinement of light and can be exploited to parallelise measurements (Section IV-B).
There are other practical considerations when deciding an experimental basis. For example, if the basis is being projected onto to the MCF facet via a lens, there is a trade-off between minimising distortion at the edge of the MCF facet, achieved with long focal length lenses and having a small of the Gaussian spot, achieved with short focal length lenses. Phase-only SLMs can only redistribute light, rather than block it, so it is difficult to fully 'turn off' a polarisation arm (with reference to Figure 1) to create a pure linear polarisation basis. An elliptical polarisation basis with phase-delay between polarisation may be more reliable in such cases [8]. If using binary phase or amplitude SLMs to increase speed, a Hadamard basis may be appropriate as it is easy to generate [29]. The basis choice may also be application specific: for example, Fourier and wavelet bases enable examination of scattering properties useful for diagnostic tissue imaging [9].

E. Effect of Bending
In order to reduce core-to-core coupling, commercial MCFs have randomised numbers of, spacings between and diameters of cores [17] making it difficult to model bending deterministically as has been demonstrated for MMF [35]. We therefore adopt a statistical treatment based on experimental measurements. Using the set-up of Figure 1, we measure the TM of a 2 m piece of MCF bent in Q different configurations around a series of posts, creating a range of different bend radii down to 35 mm (to avoid breakage).
We first perform a singular value decomposition of each of the measured TMs, A q . The singular values hardly change under the different bending conditions covering both large and small bend radii (Figure 5a). This suggests that under typical bending conditions the power (or information) loss of the MCF is not significant. This agrees with studies on MMF that have shown that very tight bend radii (< 14 mm) are required before significant information is lost [38]. We then produce a matrix C by vectorising the measured TMs in column-major order and concatenating: We perform a principal component analysis of C (equivalent to a singular value decomposition) to check for any bendinvariant modes. The resulting principal values are all nearly unity (within 0.1%) indicating that the TMs are nearly perfectly orthogonal and that no signficant bend-invariant modes are found (Figure 5b). By contrast, MMFs with precise parabolic refractive index profiles possess a set of bend-invariant eigenmodes [39].
Next we wish to characterise how these TMs change with bending. We do not expect to observe a significant memory effect, as seen in previous work [40], because our fibre is longer (2 m vs. <30 cm) and we are using a longer optical wavelength (850 nm vs 530 nm) resulting in increased core-to-core coupling. We therefore model the TMs as random variables. First, we investigate correlations between TM elements. For 7 different bending conditions (denoted by matrices A 1 · · · A 7 ) we compute the cross-correlation between TM elements at row r, column s and row t, column u as: where a denotes an element of A 1 , b denotes an element of A 2 etc. Each TM element, e.g. a r,s , represents a coupling between a point on the input facet (x r , y r ) and a point on the output facet at location (x s , y s ). Because correlation compares pairs of TM elements we consider a second point, (t, u) and define the following quantities: conditions. The zoomed inset shows coupling of non-zero mean between fibre cores (Rician fading), which appears as stripes, and zero-mean coupling between cores and cladding (Rayleigh fading), which appears as black speckle.
For fixed input coordinates (e.g. elements a r,s and a r,u ), S simply represents the distance between output points. We might then expect an inverse relationship between ξ and S. We next compute S and ξ for a random subset spanning 10% of possible TM element pairs (to reduce computational load) and observe an inverse relationship (Figure 6a).
Correlation drops to 0.5 by S = 0.3 μm, the physical distance mapped to adjacent camera pixels, and is comparable to the diffraction limit of ∼ λ/2 ≈ 0.42 μm. This decorrelation within a core may be due to multimodal propagation [42] or to fields in the cladding [43]. The correlation drops further to 0.2 by S = 1 μm, the average core radius [41], followed by a long tail extending to 50 μm, the approximate width of the amplitude envelope of the ideal upsampling basis (Figure 3).
The low correlation enables the TM elements to be modelled as independent random variables. Each element is formed by the coherent addition of light propagating via many paths so its amplitude can be modelled by a Rician distribution, borrowed from the concept of Rician fading in wireless communications [44]. The Rician distribution is derived as the amplitude of a complex circularly symmetric Gaussian distribution and has two parameters: ν, representing the distance of the mean of the underlying Gaussian from the origin, and σ representing the standard deviation.
We determine maximum likelihood Rician parameters across 7 bending conditions using TMs downsampled via the process of Figure 2. The results are shown in Figure 6b. There is a strong diagonal element meaning that a significant amount of light is confined or guided. This is expected because MCF forms approximate images without correction.
Zooming in, we observe that ν exhibits significant offdiagonal components forming a 'streaked' pattern ( Figure 6b). This is because the input and output sampling functions may be centred either on a core or in the cladding. Core-core and cladding-cladding coupling results in non-zero mean, ν, and thus Rician fading, while core-cladding or cladding-core coupling is more likely to have zero mean and thus Rayleigh fading (producing the black speckle observed in Figure 6b). Just as Rayleigh fading in radio systems is due to indirect reflections off objects, here it is due to indirect coupling caused by bending. This indicates that in terms of mean power coupling, core-core or cladding-cladding modes are less sensitive to bending. However, the Rician fading model does not specify phase and we observe empirically that the phase of all TM elements is uniformly distributed ∼ U (0, 2π), making useful prediction difficult and preventing the existence of truly bend-invariant modes.

F. Correcting for TM Drift
Over time the TM of the fibre will vary due to perturbations such as bending or temperature changes. If these perturbations can be tracked or predicted the TM can be adjusted to avoid deteriorating image quality [35], [45]. A zero-order model tracks the global phase over time relative to a 'reference beam'. In interferometric systems this entails tracking drift between the signal and reference arms [45]. With MCF an alternative reference beam is created by projecting a constant field onto a small set of cores (see Figure 7a and b).
A first order model tracks phase tilt, which arises due to the memory effect that is observed when MCFs are bent very small [40]. The tilt can be considered to arise from bending-induced path length differences. By displaying a constant reference pattern on SLM1 of Figure 1 and repeatedly imaging the field at plane Y , a time-varying phase tilt is observed (Figure 7c). Further insight is gained by observing the phase tilt drift under different bending conditions (Figure 8). Bending is quantified by averaging the absolute value of curvature over the fibre length. Curvature is measured by fitting an osculating circle to the fibre path traced from an image. It is noted that higher curvature is linked with a higher rate of tilt drift, with an upper bound that is approximately a negative exponential curve with a time constant of the order of minutes. This is consistent with a simple heating model (e.g. Newton's law of cooling). We therefore hypothesise that a small amount of light couples out of the MCF (especially at sharp bends) and is absorbed by the protective sheath, slightly heating it which in turn induces small differential bending. This  'micro-bending' may fall within the memory effect range of this fibre hence producing a phase tilt. At lower curvature the drift of phase tilt is observed to be slower, but still with the same general increasing trend. Further experimental investigation of the effect of varying laser power, which is here fixed at 35 mW, is required to fully verify this thermal drift hypothesis.
Superimposed on this exponential trend, we observe random fluctuations with time scale of order ∼1 minute which limits the minimum time between tracking measurements. If this time is less than the TM characterisation time, then phase tilt correction must be applied to each of the characterisation measurements [8]. Failure to do so results in significant TM error and poor image recovery (Figure 9). The relative stability of polarisation retardance (i.e. birefringence) suggests that the cause of this drift in tilt is minor path length changes and that there is negligible contribution from stress and strain [35].
If the tilt magnitude drifts above 0.04 there is significant residual noise in the reference phase images even after correcting for tilt ( Figure 9). This is because bending has moved beyond the memory effect range and has changed the TM in an unpredictable way, requiring it to be re-measured in full. To avoid this for most realistic bending configurations, matrix characterisation and imaging ought to be performed within about 4 minutes, as per Figure 8.

A. Aligning MCFs
A key advantage of MCF over MMF is that TM characterisation in a pixel (or canonical) basis does not require precise transverse alignment but still provides a reliable sparse representation basis for the TM (see Section III-A). By contrast, the Laguerre-Gauss basis that provides sparse TMs in MMF requires extremely precise transverse alignment with the central axis of the fibre, often to within fractions of a micron. However, MCFs do require some alignment: first, the characterisation patterns on the fibre facet (plane X of Figure 1) must be in focus. The relatively high lateral confinement of light (Section III-E) means amplitude images are approximately formed through the fibre without correction and so can be used to evaluate focus, for example by using a recognisable text sequence.
Next, the two polarisations much be aligned for reliable production of elliptical polarisation states. To do this, the vertically polarised beam is first 'turned off' by displaying a random pattern on the appropriate half of SLM1 to scatter light. The other half of SLM1 (i.e. horizontally polarised beam) displays a blazed grating and scans the x and y pitch, which in turn scans a spot across the distal facet. The camera measures the Fig. 10. Aligning the horizontally and vertically polarised characterisation beams for dual polarisation MCF characterisation. a) For each polarisation a grating is used to scan a spot in two dimensions and the centroid of each on the output facet is determined. b) A hyperplane is then fitted to each polarisation to average distortion introduced by the fibre TM. This is then used to adjust the tilt on the vertical polarisation so that it is aligned with the horizontal polarisation. distorted spots at the other end of the MCF and the centroids are determined (Figure 10a). A 2D plane embedded in 4D space is then fit to the centroid positions to average out distortions introduced by the fibre TM. The result is a precise map between grating pitch and spatial position. We repeat the process for the vertically polarised beam with the horizontally polarised beam 'turned off', and find the relative spatial offset between the two fitted planes. This offset is used to adjust the pitch of the vertically polarised grating and hence align the two polarisations (Figure 10b).

B. Parallelising Calibration Measurements
The sparse structure of MCF TMs when using a spot basis (see Section III-D) means that separate areas of the TM can be characterised in parallel. This is because there are rows of the TM that have no power overlap with any other rows. Power can also be coupled into two or more locations at the input facet that will not produce overlap at the output facet -for example, two spots at opposite sides of the fibre. By selecting sets of rows for which this property holds between all pairs, a maximally efficient parallel set of measurements can be achieved. For a spot basis this means determining how far apart spots needs to be spaced to avoid significant power overlap at the output.
Empirical measurement for the MCF used here leads us to the spot array of Figure 11a. Each single physical measurement is split into 'virtual' independent measurements by isolating each spot (Figure 11b). This enables a dramatic speed-up in experimental time -12-fold here -and 1600 modes can be characterised in 50.8 minutes [8]. Characterisation speed could be improved significantly further by using high-speed digital micromirror devices (DMDs) instead of liquid crystal spatial light modulators [2], [46]. When using measured input and output fields to reconstruct the TM, sparsity can be further exploited by noting that most elements of any given column of the TM will be zero and can be excluded from calculations, thus reducing computational requirements (Figure 11c). Fig. 11. Exploiting sparsity to parallelise TM characterisation: a) Using an array of spots spaced sufficiently far apart that the output fields do not overlap. b) After measurement, the data can be masked to split each single measurement into several effective measurments (12 in this case). c) If reconstructing the inverse TM (A −1 ) column by column, only rows within the specificed subregion for a given input (which is different for every input/column of A −1 ) will be non-zero -the rest can be excluded from calculations. Adapted from [8].
Such parallelisation is possible in other systems that exhibit sparsity, for example a MMF with a Laguerre-Gauss basis. In the MMF case, however, the strong axial symmetry means that precise alignment is required to achieve parallelisation making it practically difficult. Further, a large pre-generated library of holograms is required whereas here we simply need to generate different blazed gratings [47].

C. Non-Interferometric Phase Recovery
Much fibre characterisation and imaging work uses interferometry to determine phase from camera measurements [15], [23]. This approach is fast but requires high coherence lasers, ruling out most low-cost diode lasers, and is very sensitive to drift, thermal fluctuations and vibrations [40].
Non-interferometric phase imaging (or phase retrieval) provides greater stability and permits less coherent lasers at the expense of increased experimental and computational time. The experimental set-up of Figure 1 uses a non-interferometric method that involves generating a through-focus stack of images at many different focal planes. This is achieved by displaying a parabolic phase mask on one half of SLM2 (representing one polarisation) that defocusses the beam [8], [48], shown in Figure 12a. The other half of SLM2 displays a random hologram to scatter light, effectively deactivating the other polarisation. An iterative algorithm then simulates optical propagation between the focal planes using Fresnel diffraction and constrains the amplitude at each plane. After typically 200 iterations, this converges to the desired phase profile [49]. Fig. 12. Non-interferometric imaging of amplitude, phase and polarisation. a) A parabolic phase mask displayed on the horizontally polarised half of SLM2 is used to generate defocussed images of the object on the camera. 7 different parabolic masks are used to generate a through-focus stack, from which phase is recovered using an iterative algorithm [49]. b) Phase stepping the vertically polarised half of SLM2 and interfering it with the horizontally polarised image via a 45 • polariser enables phase-shift interferometry between the two polarisations.
To measure the full polarisation state, both halves of SLM2 are enabled and are interfered on the camera via a 45 • polariser. One half of SLM2 is then stepped through different phase levels from 0 to 2π, effectively performing phase-shift interferometry. A curve is fit to the amplitude sequence at each pixel giving the relative amplitude and phase of the second polarisation ( Figure 12b).

A. Measurement of Polarimetric Parameters
Polarimetric imaging measures how objects alter the polarisation state of incident light. Applications include examining molecular structure, e.g. chiral molecules like glucose [50], and quantifying optical heterogeneity for detecting diseases such as cancer [8], [51]. Polarimetric data is typically represented using either the Mueller-Stokes formalism or the Jones formalism, although with temporally and spatially coherent light, as is the case here, the two become equivalent [52].
The Jones formalism is a special case of the dual-polarisation transmission/reflection matrix formalism presented in Equation (1) where light couples only between polarisations and not spatial locations. This gives a 2 × 2 matrix at each point (termed a Jones matrix) that relates a 2D input field vector to a 2D output field vector (termed Jones vectors). We must create at least 2 distinct Jones vectors incident on the sample and measure the associated Jones vectors after transmission to unambiguously determine the Jones matrix at a point on the sample. At some location on the sample, (x, y), consider n output Jones vectors, V x,y = [v 1 (x, y) · · · v n (x, y)], arising from n distinct input Jones vectors, U x,y = [u 1 (x, y) · · · u n (x, y)]. The 2 × 2 complex Jones matrix, J x,y , can be determined through: where † indicates a Moore-Penrose pseudoinverse. The multiple distinct input Jones vectors could be generated either through a separate fibre, e.g. a polarisation maintaining fibre [53], or by illumination through selected cores of the MCF [54], though the latter may require advance knowledge of the fibre TM.
The Jones matrix can be further factorised to produce more easily interpretable parameters. This requires defining a model and fitting data to it -here, we use a model of an elliptical retarder followed by a partial polariser (Figure 13a).
Factorising J x,y in terms of these two components gives: (14) where R(θ) is a rotation matrix and the 5 resolved polarimetric parameters are diattenuation, D, diattenuation axis orientation, θ D , retarder circularity, ξ, retardance, η, and retardance axis orientation, θ η with the following ranges: For each set of parameters, there is a 7-fold degeneracy and so for display purposes the degenerate set closest to some fixed point is used [8]. We perform this factorisation using Bayesian inference due its robustness to noise and overfitting [56]. We first apply Bayes' theorem to model the joint probability distribution of parameters (θ θ θ = [D, θ D , ξ, η, θ η ]) at location (x, y) conditional on the known input and measured output Jones vectors, termed the posterior distribution: The elements of V x,y , denoted v ab (x, y), represent measured complex quantities and are assumed to be independently distributed complex Gaussian variables (see Section III-E): where u ab (x, y) is element (a, b) of U x,y , σ 2 I is the covariance matrix, and CN (μ, Σ) is a 2-D complex Gaussian distribution of mean μ and covariance Σ. σ, the noise standard deviation, can be inferred from the data along with the other parameters. This enables evaluation of the first term of the RHS of Equation 16. The second term, p(θ θ θ) represents the prior distributions of parameters, which we will assume are independently distributed (i.e. p(D, θ D , η, θ η , ξ) = p(D)p(θ D )p(η)p(θ η )p(ξ)). Joint prior distributions could be derived using more restrictive physical models or empirical methods such as copulas. The prior distributions could be uniform distributions across the parameter ranges of Equation (15), giving broad uninformative priors. However, more restrictive priors based on physical intuition improve results: for example, biological samples rarely exhibit high degrees of linear diattenuation [57] so our prior for D would have a peak at zero. Phase values require a circular distribution: here we use the von Mises distribution, which can be made more restrictive using a non-zero κ value.
With the priors selected to suit the application, parameters are estimated from Equation (16) either via optimisation (to find maximum likelihood), or Monte-Carlo simulations (to examine parameter distributions). Figure 13b shows Bayesian polarimetric imaging of a birefringent test target (R2L2S1B, Thorlabs) through an MCF using the experimental set-up of Figure 1 with a spatial resolution of 36.0 ± 10.4 μm (adapted from [8]). The target should have a background θ η = 0 and a foreground θ η = π/4 ≈ 0.78, but the measured mean θ η is slightly lower at 0.65. This discrepancy may arise because the target is used outside the design wavelength range, resulting in different behaviour of the birefringent polymer. The grid-like artefacts arise from the slightly non-uniform illumination within each single frame becoming pronounced when multiple single frames are stitched together as the target is translated. More details of the experimental set-up, as well as validation of additional polarimetric properties can be found in [8].
Though we select a particular physical model to fit here, the Bayesian approach can actually compare many different possible models by evaluating their likelihoods, a process called Bayesian model selection. The approach is easily extended to consider the joint probabilities with neighbouring pixels and perform spatial smoothing (see [8] for further detail).

B. Entropic Imaging for Tissue Analysis
Another emerging application of imaging through MCF fibres is imaging spatial entropy. This represents a measure of the variation of some parameter across a surface and has proved useful in identifying amorphous structures arising in diseased tissue [58]- [60].
Coherent imaging through MCF provides multiple parameters for which entropy could be computed either individually or jointly: amplitude, phase and inferred polarimetric properties. Spatial entropy can be computed approximately by a windowed filtering process: values within the filter window are binned and the resulting histogram integrated to compute entropy [61]. This has the downside that selecting the appropriate binning level can significantly affect results, a problem that grows significantly worse when estimating joint entropy between multiple parameters.
Alternatively, we can consider the Kullback-Leibler divergence, which measures the similarity of probability distributions P and Q with density functions p(x) and q(x) (the 1-D case) respectively: Setting Q to be a uniform distribution, D KL (P ||Q) becomes a measure of how 'spread out' the distribution P is termed the differential entropy, H: This measure can be extended to multivariate distributions, p(x 1 , . . . , x m ), simply by integrating over the additional variables. We then compute spatial entropy by fitting a multivariate distribution the desired parameters within a spatial window and computing entropy via Equation (19). Figure 14 shows how imaging of phase entropy through MCF can be used to detect small tumours in tissue due to increased light scattering (adapted from [8]). The spatial resolution is of the order of 100 μm due to the spatial windowing required for entropy, and the precision of phase entropy is ∼ ±20%.

VI. CONCLUSION
In this paper we presented new empirical results useful for imaging through MCF. We first presented a new method of determining appropriate upsampling and downsampling schemes for experimentally measured non-square matrices and used this approach to, for the first time, experimentally determine the eigenmodes of an MCF. Next, we presented a novel statistical analysis examining the effects of bending on MCF inspired by wireless fading models. We experimentally observed bending-dependent TM drift, conjectured to be produced by a self-heating effect, and discussed strategies to compensate. Three important practical techniques for enabling MCF imaging were then discussed: alignment, parallelisation of transmission matrix characterisation, which offers greatly improved speed, and non-interferometric phase recovery, which offers improved stability. Finally we discussed two recent applications of MCF imaging: polarimetric imaging using a Bayesian inference approach to compensate noise, and entropic imaging for examining light scattering properties of samples with applications to cancer imaging.
The range of biomedical imaging techniques demonstrated through MCF continues to expand. Implementing these in a very thin form factor is a significant step towards minimally invasive in vivo biomedical imaging, as early experiments in mice brains have demonstrated [1]- [3]. To develop these techniques towards clinical translation, two key challenges remain. First, fibre TM characterisation must be fast enough to compensate dynamic bending and temperature-induced fibre distortions in vivo and allow imaging at several frames per second. Some progress towards this has been achieved with high-speed digital micro-mirror devices (frame rates >22 kHz) and high speed cameras [2], [46]. The second, more fundamental problem is the need to precisely measure the dynamic effect of bending and temperature on the TM during use and without compromising the ultra-thin form factor. One proposed method is to adjust a pre-measured TM using precise modelling of bent MMF [35] but in the case of MCF with randomised and complex refractive index profiles, precise modelling is likely infeasible. Guide star approaches provide another option but may compromise the ultra-thin form factor by adding bulk at the distal facet [62], [63]. A recently proposed approach aims to overcome this by exploiting a compact reflector structure at the distal facet [64], though experimental implementation remains to be achieved.
ACKNOWLEDGMENT Data sets used for this paper and associated code are available at https://doi.org/10.17863/CAM.40552.