A Fast and Robust Framework for 3D/2D Model to Multi-Frame Fluoroscopy Registration

Three-dimensional (3D) kinematic analysis plays an important role in improving diagnosis and in the evaluation of treatments and surgical procedures. For example, measuring the 3D kinematics of knee joints is essential for understanding their normal function and diagnosing any pathology, such as ligament injury and osteoarthritis. Image registration is a method which can be used to compute kinematic measurements without involving the introduction of instruments into the body. However, in these techniques, the trade-off between accuracy and computation time is still a challenging problem which needs to be addressed. In this paper, a fast and robust registration method is proposed for the measurement of post-operative knee joint kinematics. Using this method, after total knee arthroplasty (TKA) surgery, a 3D knee implant model can be registered with a number of single-plane fluoroscopy frames of the patients’ knee. Generally, when the number of fluoroscopy frames is quite high, the computation cost for the registration between the frames and a 3D model is expensive. Therefore, in order to speed up the registration process, we apply an interpolation-based prediction method, to initialize and estimate the 3D positions of the 3D model in each fluoroscopy frame. The estimated 3D positions are then fine-tuned. The experimental results, which were performed on the knee joints of 18 patients post-surgery, show that the computational time required to register each frame for each bone using our proposed method is only 67 seconds, which is much faster (almost 6.5 times faster) than the best existing registration method (a registration method based on sum of conditional variance (SCV) similarity measure) while maintaining almost the same accuracy. The average of the mean difference ± standard deviation of the proposed method for femoral and tibial bones for translation and rotation parameters are 0.0603 ± 0.2966 (mm) and −0.0069 ± 0.2922 (degree) respectively.


I. INTRODUCTION
The success of total knee arthroplasty (TKA), which is a surgical procedure for restoring the function of a knee joint, is commonly evaluated using the level of satisfaction with the outcome reported by the patient. Also, to more quantitatively The associate editor coordinating the review of this manuscript and approving it for publication was Kumaradevan Punithakumar .
analyse the success of the surgery and the condition of the patient's knee afterwards, several methods have been used to measure 3D knee kinematics. Kinematic analysis can also help to improve the design of knee implants. In a study by [1], knee kinematics were investigated to determine the effects of improving implant design through customized TKA components. Regarding the methods used for kinematics measurement, although roentgen stereo photogrammetry analysis (RSA) is an accurate and popular method for kinematic analysis, it is invasive because of its requirement for tantalum beads to be implanted into the patient's bones. This method is based on radiographic examinations of calibration cages and object markers implanted in the skeleton. Three-dimensional motion analysis can be provided by accurate measurements of radiographs and computer-assisted calculations. RSA requires the following steps: implantation of tantalum markers, the radiographic examination and measurements on radiographs followed by mathematic calculations [2].
In [3], a bi-planar RSA method was used to compute knee kinematics after arthroplasty surgery. In this approach, bi-planar (two x-ray radiographs) fluoroscopy is used, and instead of using 3D models of the knee's implants, the positions of the implants were reconstructed using geometric features of the designed components including its planar surfaces, straight lines, pegs and circular edges. However, in this method, since some clusters of markers are attached to the knee bones and subsequently tracked by a camera system, it is considered to be invasive.
Skin-mounted marker tracking methods are non-invasive techniques which can be used to measure human joint kinematics. However, they are prone to inaccuracy because the markers may move because of skin movements and they cannot measure the exact motions of the bones underlying the markers [4]. To improve their accuracy, some approaches used a force plate coupled with a video-fluoroscopic system to measure TKA kinetics and kinematics [5], [6]. In these techniques, implant components were tracked directly by x-ray cameras to compensate for their low accuracy which may have been caused by the artifacts of skin movement. A tracking-based method using multi-body kinematics optimisation (MKO) and extended Kalman filters (EKFs) to decrease the effect of the soft-tissue artifact (STA) and provide a more accurate estimation of joint kinematics was proposed in [7].
To address the previously mentioned problems of providing 3D kinematics, including being invasive, inaccurate and requiring a high computation time, registration methods have provided non-invasive techniques with high accuracy for kinematic measurements. In [8] a registration method for conducting a 3D post-operative analysis of a TKA's components to determine the condition of a patient's knee using the normalised correlation coefficient (NCC) similarity measure was developed. Using this method, implant and bone models of the knee are registered with dual x-ray images of the knee. Although a simulated annealing global optimisation method was proposed in this study, its registration algorithm may still converge to a local minimum which can cause errors in the estimated 3D position of the components. A new, improved 3D-2D fluoroscopy registration algorithm for TKA analysis was developed in [9]. It focused on the out-of-plane translation and rotation movements which are difficult to measure precisely using a single-plane approach, with the experiments performed by registering only 3D CT images of a Sawbones model with 2D fluoroscopy frames.
In [10], a real-time 2D to 3D registration method based on a convolutional neural network (CNN) was proposed. In experiments, it was applied to measure TKA kinematics by registering the 3D model of a knee prosthesis with the fluoroscopy video of its implants. A laser scanner was used to acquire the model, and then a binary volume constructed from the scanned data. Although this method was accurate, fast, robust against large initial displacements and applicable for real-time applications, it required a GPU implementation based on a special workstation. In a clinical setting, information of dynamic joint kinematics extracted from videos of the movements of the knee joint is very useful. However, one of the issues when extracting information from videos is that the process of registration for a large number of images can be quite time-consuming. For example, to provide information of 3D knee kinematics, approximately 300 2D fluoroscopy frames are typically registered with 3D models of the femoral and tibial components of the relevant artificial knee and a registration algorithm must be run at least 600 times.
Most registration methods [11]- [13] applied in this medical area registered a 3D model to a fluoroscopy video using a frame-by-frame approach, as shown in Figure 1, where f denotes a frame and n is the total number of frames which should be registered. In this approach, the registration output, which showed the correct position of the bone in a 3D model for one frame, could be used as the initial position for the model for the next frame. However, all the frames must have been registered which is not efficient in terms of computational time. An automated registration method for registering 3D models of knee implants to single-plane x-ray images was proposed in [11]. It applied a pyramidal similarity metric and Lipschitzian optimisation routine for registration, was robust against the noise from bones and soft tissue and had a reasonable computational time of almost one minute per frame. However, this time is computed when a GPU based version of the registration method was run using a 4GHz Intel Core i7 CPU in conjunction with an Nvidia GeForce GTX 970 GPU. In this method, when a fluoroscopy frame was to be registered with a 3D model, a registration estimate based on the previous registered frame was initialised for the current frame. However, one of its drawbacks is that the initializations of the subsequent frames would not be appropriate which may result in mis-registration if a large mis-alignment occurs in one frame. Although one solution is to search a larger range of values for the geometric transformation parameters, this process is very time-consuming. A robust 3D computer-aided design (CAD) model to 2D single-plane registration method was proposed in [14]. In this method, a combination of an intensity and a contour matching score accompanied with a simulated annealing optimization algorithm were used. However, it had limited robustness to large initial displacements of a component's position in a video frame, and when an initial starting position was not close to the correct answer, it required up to 50 percent user intervention.
Object-tracking methods, such as those based on Kalman filtering [7] and particle filtering [15], [16], can also be used for kinematics measurements. In these approaches, the position of the target object at time t is predicted using the previous information from times 0 to t − 1. A parameterised prediction model is required for a Kalman filter computation while patients' knee joints may have different functionalities [16]. Recently, methods based on predicting implanted knee kinematics have been proposed. In [15], the 3D position of a knee prosthesis was estimated by applying a particle filter algorithm, which involved the three steps of: prediction, observation and resampling. In this method, chronological information was applied and the 3D position of the initial frame was defined manually. However, a great deal of computational time was required for this method. Although the performance of this method was improved in [16] and its computational time decreased, it was only capable of providing the kinematics without the transformation information of the 3D model. It applied fuzzy membership functions to consider the relative positions of the femoral and tibial knee components and omitted the less likely ones. However, this may not provide acceptable and accurate results in some conditions where these position are not in the specific ranges defined by the fuzzy functions.
Registration parameters, or initial deformation fields for registration, were predicted in some learning-based methods [17]- [20] and other studies predicted implanted knee kinematics before surgery [21], [22]. As joint kinematics are not the same for all patients and their patterns can be different [22], a simple mathematical formula may not be capable of modeling these kinematics. In [22], a supervised machine learning prediction model was proposed. Its predicted post-operative kinematics could help create a better design for TKA components and achieve more successful surgeries. In this study, the performance of three machine learning methods (support vector regression (SVR), neural network (NN) and generalized linear regression (GLR)) in predictive model construction for postoperative kinematics prediction were compared. In [21], a machine learningbased method for predicting post-operative knee kinematics was proposed. In this method features are extracted using principal component analysis (PCA), and then a mapping function from pre-operative feature space to post-operative feature space is constructed. One of its advantages is that it could predict these kinematics before TKA surgery. However, it required a very large dataset consisting of a set of pre-and post-operative knee mobility functions to create a model for predicting post-operative knee functions. As access to such a large training dataset is usually not feasible, this method would not be very applicable in a clinical setting.
Most studies that proposed a registration approach for 3D kinematics did not exploit the continuous nature of the kinematic parameters to be measured. Also, the smoothness of knee kinematics is highly important for knee joint analysis, an aspect that was not considered in many studies. Exploiting these characteristics will enable simulations of more natural joint movements.
In this paper, we propose a fast, robust method for registering a 3D model to multiple 2D fluoroscopy frames. To register multi-frame images, firstly, some frames are registered accurately using a registration method based on two recent similarity measures, the edge position difference (EPD) [23] and SCV [24], and a gradient descent optimisation method. Then, the 3D positions of the models are estimated for the remaining frames using an interpolation technique. A final fine tuning step to improve the registration of some frames is adopted when required. Our experimental results show that the proposed method is almost as robust as the SCV registration approach but is much faster. By exploiting the continuous nature of the Kinematic parameters to be measured, the proposed method also provides smooth registration results which can lead to more natural 3D modelling of joint movements. The proposed method will enable simulations of more natural joint movements. Using the proposed registration method, a large number of fluoroscopy frames, for example 300 2D frames, can be registered with the relevant 3D model of femoral and tibial components in a relatively short time. The proposed method gives us the best position of the 3D models which match each fluoroscopy frame. Using the output information, a 3D video of knee joint movements can be viewed, and 3D knee kinematic parameters can be computed to be used for analysing the success of the TKA surgery and the condition of the patient's knee afterwards.
The remainder of this paper is arranged as follows: in Section II, the proposed 3D model to 2D x-ray fluoroscopy image registration method is described; in Section III, the proposed method is evaluated through several experiments; and finally, in Section IV, our conclusions are presented.

II. PROPOSED 3D MODEL TO 2D MULTI-FRAME FLUOROSCOPY REGISTRATION METHOD
Our proposed approach relies on the assumption that the registration results for most frames in a fluoroscopy video can be estimated by predicting their values from the values of a few neighbouring registered frames, and therefore can be computed in a relatively short time.
In the registration, six transformation parameters are found including in-plane parameters T x , T y and R z , and outof-plane parameters T z , R x and R y which are shown in Figure 2.   The proposed method includes the general steps of pre-processing, initial registration, estimations of the 3D transformation parameters, and improvements in registration, as discussed in the following subsections.

A. PRE-PROCESSING
As mentioned in section I, one of the drawbacks of most feature-based registration methods is that, if the segmentation technique applied before the registration process is not suitable, the registration results may not be reliable [25]. In our proposed framework, a registration method based on the normalised EPD (NEPD) [23] and SCV [24] similarity measures is used. While the fluoroscopy frames do not require segmentation for the SCV method, they need proper segmentation when registered using the EPD measure. As different fluoroscopy frames for each patient can have different levels of intensity, contrast and illumination, a non-adaptive segmentation method with a fixed threshold may result in poor segmentation results. Also, although implants made of metal usually have different and distinctive intensity values compared with those in other areas of the frames, in some fluoroscopy frames, the implant and its close neighbours may have similar values. This characteristic of fluoroscopy frames makes segmentation quite a difficult task. Figure 3(a) shows a fluoroscopy frame of a patient's knee joint after TKA surgery in which the intensity values, particularly near where the tibial component is implanted into the bone, are very similar to those of the implant. Furthermore, occlusions and low object-to-background contrast can be seen in the fluoroscopy frames as, at the time the image was captured, the positions of the legs may cause an overlap of their projections. For example, Figure 3(b) shows a fluoroscopy frame of a patient's knee joint after TKA surgery in which the femoral and tibial components overlap each other.
In the proposed method, the fluoroscopy frame and the 2D projected model of the knee bone to be registered are denoted by R and I respectively which are converted to binary edge images E R (x, y) and E I (x, y) respectively when the registration method based on the NEPD is used. An adaptive segmentation method, which automatically provides proper segmentation results for different patients is adopted. This adaptive method uses an approach that computes an appropriate image threshold using local firstorder statistics for each image [26]. The method is applied using the adaptthresh Matlab function using a Gaussian probability distribution whereby, for each pixel, a local threshold is computed by the Gaussian weighted mean of its neighbourhood. The output from the adaptthresh function is a matrix which shows a specific threshold for each pixel for use in the binarisation process when a grey-scale image is converted to a binary image. Using the computed threshold, the segmented fluoroscopy image is binarized and denoted by E R (x, y). Then, using a morphological function (the 'remove' operation in the bwmorph function in Matlab), the pixels inside the 2D projected implant's image are removed which leaves only their border. This image is the binary edge image of the implant denoted by E I (x, y). In Figure 4, the results obtained from a constant threshold segmentation method, and the adaptive segmentation technique used in the proposed method are shown. In Figures 4(a), 4(b) and 4(c) examples of fluoroscopy images of the human knee joints of three different patients after TKA surgery are presented. In Figures 4(d), 4(e) and 4(f) the corresponding segmented fluoroscopy frames using a constant thresholding segmentation method (with the same threshold used in all three frames as the segmentation step is required to be fully automatic without any user intervention), and in Figures 4(g), 4(h) and 4(i), the segmented frames using the proposed adaptive segmentation technique are shown. Although the segmented image, obtained using the constant threshold segmentation method, shown in frame 4(d) is suitable, the segmented images shown in frames 4(e) and 4(f) are of poor quality. Since the segmented image in Figure 4(e) was obtained using the same threshold as for the image in Figure 4(b), certain parts of the femoral and tibial components are removed if the overlapping area between them is ignored. However, segmenting the image in Figure 4(b) using the proposed adaptive method, provides good results, as shown in Figure 4(h). For the image in Figure 4(c), using the constant threshold method is not appropriate, as shown in Figure 4(f), while the adaptive segmentation approach provides acceptable results, as shown in Figure 4(i). However, for those fluoroscopy frames which do not exhibit high contrast and are blurry, the adaptive segmentation method may not exactly differentiate between the implants and their neighbourhoods as explained in more detail in section II-D. In the proposed method, the effects of the neighbourhood area with similar intensity values are decreased by using a combination of feature and intensity-based registration methods. In our proposed framework, we used a combination of these methods as the NEPD is considered to be a featurebased similarity measure while the SCV is defined as an intensity-based similarity measure. In this way we can benefit from the advantages of both of these measures to perform a trade-off between accuracy and computation time (SCV is accurate, and NEPD is fast). While the fluoroscopy frames need proper segmentation when registered using the EPD measure, they do not require segmentation for the SCV method. Therefore, when the segmentation of fluoroscopy frames is not perfect in the registration method based on NEPD, using the intensity registration method helps to compensate this by using intensity features of the images to be registered.
The remaining parts of the proposed registration method are shown in Figure 5, and explained in the following subsections.

B. INITIAL REGISTRATION
In frame-by-frame image registration methods, after the first frame is registered, its output is usually used in the initial registration of the next frame. Also, as all frames should be registered, which can be very time-consuming, if there is a sudden change in knee movement, the correct results may not be obtained, and the next frames may be initialised incorrectly which can have a huge effect on the resulting registration results. To estimate 3D knee joint kinematics for multiframe fluorosocopy videos, in the proposed method, a few frames are registered first before the kinematic parameters for the others are estimated. The first frame is registered semi-automatically, that is, the initial position is found manually and then the automatic method based on the SCV similarity measure [27] is used to improve the registration transformation parameters. Subsequently, depending on the number of frames (n) to be registered and the type of knee joint movements and sequences, x out of n frames are registered. For the initial registration, all x out of n frames, except the first one, are registered fully automatically, and as can be seen in Figure 6, the result for each is used to initialise the registration parameters of the next frame. The x frames are chosen with distance d between them as shown in Figure 6. In this figure, f i denotes frame number i, and d represents the number of frames which are skipped in the initial registration step. The value of d is chosen according to the amount of movement between frames of the implant to be registered. For example, in the case of knee joints, if the implant's position only changes slightly between each fluoroscopy frame, a larger value of d may be chosen. However, if the implant moves rapidly between fluoroscopy frames, a smaller value of d would be chosen, and the number of frames in subset A to be registered initially will be larger. The total number of frames (x) required for registration is defined as: For example, if the number of frames is 295 and we want to register those with a distance of 10, 30 frames should be registered initially before using the estimation method to register all other frames. The transformation parameters for each frame to be registered are initialised using the registration output from the previously registered frame in the set. In order to compute the registration transformation parameters for the alignment of x frames, a large range of values should be searched to reduce the chance of misalignment which may occur due to the larger number of frames between each subsequent frame in subset A and the possibility of a large difference between the relative 3D positions of the models in each frame in subset A. The registration results for the initial x frames should be highly accurate as they are the input data for the subsequent fast multi-frame registration method. Hence, a combination of the accurate and fast registration approaches in [27] and [13] is used in the initial registration step. The registration method used in [13] is based on the EPD similarity measure. This multi-modal similarity measure is based on the minimum difference between the position of binary edge images. These edge images are produced by using an edge detection method on the two images to be registered. To compute the EPD similarity measure, firstly, the input images to be registered, R and I , as mentioned in Section II-A, are converted to binary edge images E R (x, y) and E I (x, y) respectively. In the next step, a 2D chamfer distance algorithm [28] is used to compute the distance D R (x, y) to the nearest edge pixel in the binary image of the reference image (E R (x, y)). Although chamfer distance is not considered to be a global operation, as it works on small neighborhoods around pixels, it provides a good approximation to Euclidean distance. Finally, the edge position difference, P, between the two images R and I is computed. It is computed by summing the distance to the nearest edge of image R, D R (x, y), at locations which correspond to edges in the other binary image E I (x, y). If β I is defined as the set of pixel locations where T (E I ) = 1, the edge position difference is calculated as follows: where m i denote the parameters that define the transform T . For example, in a 2D rigid-body transform, the new pixel locations (x , y ) in the transformed image are given by: where m i are the parameters that define the transform T . This equation only describes 2D transformation, however, in the proposed method both 2D and 3D transformations are applied. In the method proposed in this paper we used a normalized version of the EPD (which needs almost the same computation time as the EPD) to limit the range of EPD similarity values to between 0 and 1. This normalization is necessary when comparing the level of the similarity between two different images to be registered. In the registration method based on the EPD, a steepest descent optimization algorithm is used to find the set of transform parameters that minimize P(m i ). In the registration method applied in [27], the SCV similarity measure is used. This measure is based on calculating the sum-of-conditional variances using the joint histogram between the two images to be registered. In this registration method, a standard Gauss-Newton optimization method was used.
In the initial registration step, first, the NEPD similarity measure, which has values of between 0 and 1, is computed for the frames to be registered. The closer the value of the NEPD is to zero, the greater the similarity between the images to be registered. Depending on the NEPD value for the frame, different registration strategies are applied, and different search ranges are selected. If it is less than a threshold (T 1 ), which is set to 0.5 in the experiments, the registration result is not very far from the best values of the transformation parameters. In this case, the registration method based on the SCV similarity measure [27] is applied for a small range of values for in-plane and out-of-plane transformation parameters. The search ranges of the in-plane parameters T x , T y and R z were set to [−5,5]  However, if the NEPD value is equal to or larger than T 1 , it is most likely that there is a large displacement between the estimated and true positions of the implant component. In this case, the fast registration method based on the EPD [13], is run for a large range of in-plane parameter values. If the NEPD value is smaller than 0.7, the search ranges of the inplane parameters T x , T y and R z were set to [−15,15]  each consecutive values is 0.25). Then another search of inplane parameters is performed in which the 2D projection of the implant is estimated using a transformed 3D model of the implant. The 3D search ranges of the in-plane parameters T x , T y and R z were set to [−4,4] pixels, [−4,4] pixels and [−6,6] degrees respectively where the number of values searched in each range of T x , T y is 9, and in the range for R z is 25 (the difference between each consecutive value for T x , T y is 1 and 0.5 for R z ). This step improves the results because, when there is a large displacement between the model's positions, the appearance of the DRR generated by translating a 2D DRR can be distorted when compared with the real DRR generated by projecting the displaced 3D model using a perspective projection. After this coarse registration, the SCV method [27] is applied to the registered frame to provide it with a fine registration output. The search ranges of the in-plane parameters T x , T y and R z were set to [−5,5]  Then, the NEPD value is again computed and compared with T 1 . If it is still equal to or larger than T 1 , registration steps, similar to the previous ones, are applied to improve the registration results. When the EPD method is used, the same search ranges of the in-plane parameters T x , T y and R z were used. However, a larger search range for the out-of-plane rotation parameters (Rx, Ry) of [−6,6] degrees was used. The number of values searched in each range is 49 (the difference between each consecutive values is.25). Because the registration method and search ranges are chosen with respect to the NEPD similarity value, the computational time can be reduced.

C. ESTIMATIONS OF 3D TRANSFORMATION PARAMETERS
After the initial registration, six 3D rigid-body transformation parameters are estimated for the remaining frames using the proposed method and, if they need improvement, further finetuned. In this step, an iterative loop is applied to estimate the 3D transformation parameters for the frames located between the previously registered frames. Interpolation and curve fitting estimation methods were evaluated for this part of the algorithm. At the end of each iteration, 3D position parameters are estimated using curves updated from the information of the new registered frames. They show estimations of the change patterns of different transformation parameters which are improved using the new registration results for the frames registered in the current loop. Therefore, in each iteration, the 3D estimated positions of the subsequent frames to be registered are improved as the number of available registered ones is increased and the curves become more reliable. The results obtained from applying the proposed method using curve fitting and interpolation approaches on the same data are presented in section III. As the estimation technique based on interpolation provided better results compared with the method based on curve-fitting, it was adopted in the proposed method. Furthermore, in addition to the curves of the six transformation parameters, to assess the estimated registration results for the new frames, an NEPD value curve for the previous transformed registered frames is calculated and updated after each iteration. It uses measurements of the NEPD values which compute the similarity between the DRRs of the transformed 3D models and the fluoroscopy frames which are registered. Then, when estimating the transformation parameters for the new frames, the six curves are used. These estimations are evaluated by measuring the NEPD for the transformation estimations and comparing it with the estimated NEPD value for that frame using the NEPD value curve. If the computed NEPD value is less than the estimated one plus a threshold (which is set to 0.1 in the experiments), the evaluation result is considered satisfactory and the estimated result is assumed to be a good match for that frame. Otherwise, the frame is registered again and the parameters improved, as explained in sub-section II-D. Consequently, only a small number of frames are required to be registered as the registration results estimated for the remaining ones satisfy the NEPD requirement and do not require registration. Therefore, not only is this method very fast for multi-frame registration but it also provides smoother and more natural movement results which can easily be noticed when the output video of a moving 3D joint model is viewed.

1) TRANSFORMATION ESTIMATIONS USING CURVE FITTING
When several view points and data are available, curve fitting is one method which can be used to fit surfaces and curves to the data, and provide estimates of the points which are not available. Different functions can be used to determine which is the best fit for the data. Goodness-of-fit statistics, such as the sum of squared error (SSE), r-square, adjusted r-square and root mean squared error (RMSE), can be used to assess the quality of the curve-fitting stage. In the proposed method, the data from the initial registration step explained in subsection II-B is taken as the input data for the curve-fitting procedure and according to the experiments conducted on the data, a sine function with a degree of 8 was chosen as the curve-fitting function.

2) TRANSFORMATION ESTIMATION USING INTERPOLATION
Spline interpolation usually produces a smaller error than polynomial interpolation [29]. Also, it can avoid the problem of Runge's phenomenon and oscillation which may occur when using high-degree polynomial interpolation. In the proposed method, cubic spline interpolation was used, in which polynomials of degree 3 are used to estimate the 3D transformation parameters. The advantage of using an interpolation method rather than a curve-fitting approach is that, for each specific sequence and its transformation parameters, and for each multi-frame video of different patients, it can enable a more suitable estimated smooth curve to be drawn which is not limited to a specific function and pattern. An example of using interpolation and curve fitting in the proposed method is shown in Figure 7. The computed translation parameter values (T x ) when using curve fitting and interpolation approaches are shown in red and blue respectively. These estimated values are compared with the results from the SCV method which is shown in green. As can be seen in Figure 7, the curve related to the proposed method using interpolation is more similar to the actual registration results from the SCV registration method.

D. IMPROVEMENTS IN REGISTRATION
As mentioned in section II-C, if evaluation of the registration result of a frame is not satisfactory, the frame should be registered and the parameters should be improved. To improve the registration result, in an iterative loop, the six estimated curves of the transformation parameters are updated with the values of the NEPD similarity measure values computed for the new registered frames used to update the curve of NEPD values. In each iteration, the transformation parameters are estimated from the six transformation curves and an NEPD similarity value is computed for that frame. If this measure is less than a threshold (1.1 times the estimated NEPD value), the transformation parameters are taken as the registration output for that frame. Otherwise, the in-plane parameters are fine-tuned and a small search range is used to register the frame using the method based on the EPD similarity measure, and a steepest descent optimization algorithm [13]. The search ranges of the in-plane parameters Tx, Ty and Rz were set to [−2,2] pixels, [−2,2] pixels and [−2,2] degrees respectively. The search ranges for the out-of-plane rotation parameters (Rx, Ry) were set to [−2,2] degrees. The number of values searched in the ranges is 17 (the difference between each consecutive values is 0.25). Again, the NEPD value is checked and compared with the NEPD threshold. If it is less than a threshold distance from the estimated curve of the NEPD values, the transformation parameters are considered the registration output for that frame. Otherwise, in an iterative loop, the out-of-plane parameters are finetuned and the in-plane parameters are also optimized using the SCV method and a standard Gauss-Newton optimization method [27].

E. IMPROVEMENTS IN REGISTRATION OF KNEE JOINT COMPONENTS
During the development of the proposed method, when the initial registrations were performed on the femoral and tibial components of different patients, it was found that those of the latter were much more challenging. One reason for this was the similarity between the intensity values of the tibial component and its neighbourhood as shown in the frame in Figure 3(a). As mentioned in sub-section II-A, one of the challenges of registering implants is the existence of similar intensity values in their neighbourhood. In some fluoroscopy images, it can be seen that the borders between the implants and their neighbourhoods are not easily distinguishable as they can be blurry in these areas and their intensity values can be very similar. In this case, their neighbourhood may be considered to be part of the implants when the related frame is segmented. Consequently, as any segmentation method is not capable of segmenting the frames well, the registration results may be affected and may not be very accurate. Therefore, different positions other than the best one may be predicted for the 3D models. This issue can be seen in the tibial components which have higher intensity values than the femoral components. In the proposed method, an adaptive segmentation method is used to address this issue to some extent. However, as some fluoroscopy frames did not exhibit high contrast and were blurry, it could not exactly differentiate between the implants and their neighbourhoods in all the frames of all the patients. Consequently, a combination of feature and intensity-based registration methods was used to decrease the effects of the neighbourhood area with similar intensity values as shown in Figure 3. Therefore, when the segmentation of fluoroscopy frames is not perfect in the registration method based on NEPD, as it may cause decreasing accuracy, using the intensity registration method helps to compensate this by computing some intensity features of the images to be registered. The second reason for the challenging nature of registering tibial components is that those used are symmetric and their right and left views similar. However, a femoral component can be considered a semi-symmetric object because its top part has different left and right views. The robust optimisation technique used by the proposed method can register the femoral components, and find the correct global minima/maxima. In some previous research studies [14] registration experiments were performed for only the femoral components and not the tibial ones, with examples of these presented in Figure 8. This symmetric nature of the tibial components may cause a mis-registration because there is more than one registration answer. One answer is correct, while another shows the model as a mirror image of the correct 3D position. Because of the symmetry of the transformed model, when it is in the correct and symmetric positions, its 2D projection silhouettes are identical. The main difference between these two computed positions of the 3D object is in their R x and R y values while the value of the similarity measure computed for these projections can be quite similar. Therefore, in the optimisation process, any of these transformation parameters could be taken as a global minima or maxima and can be considered the registration answer. To address this issue, at the time of the initial registration of the tibial components, the registration information related to the femoral ones obtained from the same frames was used. For the tibial components for which the values of the transformation parameter R X or R Y were close to 0 degrees, 180 or −180 degrees, respectively, the following steps were performed. First, the area near the initial transformation positions, and that near the transformation parameters with the same magnitudes but opposite signs for R X and R Y were searched individually to find the best registration transformation parameters in each of these spaces. Then, the relational rotation kinematics parameters, internal-external rotation and abduction-adduction, for both registration results were computed. If the difference between them, for each scenario, was larger than a threshold (1 in the experiments), the position of the component that provided the smallest rotational parameters was taken as the answer for registering the tibial component.

III. EXPERIMENTS AND DISCUSSIONS
Several experiments were performed to evaluate the performance of the proposed multi-frame registration method using Matlab R2019a on a computer with an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHZ, with 16 GB of installed memory (RAM) and a 64-bit operating system. Data was provided by the Canberra Hospital of the knee joints of 18 patients two years after they received TKA surgeries. The data for each patient includes a 2D fluoroscopy video (which contains approximately 300 fluoroscopy X-ray frames) and the 3D models of the femoral and tibial components. The frames were captured at 30Hz with 1024 x 1024-pixel spatial dimensions and 12bits/pixel using single-plane, curved-panel fluoroscopy (AXIOM-Artis, Siemens). Regarding the threshadapt function used in the proposed method, at each pixel a local threshold is computed by the Guassian weighted mean in each pixels' neighbourhood of 31. In this function, the sensitivity was set to 0.4 and the foreground is considered to be darker than the background. This setting is the same for all patients and fluoroscopy frames. However, for each frame, this function provides a different matrix showing specific thresholds for each pixel to be used in the binarization process. One of the main challenges of evaluating the proposed method when applied to real clinical data was that, as there is not an exact ground truth for comparison, computing its accuracy was not possible. Therefore, it was compared with an accurate registration method [27] based on the SCV similarity measure which is the best existing and relevant registration method. As explained in [27], in vitro kinematics produced by the SCV method and the 'gold standard' RSA method are very similar. Therefore, as we had access to the frame-byframe registration method based on SCV, the results from the SCV method were assumed to be the best registration answers, and we compared our proposed method to this approach. To register a fluoroscopy sequence of frames for a patient's knee joint, firstly, the femoral components were registered by the proposed methods followed by the tibial components. After registering all the frames for each patient,  the relational kinematic parameters for the knee joint were computed. Similar intensity values of the implants and their neighbourhood, as well as occlusions and low object-tobackground contrast, can be seen in a number of these fluoroscopy frames. An example of the registration results for a frame with an occlusion issue is shown in Figure 9 in which it can be seen that when using the proposed method, the femoral and tibial models related to the actual components implanted in the patient's bone are registered well with the fluoroscopy frame.

A. INVESTIGATION OF RELATIONAL TRANSLATION AND ROTATION PARAMETERS
The three relational translation kinematic parameters, medial-lateral shift, anterior-posterior draw and distractioncompression, as well as the three relational rotation parameters, flexion-extension, internal-external rotation and abduction-adduction are the kinematic parameters most commonly used to describe knee joint movements and are usually analysed to investigate joint kinematics [31]. These parameters are shown in Figure 10. In order to examine the performances of the proposed methods, we computed and compared these six parameter values for the frameby-frame method based on the SCV and proposed multiframe methods. In the experiments, the measured kinematic parameters for the sequence of fluoroscopy frames related to 18 patients were investigated. The registration results and their relevant kinematic parameters for the fluoroscopy frames were provided to us by the Trauma and Orthopaedic Research Unit of the Canberra Hospital which used the frame-by-frame registration method based on the SCV [27]. Then, experiments were performed by applying the proposed multi-frame registration methods based on curve fitting and interpolation. To evaluate them, firstly, the differences between the kinematics parameters which were computed by the SCV and proposed methods when using curve fitting and interpolation are shown in Figures 11 and 12 respectively. As can be seen, the kinematics results for both the proposed methods were similar to the registration results from the SCV method as the medians of the differences are close to zero for most patients. For the method based on curve fitting, the mean of the absolute median of the differences of flexionextension, internal-external, abduction-adduction, mediallateral, anterior-posterior and distraction-compression were 0.4136, 0.1867, 0.1108, 0.3185, 0.3672, 0.0694 respectively. However, results of the proposed method based on interpolation, which were 0.2506, 0.1399, 0.0800, 0.1862, 0.2139, 0.0531 respectively, were much better than those of the VOLUME 9, 2021 proposed method based on curve fitting. The box plots of the differences in the relational kinematics parameters of the proposed method based on interpolation show that it could provide results that were almost identical to the SCV registration method. For some patients like patient 5 and 13, the range of the values of relational parameters for outliers are larger compared to that of the other patients. After investigation of the results for these patients, and comparison with the other patients' results, we understood that the main reason is that a number of fluoroscopy frames related to these patients were very blurry which makes the registration more challenging. The mean squared differences (MSDs) and standard deviations (SDs) of the differences between the relational kinematics parameters measured by the proposed and SCV methods are presented in Table 1. As can be seen, those measured by the proposed method based on interpolation were much closer to zero than those measured by the proposed method based on curve fitting.

B. INVESTIGATION ON TRANSFORMATION PARAMETERS
The mean difference (MD), the standard deviation and the mean of the absolute difference (MAD) between the registration transformation parameters computed by the proposed methods and the method base on SCV are shown in table 2. Directions are considered with respect to the fluoroscopy coordinate system. The X and Y axes are considered to be parallel to the imaging plane and the Z axis is perpendicular to the imaging plane. The experimental results show that  the difference between the computed 3D positions of the implanted components using the proposed methods and the method based on SCV are negligible. However, the computed differences using the proposed method based on interpolation is much lower when compared with those of the proposed method based on curve fitting. For the proposed method based on interpolation, the mean difference ± standard deviation for the femoral components were −0.0062 ±0.1502, −0.0336 ±0.1487 and 0.1352 ± 0.4465 for transformation in the X direction (T x ), Y direction (T y ) and Z direction (T z ) respectively. The rotations around the X axis (R X ), Y axis (R Y ) and Z axis (R Z ) were 0.0076 ± 0.2024, −0.0094 ± 0.1613 and 0.0227 ± 0.3418 respectively. The corresponding values for the tibial models were −0.0046 ± 0.1542, −0.0130 ± 0.1429, 0.2842 ± 0.7372, 0.0087 ± 0.2166, −0.0357 ± 0.5579 and −0.0353 ± 0.2733 respectively. Notice that, the differences for the out-of-plane transformation parameters are slightly higher for the tibial component when compared with the differences for the femoral component. While the differences for the in-plane transformation parameters are almost identical for both the femoral and tibial components.

C. COMPUTATIONAL TIMES
When many images are to be registered, such as when providing 3D knee joint kinematics, the total time required can be considered an important aspect of the registration algorithm as a low-speed registration method may not be very applicable in practice. In order to register a 3D model to a 2D fluoroscopy image, the steps in the registration algorithm, such as determining the 3D transformation and DRR projection, computing a similarity measure and running an optimisation method are all computationally expensive operations. Using our proposed method, the time required to  perform these steps for every single frame to be registered was omitted for the majority of frames which significantly reduced the total computational times. In the experiments, in addition to the frames which were registered in the initial step of the proposed method, only about 32 frames from 270 frames on average per patient were needed to be registered in the registration improvement section. Table 3 shows the computational times (in seconds) required to register each frame for each bone using our proposed methods and the frame-by-frame ones using the SCV technique [27]. All the fluoroscopy frames of the deep knee-bend sequences of 18 patients with their relevant knee implant models (femoral and tibial) were registered and then the means of the times which is required for the registration of each frame per implant model was computed. In the registration method based on SCV, all frames were registered, and the time spent for each frame was almost the same. However, in the proposed method, a large portion of the computational time was linked to the registration of the frames in the initial step which was performed over a large search range. This increased search range was important for providing accurate results. The times required to register one frame with a 3D model using the proposed methods based on curve fitting and interpolation are shown in Table 3. These times were compared with those required to register the same images using the frame-by-frame SCV method and it is clear that the proposed multi-frame registration method based on interpolation took the lowest computational time. It was significantly faster than the frame-by-frame SCV method while both provided similar registration results. To explain the computational times in detail, it should be mentioned that, using the proposed method based on interpolation, the mean of the times required for registration of each bone for the first frames registered in the initial registration step was approximately 624 (s) while an average of only 3.5 (s) was taken to register each bone in each of the remaining frames.

D. DISCUSSION ON REGISTRATION OF KNEE JOINT COMPONENTS
When the tibial components for a number of frames of different patients were registered and the 3D positions of the components compared with the registration results obtained by the frame-by-frame SCV method, it was found that some frames were registered incorrectly. To demonstrate this, the rotation transformation parameter in the Y direction obtained from our proposed method based on interpolation and the results from the SCV method for one patient are shown in Figure 13. It was clear that the R Y values for some frames in a sequence for one patient computed by the proposed registration and SCV methods had similar magnitudes but opposite signs. As SCV registration method is based on frame-by-frame registration, a small range of values of R Y is searched around the initial positions for each frame because in each frame the initial position comes from the registration results from the previous frame. So the possibility of large changes in R Y is low. In Figure 13 and 14, for the SCV method, the values of R Y for all frames are shown, however, for the proposed method, the values of R Y for only the frames which are registered in the initial registration step of the proposed method are shown. The values of R Y for the remaining frames are not shown as these frames have not been registered yet in the initial step of the proposed method. The error bars (gray vertical lines) show the standard error of the data. Figure 14 shows a comparison of the rotation transformation parameters in the Y direction computed by our proposed final step and the SCV approach. It is clear that our method solved the problem of mis-registration of the tibial component. Figure 15 shows the incorrect and correct positions of the tibial model overlaid on one of the fluoroscopy frames after initial registration and after using the proposed improvement method, respectively. While the 2D projections of this model at these two positions were almost identical, the correct position was computed by the improved method.

IV. CONCLUSION
Analysing human joint kinematics plays an important role in many clinical settings and 2D to 3D registration has been shown to be an effective method for measuring these parameters. However, to register multi-frame sequences, such as a sequence of fluoroscopy frames, previously proposed registration methods have mainly been used on each image individually. As registering a large number of sequential frames can be quite time-consuming, in this paper, an estimation registration method for increasing a registration algorithm's speed while maintaining its accuracy is proposed. It uses a coarse-to-fine approach and applies the NEPD and SCV similarity measures together with a gradient descent optimisation method to register a small subset of the total number of frames in a sequence. Then, registration transformation parameters are estimated for the remaining frames and the results improved when required. The experimental results show that the proposed method can provide kinematic parameter values almost as reliable as those of one of the most accurate registration methods while much less time is required to calculate them. In future work, it is intended to increase the speed of this registration method by enhancing its performance in the initial registration step using an improved optimisation method. Also, it is planned to increase the accuracy of the proposed method using a segmentation technique based on deep learning. DIANA M. PERRIMAN received the bachelor's degree from The University of Sydney, the M.Sc. degree from the University of East London, and the Ph.D. degree from Australian National University, in 2011. She has worked extensively as a Clinician at hospitals in Sydney and Canberra, Australia, and London, U.K. She managed Edinburgh Community Team, for eight years, before returning to Australia, in 2003. After completing her Ph.D. degree, she became the Clinical Research Coordinator for the Trauma and Orthopaedic Research Unit, which is part of the Australian National University Medical School and Canberra Hospital, where she is currently an Australian Physiotherapist. Her research interests include trauma and arthroplasty outcomes, which involve considerable imaging analysis.

JENNIE
M. SCARVELL received the G.Cert. Higher Ed. and B.App.Sc. degrees in physiotherapy in Sydney, the Ph.D. degree from The University of Sydney, in 2000, and the Bachelor of Physiotherapy degree from the University of California, in 2013. She was formerly the Head of School, Health Sciences, and the Head of physiotherapy with the Faculty of Health. Before commencing her Ph.D. degree, she was a Clinician for 15 years, in Australia and Canada, and a Senior Physiotherapist in outpatients and rheumatology. Her Ph.D. degree examined the development of osteoarthritis in injured knees. She was one of the team that developed the physiotherapy curriculum for the Master of Physiotherapy, when it began at the University of Canberra, in 2004, and the Deputy Head and the Clinical Education Coordinator. She spent three years as the Clinical Research Coordinator in orthopaedics at Canberra Hospital and returned to UC as the Head of physiotherapy, in 2011. She is currently the Associate Dean of research with the Faculty of Health, University of Canberra. She is active in physiotherapy program development in Shanghai. Her research interests include orthopaedic physiotherapy, using medical imaging to analyze arthrokinematics. Her clinical outcomes studies and intervention trials include physiotherapy for knee osteoarthritis, back pain, and hip and knee replacement.
PAUL N. SMITH graduated in medicine from Flinders University, South Australia, in 1986. He completed orthopaedic surgical training, in 1995. His following fellowships in adult joint replacement in London, Ontario, and Princess Elizabeth Orthopaedic Centre, Exeter, U.K., then he returned to Canberra, in 1998. Since then, he has been taken up a full-time clinical position specializing in adult knee and hip joint reconstructive surgery. He is specializes in orthopaedic and trauma surgery, primary and revision joint replacement, and surgery of the knee, hip, and pelvis.