Introduction
Clinical gait analysis is fundamental to understanding and interpreting the physio-pathological characteristics of human locomotion, and its importance as a clinical diagnostic tool is widely accepted. In particular, there is strong clinical evidence of its effectiveness in supporting the identification of optimal surgical procedures and the consequent rehabilitation pathways in children with bilateral cerebral palsy (CP) [1].
To date, standard 3D clinical gait analysis protocols are marker-based (MB): they use infrared multi-camera systems to reconstruct the trajectories of markers attached to the patient's skin at specific locations [2], [3], [4]. Unfortunately, the routine use of MB protocols is limited by several issues, such as the need for highly qualified staff, the high cost of the equipment, and the poor acceptance of skin markers, which is particularly critical when dealing with younger patients.
To reduce subject preparation time and discomfort, motion tracking based on markerless (MS) technologies may offer a promising alternative to MB motion capture. Recently, different MS multi-camera solutions have been proposed for three-dimensional (3D) joint kinematics analysis [5], [6], [7], [8], [9], [10], [11], [12]. However, a multi-camera set-up requires time for installation and for extrinsic camera calibration, and it is therefore not the ideal solution for ambulatory settings with no dedicated space.
Conversely, there are applications where a two-dimensional (2D) joint kinematic analysis is still clinically relevant (e.g. for screening purposes, to identify gait patterns, for follow-up over time and to evaluate treatment). For these purposes, system portability, affordability, and user-friendliness are essential requirements. Methods based on the use of a single camera with minimum set-up time would therefore be preferred. Lately, several manufacturers have been producing inexpensive tracking systems (200-400 € /
The single camera MS methods proposed in the literature can be grouped into three categories: i) black-box methods either based on software development kit (SDK) integrated with proprietary hardware ([13], [14], [15]) or commercial software (e.g. IPsoft iPi Biomech, MediaPipe Studio), ii) open source methods based on deep learning approaches ([5], [16], [17], [18]), and iii) replicable non-machine learning methods ([19], [20], [21], [22], [23], [24], [25], [26]).
Generally, black-box methods are conceived for animation or gaming purposes and are not compliant with clinical standards and terminology [27]. The major limitation of this category lies in its 'black box' functioning: model parameters cannot be fine-tuned for pathological data, at the expense of external validity and performance [28]. In addition, body tracking SDKs are developed for specific hardware solutions and are therefore difficult to generalize.
Most open-source methods based on deep learning (e.g. AlphaPose, OpenPose) ([5], [16], [17], [18]) are trained on synthetic, generic movement data, not necessarily gait, and their training relies on reference data that do not follow clinical gait analysis standards [29] (e.g. clear anatomical/functional rules for the definition of joint centers [30]). Furthermore, the original training data sets do not include people with impaired gait; therefore, the performance of these methods is not optimized for such populations and their clinical validity is not established.
Finally, replicable non-machine-learning methods have the advantage that they do not require a specific training set, although they need to be optimized for the specific problem and their performance is not expected to improve with the dataset size. The most common factors limiting the clinical applicability of most past studies include the use of a color filter and a homogeneous background for subject segmentation ([19], [20]), single-joint analysis ([20], [21], [22], [23]), and the lack of technical validation against gold standards and on pathological populations ([19], [23], [24], [25], [26]).
The aim of the present study is to propose an original MS clinical gait analysis protocol based on a single RGB-D camera and to validate it on 18 CP patients. The accuracy and reliability of the sagittal lower limb joint kinematics and of the spatial-temporal parameters were assessed against a 3D MB clinical gait analysis protocol.
Material and Methods
A. Subjects
Gait data were collected from 18 participants (4 females and 14 males) aged between 6.5 and 28 years (mean 15 years). Most participants had bilateral CP (11), three had unilateral CP, three had dyskinetic CP, and one had ataxic CP. According to the Gross Motor Function Classification System (GMFCS), six of them were classified at level I, eleven at level II, and one at level III. The study was approved by the regional ethical review board in Gothenburg, Sweden (approval number 660-15).
B. Experimental Protocol
Instrumentation - An RGB-depth camera (Kinect 2 for Xbox One, Microsoft, RGB images:
Subject preparation – Each subject was asked to wear colored ankle socks (red for the right foot and blue for the left) and underwear. External anatomical landmarks, including the lateral malleolus (LM), lateral epicondyle (LE), greater trochanter (GT), anterior superior iliac spine (ASIS) and posterior superior iliac spine (PSIS), were identified by palpation by an expert operator and marked with a black felt pen.
Data collection – Two static lateral views (right and left side) of the subject while standing upright were captured at the beginning of the experimental session. Participants were then asked to walk at a comfortable self-selected speed along a straight 10-meter walkway. Ten gait trials per subject were recorded including five right and left full gait cycles. The dataset containing MS data from 10 CP patients has been uploaded on IEEE DataPort (RGB-Depth_CP_patients_POLITO_dataset
Validation – a 12-camera stereo-photogrammetric system (Oqus 400 Qualisys medical AB, Gothenburg, Sweden) was used to collect 3D reference data at 100 fps. The capture volume was of 14 m
C. Image Pre-Processing
Calibration refinement and camera lens correction were implemented using the Heikkilä undistortion algorithm ([31], [32]). The intrinsic and extrinsic parameters obtained from the calibration refinement of both the RGB and the depth sensor were then used to register the RGB and depth images, yielding overlapping images of the same size (Nrow = 1080, Ncol = 1536).
D. Method Description
The proposed method consisted of four main stages: gait cycle identification, subject segmentation, subject-specific model calibration and joint center trajectory estimation (Fig. 2). The MATLAB code has been made available on GitHub (https://github.com/dilettabalta/ModelBased_MarkerlessProtocol.git).
1) Gait Cycle Identification
From each gait trial, the most central gait cycle was selected and analyzed based on the identification of initial foot contacts. To this purpose, a specific algorithm was developed to account for different types of foot-ground contacts commonly encountered in subjects with CP as shown in Fig. 3 [33].
Different types of foot contacts: a) fore-foot contact in equinus gait, b) foot-flat contact in individuals who walk in a crouch gait with excessive knee flexion, c) rear-foot contact in patients classified at a low level of the GMFCS.
Specifically, for each video frame, a binary segmentation mask
Identification of the MRF and FF. a). An ellipse was fitted on each foot; the centroid and the principal axes (
The mid-rear foot
The foot points
Computation of stride length and duration, step length and gait speed. A) Velocity of MRF and FF coordinates. The red line in bold defines the first initial contact (IC #1) while the blue one represents the following initial contact (IC #2). The green areas represent the intervals in which MRF and FF are assumed to be in contact with the ground (stationary condition). B) The stride length is the distance between two consecutive initial contacts of the foreground foot (IC #1 and IC#2 in orange). Step length is the distance between initial contact of the foreground foot (IC #1 in orange) and the initial contact (IC #1 in blue) of the contralateral one.
The gait cycle was identified by two consecutive ICs of the foreground foot, whereas the step was identified by the first IC of the foreground and the subsequent IC of the background foot.
Spatial-temporal parameters (stride and step length and stride duration) were calculated based on the positions of the relevant foot points
2) Subject Segmentation
For each frame, a preliminary background subtraction was performed between the RGB image and the background image:\begin{equation*} {}^{I}D(x,y,c)=\left|{}^{I}I(x,y,c)-{}^{I}B(x,y,c)\right|\end{equation*}
The resulting difference image
\begin{equation*} Th=\frac{\sum_{i=0}^{255} w_{i}\cdot g_{i}}{\sum_{i=0}^{255} w_{i}}\end{equation*}
The segmentation mask \begin{align*} {}^{I}M_{sub}(x,y)=\begin{cases} 1, & {}^{I}D_{gray}(x,y)\ge Th \\ 0, & \text{otherwise} \end{cases}\end{align*}
Undesired small residual regions due to noise or time-variant shadows were removed under the assumption that the subject corresponds to the largest connected area.
The segmentation of the feet was then refined with a color filter that exploits the colored socks, to avoid inaccuracies due to shadows as the foot approaches the ground.
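The thresholding and clean-up steps above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' MATLAB implementation. Two points are assumptions made only for illustration: the weights w_i are taken to be the histogram counts of the gray levels g_i (so that Th is the mean gray level of the difference image), and connected regions are found with 4-connectivity. All function names are hypothetical.

```python
from collections import deque

def weighted_threshold(gray):
    """Th = sum(w_i * g_i) / sum(w_i) for g_i = 0..255.
    ASSUMPTION: w_i is the number of pixels with gray level g_i."""
    weights = [0] * 256
    for row in gray:
        for g in row:
            weights[g] += 1
    return sum(w * g for g, w in enumerate(weights)) / sum(weights)

def binary_mask(gray, th):
    """Mask is 1 where the gray-level difference image is >= Th, else 0."""
    return [[1 if g >= th else 0 for g in row] for row in gray]

def largest_component(mask):
    """Keep only the largest connected region (ASSUMPTION: 4-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    cleaned = [[0] * cols for _ in range(rows)]
    for y, x in best:
        cleaned[y][x] = 1
    return cleaned

# Toy difference image: a 2x2 "subject" plus one isolated noisy pixel.
d_gray = [
    [0, 0, 0, 0, 200],
    [0, 180, 190, 0, 0],
    [0, 170, 200, 0, 0],
    [0, 0, 0, 0, 0],
]
th = weighted_threshold(d_gray)
subject = largest_component(binary_mask(d_gray, th))
```

In this toy image the isolated bright pixel in the top-right corner passes the threshold but is discarded because it does not belong to the largest connected region.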
3) Multi-Segmental Model Definition
A 2D subject-specific kinematic lower limb model was introduced to estimate lower limb joint angles. The model consisted of four body segments (foot, shank, thigh and pelvis) connected by revolute joints (ankle, knee and hip joints), for a total of six degrees of freedom (DoFs). The foot segment was assumed to be the parent segment, and its motion was characterized by two translational DoFs and one rotational DoF. The ankle joint center (AJC) was assumed to coincide with the lateral malleolus (LM), the knee joint center (KJC) with the lateral epicondyle (LE) and the hip joint center (HJC) with the greater trochanter (GT).
4) Anatomical Calibration and Body Segment Templates Definition
The body segments’ templates and the relevant coordinate systems were calibrated on the static upright standing acquisition (image “0”) by manually selecting the anatomical landmarks (LM, LE, GT, ASIS, PSIS) to obtain their position vectors in
a: Foot Template
From \begin{align*} {}^{I}TMP_{foot}(x,y)=\begin{cases} 1, & {}^{I}M_{foot}(x,y)=1 \;\cap\; MRF_{xi} < x < MRF_{xi}+0.9\,l_{f} \\ 0, & \text{otherwise} \end{cases}\end{align*}
The foot coordinate system
b: Shank Template
The central shank portion was extracted as the region included in the annulus centered in
Then, the generic pixel \begin{align*} {}^{I}TMP_{shank}(x,y)=\begin{cases} 1, & {}^{I}M_{sub}(x,y)=1 \;\cap\; l_{shank25} < \sqrt{x^{2}+y^{2}} < l_{shank75} \\ 0, & \text{otherwise} \end{cases}\end{align*}
c: Thigh Template
The central thigh portion was extracted as the region included in the annulus centered in
Then, the generic pixel \begin{align*} {}^{I}TMP_{thigh}(x,y)=\begin{cases} 1, & {}^{I}M_{sub}(x,y)=1 \;\cap\; l_{thigh25} < \sqrt{x^{2}+y^{2}} < l_{thigh75} \\ 0, & \text{otherwise} \end{cases}\end{align*}
d: Pelvis
The pelvis inclination, with respect to the
5) Joint Centers Trajectories Estimation
For each frame of the gait cycle, the joint center positions were identified following a bottom-up tracking approach from the foot to the pelvis.
a: Ankle Joint Center (AJC) Estimation
The foreground foot was extracted from the RGB image based on color filters (
Ankle joint center estimation. a)
After having expressed
Finally, the \begin{equation*} {}^{I}\boldsymbol{AJC}\equiv {}^{I}\boldsymbol{LM}={}^{I}\boldsymbol{T}_{f}\,{}^{f}\boldsymbol{T}_{f_{0}}\,{}^{f_{0}}\boldsymbol{T}_{I}\,{}^{I}\boldsymbol{LM}_{0}\end{equation*}
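The chain of transformations in the equation above can be illustrated with a short Python sketch. It assumes, for illustration only, that each transformation is a 2D rigid transformation written as a 3x3 homogeneous matrix and that the template-to-segment transformation is the identity unless supplied. The function names and the numerical poses are hypothetical and are not taken from the authors' code.

```python
import math

def pose(theta_deg, tx, ty):
    """Homogeneous matrix mapping segment-frame coordinates to image coordinates."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def invert(T):
    """Inverse of a rigid transformation: rotation R^T, translation -R^T t."""
    c, s, tx, ty = T[0][0], T[1][0], T[0][2], T[1][2]
    return [[c, s, -(c * tx + s * ty)],
            [-s, c, s * tx - c * ty],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(T, point):
    """Apply a homogeneous transformation to a 2D point."""
    x, y = point
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])

def track_landmark(lm0_image, I_T_f0, I_T_f, f_T_f0=None):
    """Landmark in the current image = I_T_f * f_T_f0 * f0_T_I * LM_0.
    ASSUMPTION: f_T_f0 defaults to the identity (template and segment
    frames coincide after matching)."""
    if f_T_f0 is None:
        f_T_f0 = pose(0.0, 0.0, 0.0)
    f0_T_I = invert(I_T_f0)  # image -> static foot frame
    chain = matmul(matmul(I_T_f, f_T_f0), f0_T_I)
    return apply(chain, lm0_image)

# Hypothetical poses: static foot frame at (10, 20) px with no rotation,
# current foot frame at (100, 50) px rotated by 90 degrees.
ajc = track_landmark((12.0, 25.0), pose(0.0, 10.0, 20.0), pose(90.0, 100.0, 50.0))
```

In this example the landmark lies at (2, 5) in the static foot frame; once the foot frame has rotated by 90 degrees and moved to (100, 50), the landmark is mapped to approximately (95, 52) in the image.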
b: Knee Joint Center (KJC) Estimation
The separation between the foreground and background shanks was carried out using two alternative strategies depending on whether there was or was not overlap between foreground and background shanks.
To discriminate between overlap/non overlap conditions, a circle centered in
Separation between foreground and background shanks. A circle centered in
Conversely, when there was overlap, a single connected region was found, and auxiliary depth sensor data were used to separate the foreground and background shanks (Fig. 8b). To this purpose, the histogram of the depth values within the region was computed and the Otsu method [37] was applied for a binary classification (class 0: foreground shank, class 1: background shank) based on the maximization of the between-class variance.
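As an illustration of the depth-based separation, the following Python sketch applies a two-class Otsu threshold to a list of depth values. It is a simplified stand-in, not the authors' implementation: the number of histogram bins, the function names and the depth values are invented for the example, and it is assumed that the foreground shank is the one closer to the camera (smaller depth).

```python
def otsu_threshold(values, bins=64):
    """Return the depth threshold that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(h * c for h, c in zip(hist, centers))
    best_var, best_edge = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += hist[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to the between-class variance
        if var_between > best_var:
            best_var, best_edge = var_between, lo + (i + 1) * width
    return best_edge

def classify(values, threshold):
    """Class 0: foreground (ASSUMED closer to the camera), class 1: background."""
    return [0 if v < threshold else 1 for v in values]

# Invented depth samples (meters) from two overlapping shanks.
depths = [2.00, 2.02, 2.05, 2.03, 2.01, 2.30, 2.32, 2.35, 2.31, 2.33]
th_depth = otsu_threshold(depths)
labels = classify(depths, th_depth)
```

Maximizing the between-class variance is equivalent to minimizing the within-class variance, so either formulation yields the same threshold.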
The central portion of the foreground shank (
The shank coordinate system
Finally, the \begin{equation*} {}^{I} \boldsymbol {KJC}\equiv {}^{I} \boldsymbol {LE}={}^{I} \boldsymbol {T}_{s} {}^{s} \boldsymbol {T}_{s_{0}}{}^{s_{0}} \boldsymbol {T}_{I}{}^{I} \boldsymbol {LE}_{0}\end{equation*}
c: Hip Joint Center (HJC) Estimation
To separate the foreground thigh from the background thigh and from the hand during arm oscillation, two alternative procedures were implemented depending on whether or not the foreground hand was superimposed on the foreground thigh. Preliminarily, a circle centered in
In case of foreground hand superimposition, three peaks, corresponding to the foreground hand (class 0), the foreground thigh (class 1) and the background thigh (class 2), were found on the histogram envelope (Fig. 9a). Then, the Otsu method [37] was applied for a three-class classification. Otherwise (Fig. 9b), a binary classification was implemented (class 0: foreground thigh, class 1: background thigh).
Separation between foreground and background thighs. A circle centered in
The central portion of the foreground thigh (
The thigh coordinate system
Finally, the \begin{equation*} {}^{I} \boldsymbol {HJC}\equiv {}^{I} \boldsymbol {GT}={}^{I} \boldsymbol {T}_{t} {}^{t} \boldsymbol {T}_{t_{0}}{}^{t_{0}} \boldsymbol {T}_{I}{}^{I} \boldsymbol {GT}_{0}\end{equation*}
6) Subject-Specific Models Calibration
It must be highlighted that, within the recorded gait cycle, the size and shape of the lower limb body segments vary due to soft tissue deformation [38], changes in the subject position relative to the camera field of view, and potential out-of-plane movements, thus limiting the effectiveness of the matching procedure between the body segment templates and the segmented body segment masks. To overcome these limitations, a multiple calibration procedure [39] was implemented based on three sets of body segment templates: the first defined from the standing posture (Fig. 10a), and the second and the third from frames selected during the loading and the swing phases of the gait cycle, respectively (Fig. 10b and 10c). The procedure for the identification of the joint center trajectories described in subsection 5) was then repeated using the additional templates, thus obtaining three different trajectories for each joint center.
7) Joint Kinematics Estimation
Joint kinematics was determined based on the segment inclination as defined by the lines connecting the joint centers. For the ankle, the plantar-dorsi flexion angle was determined as the angle between
E. Performance Assessment and Statistical Analysis
The accuracy of the gait events identification was evaluated by computing the time difference in terms of the mean absolute error (MAE) and mean error (ME) between the gait events found by visual inspection from the RGB images and those estimated by the automatic MS method over trials and subjects.
The estimated spatial-temporal gait parameters were assessed in terms of MAE, MAE%, ME and ME% with respect to the estimates provided by the 3D MB protocol over trials and subjects.
Before comparison, both the MS and MB kinematic curves were filtered using a fourth-order Butterworth filter (cut-off frequency at 7 Hz) and were time-normalized to the gait cycle (1-100%) [41].
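A minimal Python sketch of these error metrics is given below. The sign convention (MS minus MB) and the normalization of the percentage errors by the mean MB value are assumptions made for illustration, since the text does not specify them; the function name and the sample values are hypothetical.

```python
def error_metrics(ms, mb):
    """ME, MAE and their percentage versions for paired MS/MB estimates.
    ASSUMPTIONS: error = MS - MB; percentages are relative to the mean MB value."""
    diffs = [a - b for a, b in zip(ms, mb)]
    n = len(diffs)
    me = sum(diffs) / n                      # mean (signed) error: bias
    mae = sum(abs(d) for d in diffs) / n     # mean absolute error
    ref = sum(mb) / n
    return {"ME": me, "MAE": mae, "ME%": 100 * me / ref, "MAE%": 100 * mae / ref}

# Invented stride lengths in meters (MS estimates versus MB reference).
metrics = error_metrics([1.02, 0.98, 1.10], [1.00, 1.00, 1.00])
```

The ME captures a systematic bias, because errors of opposite sign cancel out, whereas the MAE captures the typical magnitude of the error.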
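The time-normalization step can be sketched in plain Python as a linear resampling of each curve to a fixed number of points. The choice of linear interpolation is an assumption, as the text does not state the interpolation scheme, and the low-pass filtering step is deliberately not reproduced here.

```python
def time_normalize(curve, n_points=100):
    """Resample a curve to n_points equally spaced samples over the gait cycle.
    ASSUMPTION: linear interpolation between neighboring samples."""
    if len(curve) < 2:
        raise ValueError("at least two samples are needed")
    last = len(curve) - 1
    out = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)   # fractional index in the original curve
        i = min(int(pos), last - 1)
        frac = pos - i
        out.append(curve[i] * (1 - frac) + curve[i + 1] * frac)
    return out

# A gait cycle recorded with 37 frames is mapped onto 100 points.
raw = [float(i) for i in range(37)]
normalized = time_normalize(raw)
```

After this step, curves from gait cycles of different durations can be compared sample by sample.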
For each subject, gait trial and joint, the performance of the proposed MS method was assessed in terms of offset and waveform similarity [42]. The offset was computed as the absolute difference between the mean values of the MB and MS curves:\begin{equation*} Offset_{s,t,j}=\left|\overline{MB_{s,t,j}}-\overline{MS_{s,t,j}}\right|\end{equation*}
For each joint, the latter values were then averaged across trials and subjects:\begin{equation*} Offset_{j}=\frac{1}{N_{S}}\sum_{s=1}^{N_{S}}\frac{1}{N_{T}}\sum_{t=1}^{N_{T}} Offset_{s,t,j}\end{equation*}
For each subject, gait trial and joint, the waveform similarity was evaluated as the root mean square error (RMSE) of the MS joint kinematic curves with respect to the MB joint kinematic curves, after removing their mean values [42]:\begin{equation*} RMSE_{s,t,j}=RMS\left(\left(MB_{s,t,j}-\overline{MB_{s,t,j}}\right)-\left(MS_{s,t,j}-\overline{MS_{s,t,j}}\right)\right)\end{equation*}
For each joint, the latter values were then averaged across trials and subjects:\begin{equation*} RMSE_{j}=\frac{1}{N_{S}}\sum_{s=1}^{N_{S}}\frac{1}{N_{T}}\sum_{t=1}^{N_{T}} RMSE_{s,t,j}\end{equation*}
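The offset and the mean-removed RMSE defined above translate directly into a few lines of Python. The sketch below follows the equations term by term; the function names and the sample curves are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def offset(mb, ms):
    """Absolute difference between the mean values of the two curves."""
    return abs(mean(mb) - mean(ms))

def rmse_mean_removed(mb, ms):
    """RMS of the difference between the two curves after removing their means."""
    mb_bar, ms_bar = mean(mb), mean(ms)
    squared = [((b - mb_bar) - (s - ms_bar)) ** 2 for b, s in zip(mb, ms)]
    return math.sqrt(mean(squared))

def average_over_subjects_and_trials(values):
    """values[s][t] -> mean over trials for each subject, then mean over subjects."""
    return mean([mean(trials) for trials in values])

# Invented knee curves (degrees): same shape, shifted by a constant 5 degrees.
mb_curve = [0.0, 10.0, 20.0, 10.0]
ms_shifted = [5.0, 15.0, 25.0, 15.0]
ms_distorted = [6.0, 14.0, 26.0, 14.0]
```

A pure shift between the curves (`ms_shifted`) produces an offset of 5° and an RMSE of 0°, while a shape difference (`ms_distorted`) shows up in the RMSE, which is what makes the two metrics complementary.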
In addition, a set of clinically relevant key gait features was extracted according to [43] from the MB and MS sagittal lower limb joint kinematics after offset removal (Fig. 11):
the knee flexion at the initial contact (0% of the gait cycle);
the knee maximum flexion during the loading response (0-40% of the gait cycle);
the knee maximum extension during the stance phase (25-75% of the gait cycle);
the knee maximum flexion during the swing phase (50-100% of the gait cycle);
the ankle maximum dorsiflexion during the stance phase (25-75% of the gait cycle);
the ankle maximum dorsiflexion during the swing phase (50-100% of the gait cycle);
the hip maximum extension during the stance phase (25-75% of the gait cycle).
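The feature extraction listed above can be sketched as a search for the extreme value of each curve within a window of the gait cycle. The sketch assumes curves sampled at 101 points, so that index i corresponds to i% of the gait cycle, and a sign convention in which flexion and dorsiflexion are positive, so that maximum extension is the minimum of the flexion angle. The dictionary keys are hypothetical labels, not the feature codes used in the tables.

```python
def extreme(curve, start_pct, end_pct, kind):
    """Maximum or minimum of the curve between start_pct and end_pct (inclusive)."""
    window = curve[start_pct:end_pct + 1]
    return max(window) if kind == "max" else min(window)

def key_gait_features(knee, ankle, hip):
    """ASSUMPTIONS: 101 samples per curve (index = % of gait cycle);
    flexion/dorsiflexion positive, so maximum extension = minimum angle."""
    for curve in (knee, ankle, hip):
        if len(curve) != 101:
            raise ValueError("each curve must have 101 samples")
    return {
        "knee_flexion_initial_contact": knee[0],
        "knee_max_flexion_loading": extreme(knee, 0, 40, "max"),
        "knee_max_extension_stance": extreme(knee, 25, 75, "min"),
        "knee_max_flexion_swing": extreme(knee, 50, 100, "max"),
        "ankle_max_dorsiflexion_stance": extreme(ankle, 25, 75, "max"),
        "ankle_max_dorsiflexion_swing": extreme(ankle, 50, 100, "max"),
        "hip_max_extension_stance": extreme(hip, 25, 75, "min"),
    }

# Synthetic curves used only to exercise the windows, not realistic gait data.
knee_demo = [float(i) for i in range(101)]
ankle_demo = [-float(abs(i - 40)) for i in range(101)]
hip_demo = [30.0 - i for i in range(101)]
features = key_gait_features(knee_demo, ankle_demo, hip_demo)
```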
For each key gait feature, the MAE and the ME with respect to the MB estimates were computed along with their 95% confidence intervals (95% CI). Normality of the key gait feature distributions was assessed with the Shapiro–Wilk test. After verifying normality, a two-sample t-test with a significance level of 0.05 was used to quantify differences between the MS and MB methods. ME and MAE values with a p-value less than 0.05 were considered statistically significant.
For each gait feature and each method (MS and MB), reliability was evaluated with the intraclass correlation coefficient for absolute agreement with two-way random effects, ICC(2,k), computed with the formulas reported in [44] on the data collected over subjects (n = 18) for the different gait cycles (k = 10).
ICC values less than 0.5 indicate poor reliability, values between 0.5 and 0.75 indicate moderate reliability, values between 0.75 and 0.9 indicate good reliability, and values greater than 0.90 indicate excellent reliability [44].
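For illustration, the sketch below computes ICC(2,k) from the mean squares of a two-way analysis of variance, using the commonly cited form (MSR - MSE) / (MSR + (MSC - MSE) / n), where MSR, MSC and MSE are the mean squares for subjects, repetitions and residual error. It is assumed, not verified, that this matches the formulas of [44]; the function names and the small data set are illustrative only.

```python
def icc_2k(data):
    """ICC(2,k): two-way random effects, absolute agreement, average of k measurements.
    data[i][j] is measurement j (e.g. gait cycle) of subject i."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-measurements mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (msc - mse) / n)

def reliability_label(icc):
    """Interpretation bands as stated in the text."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"

# Small illustrative data set: 6 subjects, 4 repeated measurements each.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_value = icc_2k(ratings)
```

For this data set the function returns approximately 0.62, which falls in the moderate band.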
Spearman’s correlation coefficient (R) was used to assess the correlation between the MS and MB systems. The values of R can be interpreted as follows [45]: values up to 0.19 indicate a negligible relationship, values between 0.2 and 0.29 a weak relationship, values between 0.3 and 0.39 a moderate relationship, values between 0.4 and 0.69 a strong relationship, and values of 0.70 or greater a very strong relationship.
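A dependency-free Python sketch of Spearman's coefficient and of the interpretation bands above follows. The coefficient is computed as the Pearson correlation of the ranks, with tied values receiving their average rank. Applying the bands to the absolute value of R is an assumption, and the function names and sample values are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for m in range(i, j + 1):
            ranks[order[m]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def interpret(r):
    """Bands from the text, applied to |R| (ASSUMPTION)."""
    r = abs(r)
    if r < 0.2:
        return "negligible"
    if r < 0.3:
        return "weak"
    if r < 0.4:
        return "moderate"
    if r < 0.7:
        return "strong"
    return "very strong"

# Invented paired feature values (degrees) from the two systems.
r_demo = spearman([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 1.0, 4.0, 3.0, 5.0])
```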
Results
Results for gait event identification, spatial-temporal gait parameters and lower limb joint kinematics are reported in Table 1. Regarding spatial-temporal parameters, the smallest errors in terms of MAE% were obtained for stride duration (2%), followed by step and stride length (2.2% and 2.5%, respectively) and by gait speed (3.1%).
The RMSE values computed for the lower-limb joint kinematics ranged between 3.2° and 4.5°. The smallest RMSE was found for the knee joint kinematics (3.2°), followed by the hip (3.5°) and the ankle (4.5°).
Results related to the extracted key gait features in terms of ME, MAE and their 95% CI are summarized in Table 2. Overall, MAE values were significantly different from zero and ranged from 3.1° to 5.9°.
Results for ICC(2,k) and R for both MS and MB protocols are reported in Table 3. Both MS and MB measurements revealed excellent reliability for K1, K2, K3, K5 and H3 (ICC = 0.90-0.94), while for A3 and A5 both protocols showed good reliability (ICC = 0.80-0.88). The correlation between MB and MS kinematics was very strong for all knee and hip gait features (≥0.85) and strong (= 0.66) for the ankle kinematic features.
An ensemble view of the normalized joint kinematics curves, averaged over trials and subjects, is reported in Fig. 12.
Discussions
The aim of the study was to present and evaluate accuracy and reliability of a clinical markerless gait analysis protocol based on the use of a single RGB-depth camera for estimating spatial-temporal parameters and sagittal lower limb joint kinematics on patients with cerebral palsy. The protocol was specifically devised to provide a quantitative tool which could be easily implemented for screening and monitoring disease-related motor progression in patients with CP.
The proposed protocol includes several fundamental improvements with respect to previous research [20]. First, an automatic thresholding segmentation algorithm was proposed which does not require a homogeneous background, thus improving the clinical applicability and portability of the system. Second, potential issues associated with left-right confusion [46] and with skeleton tracking in the presence of overlap between foreground and background segments [20] were addressed by introducing a robust separation approach which relies on the maximization of the variance between classes based on depth data. Third, the analysis of the clinical concurrent validity of the method was not limited to the knee kinematics as in [20], but was extended to the ankle and hip kinematics and to the spatial-temporal parameters. The abovementioned improvements, besides increasing the robustness of the protocol to variations in the experimental conditions, also contributed to an increase of about 35% in the accuracy of the knee joint gait features [20].
It is worth noting that the method for the detection of the gait events was specifically conceived for taking into account the different types of foot contact normally observed in CP patients (i.e. heel foot, flat foot and toe-ground contacts). The proposed method relies on the orientation of the foot model with respect to the ground and it represents a novelty with respect to other studies based on the 3D coordinates of ankle joint center only, thus neglecting the foot contact mechanism ([14], [19], [47], [48], [49], [50], [51]).
A. Spatial-Temporal Parameters
The proposed MS protocol showed a very good accuracy compliant with clinical requirements ([52], [53]) with MAE values equal to 1.2 cm for step length, 20 ms for stride duration, 2.5 cm for stride length and 0.02 m/s for the gait speed. The latter errors found on CP patients were comparable to those reported in previous single-camera studies but obtained on healthy subjects ([14], [17], [19], [22], [50], [51]).
To the best of the authors' knowledge, this is the first study specifically validating the spatial-temporal parameters in children with CP against an MB protocol. In the literature, there are only a few single-camera methods validated on post-stroke ([49], [54]) and parkinsonian patients [47], and they showed lower performance. In particular, Ferraris et al. [54] and Cimolin et al. [47] assessed the errors associated with the estimation of the spatial-temporal parameters using the Kinect v2 body tracking SDK on eleven post-stroke and ten parkinsonian subjects, respectively. Both studies found ME values equal to 0.02 m/s for gait speed and 2 cm for step length, larger than those found with the proposed MS protocol (0.01 m/s for gait speed and 0.06 cm for step length). In a recent study, Lonini et al. [49] evaluated the performance of the DeepLabCut software for the analysis of the gait of ten post-stroke patients using a single RGB camera, reporting a high error variability for gait speed (± 0.11 m/s in terms of ME).
B. Lower-Limb Joint Kinematics
When comparing the 2D joint kinematics estimated by the proposed MS method against the 3D joint kinematics provided by the reference MB protocol, it is convenient to discriminate between the effects associated with the use of different anatomical axis definitions and angular conventions and the actual estimation errors [42]. While the adoption of different anatomical axis definitions mainly results in an offset between the curves, errors in the reconstruction of the joint center trajectories affect the waveform similarity, which can be quantified by the RMSE after offset removal.
The average angular offset values were 8° for the ankle, 6° for the knee and 7° for the hip joint. The ankle offset can be partially explained considering that the foot antero-posterior axis in the MS protocol is computed as the principal axis of the best-fitting inertial ellipsoid, whereas in the MB protocol it is defined from the positions of the markers attached to the second metatarsal joint and the calcaneus. Similarly, the offset at the knee joint can be ascribed to the different definitions of the HJC in the MS and MB protocols. In fact, while in the MS protocol the HJC coincides with the position of the GT, in the MB protocol it is defined as the geometrical center of the acetabulum and is determined with anthropometric regression equations [2]. The offset at the hip joint is associated with the fact that, in the MS protocol, the pelvis inclination is assumed to be constant during the gait cycle and equal to the pelvic tilt during the static upright standing acquisition.
In terms of waveform similarity, the most accurate joint angle was obtained for the knee joint (RMSE = 3.2°), followed by the hip joint (RMSE = 3.5°) and the ankle joint (RMSE = 4.5°). It is important to highlight that, from a clinical perspective, errors between 2° and 5° are likely to be regarded as reasonable but may require consideration in data interpretation [55]. The largest errors, affecting the ankle kinematics, are mainly due to the auto-exposure of the camera, which can cause blurred images of the foot and of the distal part of the shank during the fastest movements, such as the swing phase.
In the last years, several single-camera MS methods were proposed for gait analysis, however, in many cases, a direct comparative evaluation with the proposed method was not possible because: (
To the best of the authors' knowledge, the only MS study involving CP children was presented by Nguyen et al. [46], who evaluated the concurrent validity of the built-in body tracking SDK (Kinect v2) against an MB gait protocol on 10 CP children (GMFCS I-II) based on a frontal view. However, the reported errors were large for all the joints (RMSE = 11.2° for the hip, 10.3° for the knee, and 7.5° for the ankle).
In addition, there are a few MS studies that have been only applied and evaluated on normal gait ([17], [19], [66]).
Yeung et al. [66] investigated the effect of five camera viewing angles on the kinematic curves estimated on healthy subjects with the Kinect v2 body tracking SDK. They found that the Kinect v2 performed best at the frontal camera viewing angle, with an RMSE of 8° for the hip flexion/extension angle, 11.4° for the sagittal knee angle, and 17.4° for the ankle plantar/dorsi flexion angle.
Yamamoto et al. [17] tested the performance of OpenPose on healthy subjects, reporting comparable results in terms of reliability for knee and hip kinematics (ICC ranging from 0.60 to 0.98) but very poor reliability for ankle kinematics: an ICC of 0.1 for the ankle maximum dorsiflexion during the stance phase and the swing phase, in contrast to the ICC of 0.66 obtained from our MS protocol.
Castelli et al. [19] reported an RMSE of 4.8° for the hip, 3.6° for the knee, and 3° for the ankle on 10 healthy adults, which are comparable with the values obtained using our method on CP children.
Interestingly, the proposed MS method provides a level of accuracy similar to that obtained by popular observation-based clinical gait assessment tools such as the Salford Gait Tool, which is based on the manual identification of the anatomical landmarks on each recorded image [67].
In particular, Larsen et al. [67] analyzed the accuracy of the Salford Gait Tool against an MB protocol on 10 adult CP patients. The ME values they reported for some key gait features are comparable to those of the proposed method, which therefore matches the performance of observation-based clinical gait assessment while requiring less effort from clinicians, since manual intervention is needed only to calibrate the three templates.
Conclusion
The proposed MS protocol was designed to prioritize usability in clinical settings in terms of set-up and cost. For this reason, it was decided to use a single consumer-grade RGB-D camera. Clearly, this choice inevitably limited the kinematic analysis to the joint movement in the sagittal plane and hence to the description of the flexion-extension lower limb joint angles.
Furthermore, the movement of the subject was reconstructed based on a 2D multi-segmental model defined from a single 2D RGB image, as the information provided by the depth sensor was only used to extract the foreground body segments. However, the projection of 3D human body motion onto a 2D space necessarily leads to errors and ambiguities, which could be only partially compensated by the proposed multiple anatomical calibration.
Another critical factor was the quality of the images recorded by the specific RGB-D camera. Due to the automatic exposure time implemented by the Kinect v2, the recorded images were blurred when capturing fast-moving body parts, which negatively affected the results of the template/mask matching. This problem could be solved by selecting a camera that allows the exposure parameters to be controlled.
Finally, it should be acknowledged that the proposed protocol is not fully automatic, as it requires a preliminary identification of the external anatomical landmarks for the subject-specific model definition. Nonetheless, it is important to consider that gait analysis in CP patients is generally preceded by a clinical examination during which the clinicians assess the range of joint motion and spasticity and may easily identify a few anatomical landmarks.
In conclusion, the present study demonstrated the technical validity of an MS single-camera protocol for clinical gait analysis in the CP population. Results showed good accuracy in the joint kinematics estimation and good to excellent reliability for the extraction of a complete set of clinically relevant key gait features.
NOTE
Open Access provided by 'Politecnico di Torino' within the CRUI CARE Agreement
Appendix
This appendix presents the procedure followed to calculate the conversion factor (m/pixel) used for the computation of the spatial-temporal parameters.
To convert the spatial parameters into meters, it is necessary to determine the pixel-to-meter conversion ratio. It should be noted that this conversion factor increases with the distance from the camera. To determine the conversion factor, a series of preliminary acquisitions was performed by positioning an object of known size at various distances from the camera. For each distance, the conversion factor was calculated as:\begin{equation*} Conversion\,factor\,(m/pixel)=\frac{Size\,(m)}{Size\,(pixel)}\end{equation*}
A linear model was then fitted to the conversion factor values, associating each distance with its corresponding conversion factor.
The spatial-temporal parameters in meters were calculated by applying the conversion factor to the corresponding values in pixels.
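The calibration procedure of this appendix can be sketched in Python as an ordinary least-squares line fit followed by a unit conversion. All numerical values below (object size, distances, pixel sizes) are invented for illustration and are not the calibration data of the study; the function names are hypothetical.

```python
def conversion_factor(size_m, size_px):
    """Conversion factor (m/pixel) for an object of known size."""
    return size_m / size_px

def fit_line(distances, factors):
    """Least-squares fit of factor = slope * distance + intercept."""
    n = len(distances)
    mx, my = sum(distances) / n, sum(factors) / n
    sxx = sum((d - mx) ** 2 for d in distances)
    sxy = sum((d - mx) * (f - my) for d, f in zip(distances, factors))
    slope = sxy / sxx
    return slope, my - slope * mx

def to_meters(length_px, distance_m, slope, intercept):
    """Convert a length in pixels measured at a given distance from the camera."""
    return length_px * (slope * distance_m + intercept)

# Invented calibration: a 0.5 m object seen at 2, 3 and 4 m from the camera.
distances_m = [2.0, 3.0, 4.0]
sizes_px = [250.0, 500.0 / 3.0, 125.0]
factors = [conversion_factor(0.5, px) for px in sizes_px]
slope, intercept = fit_line(distances_m, factors)

# A stride spanning 300 pixels, walked at 3.5 m from the camera.
stride_m = to_meters(300.0, 3.5, slope, intercept)
```

With these invented numbers the fitted line has a slope of about 0.001 m/pixel per meter and a near-zero intercept, so a 300-pixel stride at 3.5 m corresponds to about 1.05 m.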