A Model-Based Markerless Protocol for Clinical Gait Analysis Based on a Single RGB-Depth Camera: Concurrent Validation on Patients With Cerebral Palsy

Abstract:

Clinical gait analysis is a diagnostic tool often used for identifying and quantifying gait alterations in cerebral palsy (CP) patients. To date, 3D clinical gait analysis protocols based on motion capture systems featuring multiple infrared cameras and retroreflective markers attached to the subject's skin are considered the gold standard. However, the need for fully dedicated personnel and space, in addition to the inconvenience of multiple markers attached to the patient's body, limits their use in clinical practice. To shorten the time necessary to set up the patient and to limit his/her discomfort, motion tracking performed using markerless technologies may offer a promising alternative to marker-based motion capture. This study aims at proposing and validating, on 18 CP patients, an original markerless clinical gait analysis protocol based on a single RGB-depth camera. Accuracy and reliability of the spatial-temporal parameters and sagittal lower limb joint kinematics were assessed against a 3D marker-based clinical gait analysis protocol. The smallest percent mean absolute errors were obtained for stride duration (2%), followed by step and stride length (2.2% and 2.5%, respectively) and by gait speed (3.1%). The average angular offset values between the two protocols were 8° for the ankle, 6° for the knee, and 7° for the hip joint. The smallest root mean square error values were found for the knee joint kinematics (3.2°), followed by the hip (3.5°) and the ankle (4.5°). Both protocols showed good-to-excellent reliability. Thus, this study demonstrated the technical validity of a markerless single-camera protocol for clinical gait analysis in the CP population. The dataset containing markerless data from 10 CP patients, along with the MATLAB codes, has been made available.
Published in: IEEE Access ( Volume: 11)
Page(s): 144377 - 144393
Date of Publication: 07 December 2023
Electronic ISSN: 2169-3536

SECTION I.

Introduction

Clinical gait analysis is fundamental to understanding and interpreting the physio-pathological characteristics of human locomotion, and its importance as a clinical diagnostic tool is widely accepted. In particular, there is strong clinical evidence of its effectiveness in supporting the identification of optimal surgical procedures and the consequent rehabilitation pathways in children with bilateral cerebral palsy (CP) [1].

To date, standard 3D clinical gait analysis protocols are based on the use of infrared multi-camera systems to reconstruct the trajectories of markers attached to the patient's skin at specific locations [2], [3], [4] (marker-based, MB). Unfortunately, the routine use of MB protocols is limited by several issues, such as the need for highly qualified staff, the high cost of the equipment, and the poor acceptance of skin markers, which is particularly critical when dealing with younger patients.

To reduce the time associated with subject preparation and the related discomfort, motion tracking performed using markerless (MS) technologies may offer a promising alternative to MB motion capture. Recently, different MS multi-camera solutions have been proposed for three-dimensional (3D) joint kinematics analysis [5], [6], [7], [8], [9], [10], [11], [12]. However, a multi-camera set-up requires time for installation and for extrinsic camera calibration, and it is therefore not the ideal solution for ambulatory settings with no dedicated space.

Conversely, there are applications where a two-dimensional (2D) joint kinematic analysis is still clinically relevant (e.g., for screening purposes, to identify gait patterns, for follow-up over time, and to evaluate treatment). For these purposes, system portability, affordability, and user-friendliness are essential requirements. Methods based on the use of a single camera with minimum set-up time would therefore be preferred. Lately, several manufacturers have been producing inexpensive tracking systems (200-400 €/$) featuring an RGB camera integrated with an infrared depth sensor (RGB-depth). By combining the information of the RGB image with depth data, these systems can be used to generate depth-color images (2D+) without requiring a multi-camera set-up.

The single-camera MS methods proposed in the literature can be grouped into three categories: i) black-box methods based either on software development kits (SDK) integrated with proprietary hardware [13], [14], [15] or on commercial software (e.g., IPsoft iPi Biomech, MediaPipe Studio); ii) open-source methods based on deep learning approaches [5], [16], [17], [18]; and iii) replicable non-machine-learning methods [19], [20], [21], [22], [23], [24], [25], [26].

Generally, black-box methods are conceived mainly for animation or gaming purposes and are not compliant with clinical standards and terminology [27]. The major limitation of this category is related to its 'black box' functioning, which results in the inability to fine-tune model parameters for pathological data, at the expense of external validity and performance [28]. In addition, body tracking SDKs are developed for specific hardware solutions and are therefore difficult to generalize.

The majority of the open-source methods based on deep learning approaches (e.g., AlphaPose, OpenPose) are trained on synthetic generic movement data [5], [16], [17], [18], not necessarily gait, and their training relies on reference data that are not based on clinical gait analysis standards [29] (e.g., clear anatomical/functional rules for joint center definition [30]). Furthermore, the original training data sets do not include people with impaired gait; therefore, the performance of these methods is not optimized and their clinical validity is not established.

Thirdly, replicable non-machine-learning methods have the advantage of not requiring a specific training set, although they need to be optimized for the specific problem and their performance is not expected to improve with dataset size. The most common factors limiting the clinical applicability of past studies include the use of color filters and a homogeneous background for subject segmentation [19], [20], single-joint analysis [20], [21], [22], [23], and the lack of technical validation against gold standards and on pathological populations [19], [23], [24], [25], [26].

The aim of the present study is to propose and validate, on 18 CP patients, an original MS clinical gait analysis protocol based on a single RGB-depth camera. Accuracy and reliability of the sagittal lower limb joint kinematics and spatial-temporal parameters were assessed against a 3D MB clinical gait analysis protocol.

SECTION II.

Material and Methods

A. Subjects

Gait data were collected from 18 participants (4 females, 14 males) aged between 6.5 and 28 years (mean 15 years). Eleven participants had bilateral CP, three unilateral CP, three dyskinetic CP, and one ataxic CP. According to the Gross Motor Function Classification System (GMFCS), six participants were classified at level I, eleven at level II, and one at level III. The study was approved by the regional ethical review board in Gothenburg, Sweden (approval number 660-15).

B. Experimental Protocol

Instrumentation - An RGB-depth camera (Kinect 2 for Xbox One, Microsoft; RGB images: 1920×1080 pixels at 30 fps, FOV = 84.1° × 53.8°; depth images: 512×424 pixels at 30 fps, FOV = 70.6° × 60°) was positioned laterally at a 2.5-meter distance from the center of the walkway and a 5-meter distance from the background. The total Kinect capture volume was 7.08 m (length) × 5.77 m (height) × 5 m (width). The image coordinate system ($I$) of the video camera was aligned to the sagittal plane ($x_I$, $y_I$) identified by the direction of progression and the vertical direction. To prevent blurred images due to automatic exposure, two additional cool-white LED lamps (3360 lux/m) were used (Fig. 1).

FIGURE 1. Experimental setup.

Subject preparation – Each subject was asked to wear colored ankle socks (red for the right foot and blue for the left) and underwear. External anatomical landmarks including the lateral malleolus (LM), lateral epicondyle (LE), greater trochanter (GT), anterior superior iliac spine (ASIS), and posterior superior iliac spine (PSIS) were identified by palpation by an expert operator and marked with a black felt pen.

Data collection – Two static lateral views (right and left side) of the subject standing upright were captured at the beginning of the experimental session. Participants were then asked to walk at a comfortable self-selected speed along a straight 10-meter walkway. Ten gait trials per subject were recorded, including five right and five left full gait cycles. The dataset containing MS data from 10 CP patients has been uploaded on IEEE DataPort (RGB-Depth_CP_patients_POLITO_dataset | IEEE DataPort, ieee-dataport.org).

Validation – A 12-camera stereo-photogrammetric system (Oqus 400, Qualisys Medical AB, Gothenburg, Sweden) was used to collect 3D reference data at 100 fps. The capture volume was 14 m × 3 m × 8 m and completely included the Kinect capture volume. Thirty-eight retro-reflective spherical markers (14 mm diameter) were attached to the subjects according to the modified Helen Hayes protocol [2]. Calculation of the 3D reference joint angles was performed using the Visual3D software (C-Motion Inc., USA).

C. Image Pre-Processing

Calibration refinement and camera lens correction were implemented using the Heikkilä undistortion algorithm [31], [32]. A matching operation was then carried out, using the intrinsic and extrinsic parameters obtained from the calibration refinement of both the RGB and depth sensors, to overlap RGB and depth images of the same size (Nrow = 1080, Ncol = 1536).
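For illustration, a minimal MATLAB sketch of the lens-correction step is given below, assuming the intrinsic parameters have already been estimated; all numeric values and the file name are placeholders, not the parameters used in the study (requires the Computer Vision Toolbox):

    % Sketch of lens-distortion correction with assumed (placeholder) intrinsics.
    focalLength    = [1060 1060];      % [fx fy] in pixels (placeholder)
    principalPoint = [960 540];        % [cx cy] in pixels (placeholder)
    imageSize      = [1080 1920];      % [rows cols] of the RGB stream
    radialDist     = [0.05 -0.10];     % [k1 k2] (placeholder)

    intrinsics = cameraIntrinsics(focalLength, principalPoint, imageSize, ...
        'RadialDistortion', radialDist);

    rgbFrame  = imread('frame_0001.png');          % hypothetical file name
    rgbUndist = undistortImage(rgbFrame, intrinsics);
    imshowpair(rgbFrame, rgbUndist, 'montage');    % visual check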

D. Method Description

The proposed method consisted of four main stages: gait cycle identification, subject segmentation, subject-specific model calibration, and joint center trajectory estimation (Fig. 2). The MATLAB codes have been made available on GitHub (https://github.com/dilettabalta/ModelBased_MarkerlessProtocol.git).

FIGURE 2. Block diagram of the proposed MS protocol.

1) Gait Cycle Identification

From each gait trial, the most central gait cycle was selected and analyzed based on the identification of the initial foot contacts. To this purpose, a specific algorithm was developed to account for the different types of foot-ground contact commonly encountered in subjects with CP, as shown in Fig. 3 [33].

FIGURE 3. Different types of foot contact: a) fore-foot contact in equinus gait; b) foot-flat contact in individuals who walk in a crouch gait with excessive knee flexion; c) rear-foot contact in patients classified at a low level of the GMFCS.

Specifically, for each video frame, a binary segmentation mask ${}^{I}\mathbf{M}_{foot}$, expressed in the image coordinate system $I$, was obtained for each foot by applying a segmentation technique based on color filters [34]. An ellipse was fitted on each ${}^{I}\mathbf{M}_{foot}$ (Fig. 4a). Then, a foot coordinate system ($f$) was defined with the axes coincident with the inertial ellipsoid principal axes and the origin coinciding with the centroid. The transformation matrix ${}^{I}\mathbf{T}_{f}$ from $f$ to $I$ was computed by simple geometrical rules and applied to transform ${}^{I}\mathbf{M}_{foot}$ into $f$ (${}^{f}\mathbf{M}_{foot}$). From ${}^{f}\mathbf{M}_{foot}$, the point $Q$ (${}^{f}\mathbf{Q} = [Q_{xf}, Q_{yf}]$) with the highest y-coordinate was identified, the foot points included between $Q_{yf}$ and $Q_{yf} - \epsilon$ were isolated, and the least-squares fitting line was computed to approximate the sole of the foot (Fig. 4b). A similar procedure was implemented to reconstruct the line approximating the posterior contour of the foot, starting from the point $R$ with the lowest x-coordinate (Fig. 4b).

FIGURE 4. Identification of the MRF and FF. a) An ellipse was fitted on each foot; the centroid and the principal axes ($x_f$, $y_f$) were identified. b) The intersection between the sole of the foot (light blue area) and the posterior area (light green area) of the foot was identified as the mid-rear foot (MRF), while the extremity of the foot along the x-axis was identified as the forefoot (FF).

The mid-rear foot (MRF) position in $f$ (${}^{f}\mathbf{MRF} = [MRF_{xf}, MRF_{yf}]$) was identified as the intersection between the sole and posterior lines of the foot, while the forefoot (FF) position in $f$ (${}^{f}\mathbf{FF} = [FF_{xf}, FF_{yf}]$) was identified as the point with the highest x-component (Fig. 4). ${}^{f}\mathbf{MRF}$ and ${}^{f}\mathbf{FF}$ were then transformed into $I$ based on ${}^{I}\mathbf{T}_{f}$.
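A minimal MATLAB sketch of the ellipse-based foot frame definition is given below, assuming footMask is a logical foot mask (Image Processing Toolbox); the sign conventions may need to be adapted to the image y-down axis:

    % Foot coordinate system from the inertial-ellipse fit of the foot mask.
    stats = regionprops(footMask, 'Centroid', 'Orientation');
    theta = -deg2rad(stats(1).Orientation);   % regionprops angle is CCW from x-axis
    c     = stats(1).Centroid;                % ellipse centroid = frame origin

    % Homogeneous transformation from foot frame f to image frame I
    I_T_f = [cos(theta) -sin(theta) c(1);
             sin(theta)  cos(theta) c(2);
             0           0          1   ];

    % Express foot pixels in f and locate the point Q (max y in f)
    [yI, xI] = find(footMask);
    ptsF = I_T_f \ [xI'; yI'; ones(1, numel(xI))];   % inverse mapping I -> f
    [~, iQ] = max(ptsF(2, :));
    Q = ptsF(1:2, iQ);                               % seed for the sole line fit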

The foot points MRF and FF were assumed to be in contact with the ground when their vertical velocity along $y_I$ and horizontal velocity along $x_I$ were below a given threshold $Th$ (Fig. 5a). Then, the foot initial contact (IC) was identified as the first time instant characterized by zero velocity of MRF and FF.
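A compact sketch of this rule follows, assuming mrf and ff are T-by-2 pixel trajectories of MRF and FF in $I$ sampled at 30 fps; the threshold value is a placeholder, not the one tuned in the study:

    % Zero-velocity initial-contact rule (illustrative threshold).
    fps = 30; Th = 30;                                 % pixels/s (placeholder)
    vel = @(p) abs(diff(p)) * fps;                     % per-axis frame-to-frame speed
    statMRF = all(vel(mrf) < Th, 2);                   % |vx| and |vy| below Th
    statFF  = all(vel(ff)  < Th, 2);

    % IC: first frame at which the foot points become quasi-static
    ic = find(statMRF & statFF, 1, 'first') + 1;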

FIGURE 5. Computation of stride length and duration, step length, and gait speed. a) Velocity of the MRF and FF coordinates. The bold red line defines the first initial contact (IC #1) while the blue one represents the following initial contact (IC #2). The green areas represent the intervals in which MRF and FF are assumed to be in contact with the ground (stationary condition). b) The stride length is the distance between two consecutive initial contacts of the foreground foot (IC #1 and IC #2, in orange). The step length is the distance between the initial contact of the foreground foot (IC #1, in orange) and the initial contact (IC #1, in blue) of the contralateral one.

The gait cycle was identified by two consecutive ICs of the foreground foot, whereas the step was identified by the first IC of the foreground foot and the subsequent IC of the background foot.

Spatial-temporal parameters (stride length, step length, and stride duration) were calculated based on the positions of the relevant foot points (MRF/FF) at the IC instants (Fig. 5b). Finally, gait speed was computed by applying a pixel-to-meter conversion factor (see Appendix A for a detailed description).

2) Subject Segmentation

For each frame, a preliminary background subtraction was performed between the RGB image ${}^{I}\mathbf{I}$, containing the subject, and the background image ${}^{I}\mathbf{B}$ to obtain the difference image ${}^{I}\mathbf{D}$ as follows:

$$^{I}D(x,y,c) = \left| {}^{I}I(x,y,c) - {}^{I}B(x,y,c) \right|$$

where ${}^{I}D(x,y,c)$, ${}^{I}I(x,y,c)$, and ${}^{I}B(x,y,c)$ are the generic pixels expressed in $I$ and $c = [r, g, b]$ is the color channel vector.

The resulting difference image ${}^{I}\mathbf{D}$ was converted to grayscale ${}^{I}\mathbf{D}_{gray}$ by computing the norm of the color channels of each pixel:

$$^{I}D_{gray}(x,y) = \sqrt{ {}^{I}D(x,y,r)^{2} + {}^{I}D(x,y,g)^{2} + {}^{I}D(x,y,b)^{2} }$$

The subject was separated from the background by applying a proper threshold to the grey levels of the image pixels. The threshold level was set to the weighted mean of the grayscale histogram [35]:

$$Th = \frac{\sum_{i=0}^{255} w_i \cdot g_i}{\sum_{i=0}^{255} w_i}$$

where $w_i$ is the histogram count (occurrence) for the $i$-th grayscale level ($g_i: 0, \ldots, 255$).

The segmentation mask ${}^{I}\mathbf{M}_{sub}$ was obtained from ${}^{I}\mathbf{D}_{gray}$ as follows:

$$^{I}M_{sub}(x,y) = \begin{cases} 1, & {}^{I}D_{gray}(x,y) \ge Th \\ 0, & \text{otherwise} \end{cases}$$

Undesired residual small regions due to noise or time-variant shadows were removed under the assumption that the subject corresponds to the largest connected area.
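A minimal MATLAB sketch of this segmentation step follows, assuming rgbI (frame with the subject) and rgbB (empty background) are undistorted uint8 images of identical size (Image Processing Toolbox for bwareafilt):

    % Background subtraction, weighted-mean threshold, largest-blob cleanup.
    D  = abs(double(rgbI) - double(rgbB));        % |I - B| per channel
    Dg = sqrt(sum(D.^2, 3)) / sqrt(3);            % channel norm, rescaled to 0-255

    g  = 0:255;                                   % grayscale levels
    w  = histcounts(Dg, 0:256);                   % histogram counts
    Th = sum(w .* g) / sum(w);                    % weighted-mean threshold

    Msub = Dg >= Th;                              % raw subject mask
    Msub = bwareafilt(Msub, 1);                   % keep largest connected area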

The feet segmentation was then refined by implementing a color filter technique exploiting the colored socks, to avoid inaccuracies due to the presence of shadows as the foot approaches the ground.

3) Multi-Segmental Model Definition

A 2D subject-specific kinematic lower limb model was introduced to estimate the lower limb joint angles. The model consisted of four body segments (foot, shank, thigh, and pelvis) connected by revolute joints (ankle, knee, and hip) for a total of 6 degrees of freedom (DoF). The foot was assumed to be the parent segment and its motion was characterized by two translational and one rotational DoF. The ankle joint center (AJC) was made to coincide with the lateral malleolus (LM), the knee joint center (KJC) with the lateral epicondyle (LE), and the hip joint center (HJC) with the greater trochanter (GT).

4) Anatomical Calibration and Body Segment Templates Definition

The body segment templates and the relevant coordinate systems were calibrated on the static upright standing acquisition (image "0") by manually selecting the anatomical landmarks (LM, LE, GT, ASIS, PSIS) to obtain their position vectors in $I$ (${}^{I}\mathbf{LM}_0$, ${}^{I}\mathbf{LE}_0$, ${}^{I}\mathbf{GT}_0$, ${}^{I}\mathbf{ASIS}_0$, ${}^{I}\mathbf{PSIS}_0$). The identification of the points MRF and FF (${}^{I}\mathbf{MRF}_0$ and ${}^{I}\mathbf{FF}_0$) was obtained following the procedure presented in par. 1). To account for potential right/left asymmetries, the subject-specific model was defined for both sides.

a: Foot Template

From ${}^{I}\mathbf{M}_{foot}$, the mid-rear foot portion was extracted to define a template ${}^{I}\mathbf{TMP}_{foot}$, where the value of its generic pixel ${}^{I}TMP_{foot}(x,y)$ in $I$ was obtained as:

$$^{I}TMP_{foot}(x,y) = \begin{cases} 1, & {}^{I}M_{foot}(x,y) = 1 \;\cap\; MRF_{xI} < x < MRF_{xI} + 0.9\,l_f \\ 0, & \text{otherwise} \end{cases}$$

where ${}^{I}M_{foot}(x,y)$ is a generic pixel of ${}^{I}\mathbf{M}_{foot}$ expressed in $I$ and $l_f$ is the distance between ${}^{I}\mathbf{MRF}_0$ and ${}^{I}\mathbf{FF}_0$ (Fig. 6).

FIGURE 6. Body segment templates definition for the right side.

The foot coordinate system $f_0$ was defined as described in par. 1), and the transformation matrix ${}^{I}\mathbf{T}_{f_0}$ from $f_0$ to $I$ was determined and applied to transform ${}^{I}\mathbf{TMP}_{foot}$ into $f_0$ (${}^{f_0}\mathbf{TMP}_{foot}$).

b: Shank Template

The central shank portion was extracted as the region included in the annulus centered in ${}^{I}\mathbf{LM}_0$ and defined by the radii $l_{shank25}$ and $l_{shank75}$, equal to 25% and 75% of the distance between ${}^{I}\mathbf{LM}_0$ and ${}^{I}\mathbf{LE}_0$, respectively (Fig. 6).

Then, the generic pixel ${}^{I}TMP_{shank}(x,y)$ of ${}^{I}\mathbf{TMP}_{shank}$ in $I$ was obtained as:

$$^{I}TMP_{shank}(x,y) = \begin{cases} 1, & {}^{I}M_{sub}(x,y) = 1 \;\cap\; l_{shank25} < \sqrt{x^2 + y^2} < l_{shank75} \\ 0, & \text{otherwise} \end{cases}$$

An ellipse was fitted on ${}^{I}\mathbf{TMP}_{shank}$. Then, a shank coordinate system ($s_0$) was defined with the axes coincident with the inertial ellipsoid principal axes and the origin coinciding with the centroid. The transformation matrix ${}^{I}\mathbf{T}_{s_0}$ from $s_0$ to $I$ was computed by simple geometrical rules and applied to transform ${}^{I}\mathbf{TMP}_{shank}$ into $s_0$ (${}^{s_0}\mathbf{TMP}_{shank}$).
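As a concrete illustration of the annulus-based extraction (the thigh template is analogous), a short MATLAB sketch is given below, assuming Msub is the subject mask and lm0 and le0 are the [x y] pixel positions of LM and LE picked on the static image:

    % Annulus mask between 25% and 75% of the LM-LE distance, centered in LM.
    L = norm(le0 - lm0);                              % LM-LE distance (pixels)
    [cols, rows] = meshgrid(1:size(Msub,2), 1:size(Msub,1));
    r = hypot(cols - lm0(1), rows - lm0(2));          % pixel distance from LM

    TMPshank = Msub & (r > 0.25*L) & (r < 0.75*L);    % central shank portion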

c: Thigh Template

The central thigh portion was extracted as the region included in the annulus centered in ${}^{I}\mathbf{LE}_0$ and defined by the radii $l_{thigh25}$ and $l_{thigh75}$, equal to 25% and 75% of the distance between ${}^{I}\mathbf{LE}_0$ and ${}^{I}\mathbf{GT}_0$, respectively (Fig. 6).

Then, the generic pixel ${}^{I}TMP_{thigh}(x,y)$ of ${}^{I}\mathbf{TMP}_{thigh}$ in $I$ was obtained as:

$$^{I}TMP_{thigh}(x,y) = \begin{cases} 1, & {}^{I}M_{sub}(x,y) = 1 \;\cap\; l_{thigh25} < \sqrt{x^2 + y^2} < l_{thigh75} \\ 0, & \text{otherwise} \end{cases}$$

Similarly to ${}^{I}\mathbf{TMP}_{shank}$, the thigh coordinate system $t_0$ and the transformation matrix ${}^{I}\mathbf{T}_{t_0}$ from $t_0$ to $I$ were defined and applied to transform ${}^{I}\mathbf{TMP}_{thigh}$ into $t_0$ (${}^{t_0}\mathbf{TMP}_{thigh}$).

d: Pelvis

The pelvis inclination with respect to $x_I$ was determined during the static upright standing acquisition based on the positions of ${}^{I}\mathbf{ASIS}_0$ and ${}^{I}\mathbf{PSIS}_0$ (Fig. 6).

5) Joint Centers Trajectories Estimation

For each frame of the gait cycle, the joint center positions were identified following a bottom-up tracking approach from the foot to the pelvis.

a: Ankle Joint Center (AJC) Estimation

The foreground foot was extracted from the RGB image based on color filters (${}^{I}\mathbf{M}_{foot}$) and the posterior part of the foot was isolated following the same procedure presented in par. 4). The foot coordinate system $f$ and the relevant transformation matrix ${}^{I}\mathbf{T}_{f}$ were defined as described in par. 1) (Fig. 7a).

FIGURE 7. Ankle joint center estimation. a) ${}^{I}\mathbf{M}_{foot}$ and its relevant ${}^{I}\mathbf{T}_f$. b) ${}^{I}\mathbf{TMP}_{foot}$ with ${}^{I}\mathbf{LM}_0$. c) ${}^{I}\mathbf{TMP}_{foot}$ and ${}^{I}\mathbf{M}_{foot}$ in the common $I$. d) The origins of $f$ and $f_0$ were made to coincide and e) ${}^{f_0}\mathbf{TMP}_{foot}$ was matched with ${}^{f}\mathbf{M}_{foot}$ and the relevant matrix ${}^{f_0}\mathbf{T}_f$ determined.

After expressing ${}^{I}\mathbf{TMP}_{foot}$ (Fig. 7b) and ${}^{I}\mathbf{M}_{foot}$ in the common $I$ (Fig. 7c), the origins of $f$ and $f_0$ were first made to coincide (Fig. 7d); then, using an iterative closest point (ICP) technique [36], ${}^{f_0}\mathbf{TMP}_{foot}$ was matched with ${}^{f}\mathbf{M}_{foot}$ and the relevant matrix ${}^{f_0}\mathbf{T}_f$ (4×4) was determined (Fig. 7e).

Finally, the ${}^{I}\mathbf{AJC}$ position in $I$, coincident with ${}^{I}\mathbf{LM}$, was obtained for each frame based on the position of LM in the template (${}^{I}\mathbf{LM}_0$) by applying three subsequent transformations:

$$^{I}\mathbf{AJC} \equiv {}^{I}\mathbf{LM} = {}^{I}\mathbf{T}_{f}\,{}^{f}\mathbf{T}_{f_0}\,{}^{f_0}\mathbf{T}_{I}\,{}^{I}\mathbf{LM}_0$$
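The exact ICP implementation of [36] is not reproduced here; the following self-contained 2D MATLAB sketch illustrates the template/mask matching using nearest-neighbour correspondences (base MATLAB dsearchn) and an SVD-based rigid fit. Here tmp and msk are assumed N-by-2 and M-by-2 point sets (template and current mask pixels) already expressed in frames with coincident origins:

    % Minimal 2-D ICP: align the template points onto the mask points.
    R = eye(2); t = [0; 0];
    pts = tmp;                                        % moving point set
    for it = 1:50
        idx  = dsearchn(msk, pts);                    % nearest mask point
        q    = msk(idx, :);
        muP  = mean(pts); muQ = mean(q);
        H    = (pts - muP)' * (q - muQ);              % cross-covariance
        [U, ~, V] = svd(H);
        Rit  = V * diag([1, sign(det(V*U'))]) * U';   % proper rotation (Kabsch)
        tit  = muQ' - Rit * muP';
        pts  = (Rit * pts' + tit)';                   % update moving points
        R = Rit * R; t = Rit * t + tit;               % accumulate transform
    end
    f0_T_f = [R t; 0 0 1];                            % homogeneous estimate

A convergence test on the residual distance would normally replace the fixed iteration count.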

b: Knee Joint Center (KJC) Estimation

The separation between the foreground and background shanks was carried out using two alternative strategies, depending on whether or not the foreground and background shanks overlapped.

To discriminate between the overlap/non-overlap conditions, a circle centered in ${}^{I}\mathbf{LM}$ with radius equal to the distance between ${}^{I}\mathbf{LM}_0$ and ${}^{I}\mathbf{LE}_0$ was drawn. If there was no overlap, the segmentation mask ${}^{I}\mathbf{M}_{sub}$ was grouped into two separate regions, and the foreground shank, being closer to the camera, coincided with the largest area (Fig. 8a).

FIGURE 8. Separation between foreground and background shanks. A circle centered in ${}^{I}\mathbf{LM}$ with radius equal to the Euclidean distance between ${}^{I}\mathbf{LM}_0$ and ${}^{I}\mathbf{LE}_0$ was drawn. a) No overlap between the two shanks: two separate regions were found (left) and the foreground shank ${}^{I}\mathbf{M}_{shank}$ was identified (right). b) Overlap between the two shanks: a single connected region was found, the histogram of depth values inside the region was computed, and the Otsu method was implemented to separate the two shanks (left); the foreground shank ${}^{I}\mathbf{M}_{shank}$ was identified (right).

Conversely, when there was overlap, a single connected region was found, and auxiliary depth sensor data were used to separate the foreground and background shanks (Fig. 8b). To this purpose, the histogram of depth values within the region was computed and the Otsu method [37] was applied for a binary classification (class 0: foreground shank; class 1: background shank) based on the maximization of the between-class variance.
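A minimal MATLAB sketch of the depth-based separation follows, assuming depthVals holds the depth readings of the pixels inside the circular ROI (Image Processing Toolbox for multithresh/imquantize):

    % Otsu thresholding of depth values inside the ROI.
    thr   = multithresh(depthVals, 1);     % single Otsu threshold (binary case)
    label = imquantize(depthVals, thr);    % 1 = nearer class, 2 = farther class

    fgIdx = (label == 1);                  % foreground shank: closer to camera
    % For the three-class thigh/hand case (Fig. 9a), two thresholds would
    % be requested instead: thr = multithresh(depthVals, 2).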

The central portion of the foreground shank (${}^{I}\mathbf{M}_{shank}$) was extracted as the region included in the annulus centered in ${}^{I}\mathbf{LM}$ and defined by the radii $l_{shank25}$ and $l_{shank75}$.

The shank coordinate system $s$ was defined with the axes coincident with the inertial ellipsoid principal axes of ${}^{I}\mathbf{M}_{shank}$, and the transformation matrix ${}^{I}\mathbf{T}_{s}$ from $s$ to $I$ was computed. Similarly to the ankle joint center estimation, the origins of $s$ and $s_0$ were made to coincide; then, using an ICP technique [36], ${}^{s_0}\mathbf{TMP}_{shank}$ was matched with ${}^{s}\mathbf{M}_{shank}$ and the relevant matrix ${}^{s_0}\mathbf{T}_{s}$ (4×4) was determined.

Finally, the ${}^{I}\mathbf{KJC}$ position in $I$, made to coincide with ${}^{I}\mathbf{LE}$, was obtained for each frame based on the position of LE in the shank template (${}^{I}\mathbf{LE}_0$) by applying three subsequent transformations:

$$^{I}\mathbf{KJC} \equiv {}^{I}\mathbf{LE} = {}^{I}\mathbf{T}_{s}\,{}^{s}\mathbf{T}_{s_0}\,{}^{s_0}\mathbf{T}_{I}\,{}^{I}\mathbf{LE}_0$$

c: Hip Joint Center (HJC) Estimation

To separate the foreground thigh from the background thigh and from the hand during arm oscillation, two alternative procedures were implemented, depending on whether or not the foreground hand was superimposed on the foreground thigh. Preliminarily, a circle centered in ${}^{I}\mathbf{LE}$ with radius equal to the distance between ${}^{I}\mathbf{LE}_0$ and ${}^{I}\mathbf{GT}_0$ was drawn; the envelope of the histogram of the depth values of the pixels within this circle was computed, and its maxima were identified.

In case of foreground hand superimposition, three peaks, corresponding to the foreground hand (class 0), the foreground thigh (class 1), and the background thigh (class 2), were found on the histogram envelope (Fig. 9a); the Otsu method [37] was then applied for a three-class classification. Otherwise (Fig. 9b), a binary classification was implemented (class 0: foreground thigh; class 1: background thigh).

FIGURE 9. Separation between foreground and background thighs. A circle centered in ${}^{I}\mathbf{LE}$ with radius equal to the Euclidean distance between ${}^{I}\mathbf{LE}_0$ and ${}^{I}\mathbf{GT}_0$ was drawn; the histogram of depth values inside the region was computed and its envelope calculated. a) Foreground hand superimposed on the foreground thigh: the envelope was characterized by three peaks and the Otsu method was implemented for a three-class classification. b) No foreground hand inside the circular region: only two peaks were present in the envelope and the Otsu method was implemented for a binary classification.

The central portion of the foreground thigh (${}^{I}\mathbf{M}_{thigh}$) was extracted as the region included in the annulus centered in ${}^{I}\mathbf{LE}$ and defined by the radii $l_{thigh25}$ and $l_{thigh75}$.

The thigh coordinate system $t$ was defined with the axes coincident with the inertial ellipsoid principal axes of ${}^{I}\mathbf{M}_{thigh}$, and the transformation matrix ${}^{I}\mathbf{T}_{t}$ from $t$ to $I$ was computed. The origins of $t$ and $t_0$ were made to coincide; then, using an ICP technique [36], ${}^{t_0}\mathbf{TMP}_{thigh}$ was matched with ${}^{t}\mathbf{M}_{thigh}$ and the relevant matrix ${}^{t_0}\mathbf{T}_{t}$ (4×4) was determined.

Finally, the ${}^{I}\mathbf{HJC}$ position in $I$, made to coincide with ${}^{I}\mathbf{GT}$, was obtained for each frame based on the position of GT in the thigh template (${}^{I}\mathbf{GT}_0$) by applying three subsequent transformations:

$$^{I}\mathbf{HJC} \equiv {}^{I}\mathbf{GT} = {}^{I}\mathbf{T}_{t}\,{}^{t}\mathbf{T}_{t_0}\,{}^{t_0}\mathbf{T}_{I}\,{}^{I}\mathbf{GT}_0$$

6) Subject-Specific Models Calibration

It must be highlighted that, within the recorded gait cycle, the size and shape of the lower limb body segments vary due to soft tissue deformation [38], changes in the subject's position relative to the camera field of view, and potential out-of-plane movements, thus limiting the effectiveness of the matching procedure between the body segment templates and the segmented body segment masks. To overcome these limitations, a multiple calibration procedure [39] was implemented based on three sets of body segment templates: the first defined from the standing posture (Fig. 10a), and the second and third from frames selected during the loading and swing phases of the gait cycle, respectively (Fig. 10b and 10c). The procedure for the identification of the joint center trajectories described in par. 5) was then repeated using the additional templates, thus obtaining three different trajectories for each joint center.

FIGURE 10. Sets of body segment templates defined during the static phase (a), loading phase (b), and swing phase (c).

7) Joint Kinematics Estimation

Joint kinematics was determined based on the segment inclinations as defined by the lines connecting the joint centers. For the ankle, the plantar-dorsiflexion angle was determined as the angle between the ${}^{I}\mathbf{AJC}$-${}^{I}\mathbf{TOE}$ and ${}^{I}\mathbf{AJC}$-${}^{I}\mathbf{KJC}$ vectors; the knee flexion-extension angle was determined as the angle between the ${}^{I}\mathbf{AJC}$-${}^{I}\mathbf{KJC}$ and ${}^{I}\mathbf{KJC}$-${}^{I}\mathbf{HJC}$ vectors. The hip flexion-extension angle was determined as the angle between the ${}^{I}\mathbf{KJC}$-${}^{I}\mathbf{HJC}$ vector and the time-invariant direction identified by the ${}^{I}\mathbf{ASIS}_0$-${}^{I}\mathbf{PSIS}_0$ vector (pelvic tilt) during the standing posture. For each joint, three kinematic curves were obtained based on the three sets of body segment templates. These curves were then combined into a single curve by a nonlinear sinusoidal weight function [40] based on the percentage of the gait cycle.
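The exact sinusoidal weight function of [40] is not reproduced here; the raised-cosine weights in the MATLAB sketch below are an assumed stand-in that merely illustrates a smooth, gait-cycle-dependent hand-over between the per-template curves (thetaLoad, thetaSwing, and thetaStatic are assumed 1-by-100 joint-angle curves):

    % Illustrative sinusoidal blending of per-template kinematic curves.
    gc = linspace(0, 1, 100);                  % 0-100% of the gait cycle
    wSwing = 0.5 * (1 - cos(pi * gc));         % 0 at 0%, 1 at 100%
    wLoad  = 1 - wSwing;

    theta = wLoad .* thetaLoad + wSwing .* thetaSwing;
    % thetaStatic could enter the blend near 0% and 100% in the same way.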

E. Performance Assessment and Statistical Analysis

The accuracy of the gait event identification was evaluated by computing the time differences, in terms of mean absolute error (MAE) and mean error (ME), between the gait events found by visual inspection of the RGB images and those estimated by the automatic MS method, over trials and subjects.

The estimated spatial-temporal gait parameters were assessed in terms of MAE, MAE%, ME, and ME% with respect to the estimates provided by the 3D MB protocol, over trials and subjects.

Before comparison, both the MS and MB kinematic curves were filtered using a fourth-order Butterworth filter (cut-off frequency of 7 Hz) and time-normalized to the gait cycle (1-100%) [41].
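A minimal MATLAB sketch of this conditioning step follows, assuming theta is a raw joint-angle vector over one gait cycle sampled at fs Hz (Signal Processing Toolbox for butter/filtfilt); whether the authors applied the filter with zero-lag filtering, which doubles the effective order, is not specified, so the code below is illustrative only:

    % Low-pass filtering and time-normalization of one kinematic curve.
    fs = 30; fc = 7;
    [b, a] = butter(4, fc / (fs/2));                   % 4th-order low-pass design
    thetaF = filtfilt(b, a, theta);                    % zero-lag filtering

    gcIn   = linspace(1, 100, numel(thetaF));          % original samples -> % GC
    thetaN = interp1(gcIn, thetaF, 1:100, 'spline');   % 1-100% normalization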

For each subject, gait trial, and joint, the performance of the proposed MS method was assessed in terms of offset and waveform similarity [42]. The offset was computed as the absolute difference between the mean values of the MS ($\overline{MS}$) and MB ($\overline{MB}$) kinematic curves within a gait cycle:

$$Offset_{s,t,j} = \left| \overline{MB_{s,t,j}} - \overline{MS_{s,t,j}} \right|$$

For each joint, these values were then averaged across trials and subjects:

$$Offset_{j} = \frac{1}{N_S}\sum_{s=1}^{N_S} \frac{1}{N_T}\sum_{t=1}^{N_T} Offset_{s,t,j}$$

where $N_T = 10$ is the number of trials and $N_S = 18$ is the number of subjects.

For each subject, gait trial, and joint, the waveform similarity was evaluated as the root mean square error (RMSE) of the MS joint kinematic curves with respect to the MB joint kinematic curves, after removing their mean values [42]:

$$RMSE_{s,t,j} = RMS\left( \left( MB_{s,t,j} - \overline{MB_{s,t,j}} \right) - \left( MS_{s,t,j} - \overline{MS_{s,t,j}} \right) \right)$$

For each joint, these values were then averaged across trials and subjects:

$$RMSE_{j} = \frac{1}{N_S}\sum_{s=1}^{N_S} \frac{1}{N_T}\sum_{t=1}^{N_T} RMSE_{s,t,j}$$
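The two per-trial agreement metrics reduce to a few lines of MATLAB; in the sketch below, mb and ms are assumed 1-by-100 time-normalized curves for one subject/trial/joint:

    % Offset and de-meaned RMSE between MB and MS curves.
    offset = abs(mean(mb) - mean(ms));                 % angular offset

    dmb  = mb - mean(mb);                              % de-meaned curves
    dms  = ms - mean(ms);
    rmse = sqrt(mean((dmb - dms).^2));                 % waveform similarity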

In addition, a set of clinically relevant key gait features was extracted according to [43] from the MB and MS sagittal lower limb joint kinematics after offset removal (Fig. 11; a minimal extraction sketch is given after the list):

  1. the knee flexion at the initial contact (0% of the gait cycle);

  2. the knee maximum flexion during the loading response (0-40% of the gait cycle);

  3. the knee maximum extension during the stance phase (25-75% of the gait cycle);

  4. the knee maximum flexion during the swing phase (50-100% of the gait cycle);

  5. the ankle maximum dorsiflexion during the stance phase (25-75% of the gait cycle);

  6. the ankle maximum dorsiflexion during the swing phase (50-100% of the gait cycle);

  7. the hip maximum extension during the stance phase (25-75% of the gait cycle).
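The extraction reduces to windowed extrema on the normalized curves. In the MATLAB sketch below, knee, ankle, and hip are assumed 1-by-100 curves (index i = i% of the gait cycle, with initial contact at index 1) with flexion/dorsiflexion positive; the variable names are hypothetical:

    % Key gait features as windowed extrema of the normalized curves.
    kneeIC       = knee(1);            % knee flexion at initial contact
    kneeMaxLoad  = max(knee(1:40));    % max knee flexion, loading response
    kneeMaxExtSt = min(knee(25:75));   % max knee extension = min flexion, stance
    kneeMaxSwing = max(knee(50:100));  % max knee flexion, swing
    ankleMaxDfSt = max(ankle(25:75));  % max ankle dorsiflexion, stance
    ankleMaxDfSw = max(ankle(50:100)); % max ankle dorsiflexion, swing
    hipMaxExtSt  = min(hip(25:75));    % max hip extension = min flexion, stance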

FIGURE 11. Key gait features extracted from sagittal lower limb joint kinematics.

For each key gait feature, the MAE and ME with respect to the MB estimates were computed along with their 95% confidence intervals (95% CI). Normality of the key gait feature distributions was assessed by applying the Shapiro-Wilk test. After verifying normality, a two-sample t-test with a significance level of 5% was implemented to quantify differences between the MS and MB methods; p-values less than 0.05 were considered statistically significant.

For each gait feature and each method (MS and MB), reliability was evaluated with the intraclass correlation coefficient based on absolute agreement and two-way random effects (ICC(2,k)), computed using the formulas reported in [44] on the data collected over subjects (n = 18) for the different gait cycles (k = 10).

ICC values less than 0.5 indicate poor reliability, values between 0.5 and 0.75 indicate moderate reliability, values between 0.75 and 0.9 indicate good reliability, and values greater than 0.90 indicate excellent reliability [44].
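A MATLAB sketch of ICC(2,k) (two-way random effects, absolute agreement, average of k measurements) from the classical mean squares, following the Shrout-Fleiss formulation underlying [44], is given below; X is assumed to be an n-by-k matrix (n = 18 subjects, k = 10 gait cycles) of one gait feature:

    % ICC(2,k) from two-way ANOVA mean squares.
    [n, k] = size(X);
    mRow = mean(X, 2); mCol = mean(X, 1); mAll = mean(X(:));

    MSR = k * sum((mRow - mAll).^2) / (n - 1);          % between-subjects MS
    MSC = n * sum((mCol - mAll).^2) / (k - 1);          % between-trials MS
    SSE = sum((X - mRow - mCol + mAll).^2, 'all');      % residual sum of squares
    MSE = SSE / ((n - 1) * (k - 1));

    ICC2k = (MSR - MSE) / (MSR + (MSC - MSE) / n);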

Spearman's correlation coefficient (R) was used to assess the association between the MS and MB estimates. R values can be interpreted as follows: values below 0.19 indicate a negligible relationship, values between 0.2 and 0.29 a weak relationship, values between 0.3 and 0.39 a moderate relationship, values between 0.4 and 0.69 a strong relationship, and values greater than 0.70 a very strong relationship [45].

SECTION III.

Results

Results for gait event identification, spatial-temporal gait parameters, and lower limb joint kinematics are reported in Table 1. Regarding the spatial-temporal parameters, the smallest errors in terms of MAE% were obtained for stride duration (2%), followed by step and stride length (2.2% and 2.5%, respectively) and by gait speed (3.1%).

TABLE 1. Gait events: mean absolute error (MAE) and mean error (ME), in seconds, between the visual inspection of the RGB images and the automatic identification of the initial contacts for both feet. Spatial-temporal parameters: MAE and ME of gait speed, stride length, stride duration, and step length. Lower limb joint kinematics: average root-mean-square error (RMSE) between the joint kinematic curves estimated by the MS method and the MB system, computed over the gait cycle and averaged across trials and subjects.

The RMSE values computed for the lower-limb joint kinematics ranged between 3.2° and 4.5°: the smallest values were found for the knee joint kinematics (3.2°), followed by the hip (3.5°) and the ankle (4.5°).

Results for the extracted key gait features in terms of ME, MAE, and their 95% CI are summarized in Table 2. Overall, MAE values were significantly different from zero and ranged from 3.1° to 5.9°.

TABLE 2. Mean error (ME) and mean absolute error (MAE) between the MS and MB protocols in the estimation of the gait features. 95% CI: 95% confidence interval.

Results for ICC(2,k) and R for both the MS and MB protocols are reported in Table 3. Both MS and MB measurements revealed excellent reliability for K1, K2, K3, K5, and H3 (ICC = 0.90-0.94), while for A3 and A5 both protocols showed good reliability (ICC = 0.80-0.88). The correlation between MB and MS kinematics was very strong for all knee and hip gait features (R ≥ 0.85) and strong for the ankle features (R = 0.66).

TABLE 3. Comparison of estimates from the MS and MB protocols for each gait feature. MS ICC(2,k): intraclass correlation for the MS protocol; MB ICC(2,k): intraclass correlation for the MB protocol; R: Spearman's correlation coefficient.

An ensemble view of the normalized joint kinematic curves, averaged over trials and subjects, is reported in Fig. 12.

FIGURE 12. Sagittal lower limb joint kinematics (hip, knee, and ankle) averaged over subjects and trials (average: dashed lines; SD: shaded area; red = MB system; blue = MS system).

SECTION IV.

Discussion

The aim of the study was to present and evaluate the accuracy and reliability of a markerless clinical gait analysis protocol based on the use of a single RGB-depth camera for estimating spatial-temporal parameters and sagittal lower limb joint kinematics in patients with cerebral palsy. The protocol was specifically devised to provide a quantitative tool that could be easily implemented for screening and monitoring disease-related motor progression in patients with CP.

The proposed protocol includes several fundamental improvements with respect to previous research [20]. First, an automatic thresholding segmentation algorithm was proposed that does not require a homogeneous background, thus improving the clinical applicability and portability of the system. Second, potential issues associated with left-right confusion [46] and with skeleton tracking in the presence of foreground-background segment overlap [20] were addressed by introducing a robust separation approach that relies on the maximization of the between-class variance of the depth data. Third, the analysis of the concurrent clinical validity of the method was not limited to the knee kinematics as in [20], but was extended to the ankle and hip kinematics and to the spatial-temporal parameters. The abovementioned improvements, besides increasing the robustness of the protocol to variations in the experimental conditions, also contributed to an increase of about 35% in the accuracy of the knee joint gait features [20].

It is worth noting that the method for the detection of the gait events was specifically conceived to take into account the different types of foot contact normally observed in CP patients (i.e., heel, flat-foot, and toe-ground contacts). The proposed method relies on the orientation of the foot model with respect to the ground, and it represents a novelty with respect to other studies based on the 3D coordinates of the ankle joint center only, which neglect the foot contact mechanism [14], [19], [47], [48], [49], [50], [51].

A. Spatial-Temporal Parameters

The proposed MS protocol showed a very good accuracy, compliant with clinical requirements [52], [53], with MAE values equal to 1.2 cm for step length, 20 ms for stride duration, 2.5 cm for stride length, and 0.02 m/s for gait speed. These errors, found on CP patients, were comparable to those reported in previous single-camera studies obtained on healthy subjects [14], [17], [19], [22], [50], [51].

To the best of the authors' knowledge, this is the first study specifically validating the spatio-temporal parameters in children with CP against an MB protocol. In the literature, only a few single-camera methods have been validated, on post-stroke [49], [54] and parkinsonian patients [47], and they showed lower performance. In particular, Ferraris et al. [54] and Cimolin et al. [47] assessed the errors associated with the estimation of the spatio-temporal parameters using the Kinect v2 body tracking SDK on eleven post-stroke and ten parkinsonian subjects, respectively. Both studies found ME values equal to 0.02 m/s for gait speed and 2 cm for step length, consistently larger than those found with the proposed MS protocol (0.01 m/s for gait speed and 0.06 cm for step length). In a recent study, Lonini et al. [49] evaluated the performance of the DeepLabCut software for the analysis of the gait of ten post-stroke patients using a single RGB camera, reporting a high error variability for gait speed (± 0.11 m/s in terms of ME).

B. Lower-Limb Joint Kinematics

When comparing the 2D joint kinematics estimated by the proposed MS method against the 3D joint kinematics provided by the reference MB protocol, it is convenient to separate the effects associated with the use of different anatomical axis definitions and angular conventions from the actual estimation errors [42]. While the adoption of different anatomical axis definitions mainly results in an offset between curves, errors in the reconstruction of the joint center trajectories affect the waveform similarity, which can be quantified by the RMSE after offset removal.

The average angular offset values were 8° for the ankle, 6° for the knee, and 7° for the hip joint. The ankle offset can be partially explained considering that the foot antero-posterior axis in the MS protocol is computed as the principal axis of the best-fitting inertial ellipsoid, whereas in the MB protocol it is computed from the positions of the markers attached to the second metatarsal joint and the calcaneus. Similarly, the offset at the knee joint can be ascribed to the different HJC definitions implemented in the MS and MB protocols: while in the MS protocol the HJC coincides with the position of the GT, in the MB protocol it corresponds to the geometrical center of the acetabulum and is determined based on anthropometric regression equations [2]. The offset at the hip joint is associated with the fact that, in the MS protocol, the pelvis inclination is assumed to be constant during the gait cycle and coincident with the pelvic tilt measured during the static upright standing acquisition.

In terms of waveform similarity, the most accurate joint angle was obtained for the knee (RMSE = 3.2°), followed by the hip (RMSE = 3.5°) and the ankle (RMSE = 4.5°). It is important to highlight that, from a clinical perspective, errors between 2° and 5° are likely to be regarded as reasonable but may require consideration in data interpretation [55]. The larger errors affecting the ankle kinematics are mainly due to the auto-exposure of the camera, which can cause blurred images of the foot and of the distal part of the shank during fast movements such as those occurring in the swing phase.

In recent years, several single-camera MS methods have been proposed for gait analysis; however, in many cases a direct comparative evaluation with the proposed method was not possible because: (i) the kinematic outputs were not validated against a clinically accepted gold standard [56], [57], [58], [59]; (ii) the method performance was only validated in terms of accuracy in tracking the joint centers [46], [60]; (iii) the objective was to classify motor activities or to detect gait abnormalities [54], [61], [62], [63], [64], [65].

To the best of the authors' knowledge, the only MS study involving CP children was presented by Nguyen et al. [46], who evaluated the concurrent validity of the built-in body tracking SDK (Kinect v2) against an MB gait protocol on 10 CP children (GMFCS I-II) based on a frontal view. However, the reported errors were large for all the joints (RMSE = 11.2° for the hip, 10.3° for the knee, and 7.5° for the ankle).

In addition, a few MS studies have been applied and evaluated only on normal gait [17], [19], [66].

Yeung et al. [66] investigated the effect of five camera viewing angles on the kinematic curves estimated on healthy subjects using the body tracking SDK of the Kinect v2. They found that the Kinect v2 performed better at a frontal camera viewing angle, showing an RMSE of 8° for the hip flexion/extension angle, 11.4° for the sagittal knee angle, and 17.4° for the ankle plantar/dorsiflexion angle.

Yamamoto et al. [17] tested the performance of OpenPose on healthy subjects, reporting comparable results in terms of reliability for the knee and hip kinematics (ICC ranging from 0.60 to 0.98) but very poor reliability for the ankle kinematics, with an ICC of 0.1 for the ankle maximum dorsiflexion during the stance and swing phases, in contrast to the ICC of 0.66 obtained with our MS protocol.

Castelli et al. [19] reported an RMSE of 4.8° for the hip, 3.6° for the knee, and 3° for the ankle on 10 healthy adults, values comparable with those obtained using our method on CP children.

Interestingly, the proposed MS method provides a level of accuracy similar to that obtained by popular observation-based clinical gait assessment tools, such as the Salford Gait Tool, which is based on the manual identification of the anatomical landmarks on each recorded image [67].

In particular, Larsen et al. [67] analyzed the accuracy of the Salford Gait Tool against an MB protocol on 10 adult CP patients. They reported comparable errors in terms of ME on some key gait features, showing that the proposed method achieves the performance of observation-based clinical gait assessment with less effort required from the clinicians, since our protocol includes manual intervention exclusively for calibrating the three sets of templates.

SECTION V.

Conclusion

The proposed MS protocol was designed with its usability in clinical settings, in terms of set-up and cost, as a priority. For this reason, it was decided to use a single consumer-grade RGB-depth camera. Clearly, this choice inevitably limited the kinematic analysis to the joint movement in the sagittal plane and hence to the description of the flexion-extension lower limb joint angles.

Furthermore, the movement of the subject was reconstructed based on a 2D multi-segmental model defined from a single 2D RGB image, as the information provided by the depth sensor was only used to extract the foreground body segments. However, the projection of 3D human body motion onto a 2D space necessarily leads to errors and ambiguities, which could be only partially compensated for by the proposed multiple anatomical calibration.

Another critical factor was the quality of the images recorded by the specific RGB-depth camera. In fact, due to the automatic exposure time implemented by the Kinect v2, the recorded images were blurred when capturing fast-moving body parts, which negatively affected the results of the template/mask matching. The latter problem could easily be solved by selecting a camera that allows control of the exposure parameters.

Finally, it should be acknowledged that the proposed protocol is not fully automatic, as it requires a preliminary identification of the external anatomical landmarks for the subject-specific model definition. Nonetheless, it is important to consider that gait analysis in CP patients is generally preceded by a clinical examination during which the clinicians assess the range of joint motion and spasticity, and may easily perform the identification of a few anatomical landmarks.

In conclusion, the present study demonstrated the technical validity of an MS single-camera protocol for clinical gait analysis in the CP population. The results showed good accuracy in the joint kinematics estimation and good-to-excellent reliability for the extraction of a complete set of clinically relevant key gait features.

NOTE

Open Access provided by 'Politecnico di Torino' within the CRUI CARE Agreement

Appendix

In this appendix, the procedure followed to calculate the conversion factor (m/pixel) for the computation of the spatial-temporal parameters is presented.

To convert the spatial parameters to meters, it is necessary to determine the pixel-to-meter conversion ratio. It should be noted that this conversion factor varies with the distance from the camera: the greater the distance, the greater the conversion factor. To determine the conversion factor, a series of preliminary acquisitions was performed by positioning an object of known size at various distances from the camera. For each distance, the conversion factor was calculated as:

$$Conversion\ factor\ (m/pixel) = \frac{Size\ (m)}{Size\ (pixel)}$$

The conversion factor values were then fitted with a linear model, so that each camera distance can be associated with its corresponding conversion factor (a minimal sketch is given below).
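The following MATLAB sketch illustrates the fit and its application; the calibration values are placeholders, not the measurements of the study:

    % Distance-dependent pixel-to-meter conversion via linear fit.
    dist   = [2.0 2.5 3.0 3.5];            % object distances in m (placeholder)
    factor = [2.1 2.6 3.1 3.6] * 1e-3;     % measured m/pixel (placeholder)

    p    = polyfit(dist, factor, 1);       % linear model: factor = p(1)*d + p(2)
    f_at = @(d) polyval(p, d);             % conversion factor at distance d

    strideLen_m = strideLen_px * f_at(subjDist);   % apply to a length in pixels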

The spatial-temporal parameters in meters were finally calculated by applying the conversion factor to the corresponding values in pixels.
