B-Spline Modeling of Inertial Measurements for Evaluating Stroke Rehabilitation Effectiveness

Patients who experience upper-limb paralysis after stroke require continual rehabilitation. Rehabilitation must be evaluated for appropriate treatment adjustment; such evaluation can be performed using inertial measurement units (IMUs) instead of standard scales or subjective evaluations. However, IMUs produce large quantities of discretized data, and using these data directly is challenging. In this study, B-splines were used to estimate IMU trajectory data for objective evaluations of hand function and stability by using machine learning classifiers and mathematical indices. IMU trajectory data from a 2018 study on upper-limb rehabilitation were used to validate the proposed method. Features extracted from ${B}$ -spline trajectories could be used to classify individuals in the 2018 study with high accuracy, and the proposed indices revealed differences between these groups. Compared with conventional rehabilitation evaluation methods, the proposed method is more objective and effective.


B-Spline Modeling of Inertial Measurements for Evaluating Stroke Rehabilitation Effectiveness
Yi-Ting Hwang , Yu-Qian Tung, Chun-Shu Chen , and Bor-Shing Lin , Senior Member, IEEE Abstract-Patients who experience upper-limb paralysis after stroke require continual rehabilitation.Rehabilitation must be evaluated for appropriate treatment adjustment; such evaluation can be performed using inertial measurement units (IMUs) instead of standard scales or subjective evaluations.However, IMUs produce large quantities of discretized data, and using these data directly is challenging.In this study, B-splines were used to estimate IMU trajectory data for objective evaluations of hand function and stability by using machine learning classifiers and mathematical indices.IMU trajectory data from a 2018 study on upper-limb rehabilitation were used to validate the proposed method.Features extracted from B-spline trajectories could be used to classify individuals in the 2018 study with high accuracy, and the proposed indices revealed differences between these groups.Compared with conventional rehabilitation evaluation methods, the proposed method is more objective and effective.

I. INTRODUCTION
S TROKE can occur when a blood vessel in the brain clogs or ruptures and can cause cerebral ischemia, cerebral infarction, and even death.In 2020, one in six deaths attributed to cardiovascular disease was associated with stroke [1].
Stroke is also a leading cause of long-term disability; more than half of stroke survivors aged 65 years or older have reduced mobility [2].Only two-thirds of individuals who have experienced stroke can walk independently, and less than half of those who lose basic upper-limb function regain it after 12 months [3], [4].Such patients require upper-limb rehabilitation to restore their ability to complete daily tasks, such as showering, eating, and dressing, without assistance.In rehabilitation, a therapist selects individualized treatments and tasks, such as brick stacking, card flipping, and biking, in accordance with the needs of the patient [6].Patients must perform these tasks repeatedly to regain the motor function of an impaired body part.However, patients might develop negative attitudes toward this repetitive treatment, limiting the effectiveness of rehabilitation [5].To enhance patient enthusiasm for rehabilitation, [7], [8] have proposed computerbased systems for poststroke rehabilitation at home.Moreover, numerous virtual reality (VR)-based upper-limb systems have been designed to increase patient motivation for rehabilitation [9], [10], [11], [12], [13], [14].Comprehensive reviews of these systems are provided in [15] and [16].
Therapists evaluate the effectiveness of rehabilitation by using scale-based tests, such as the Fugl-Meyer test, Action Research Arm Test, Box and Block Test, Jebsen-Taylor hand function test, and Brunnstrom stage test [17].In computer-or VR-based rehabilitation systems, range of motion, movement speed and duration, hit rate, and subjective scale parameters may be used to assess improvement [9], [10], [11], [12], [13], [14].The limitation of such indices is that they can only be used to perform subjective evaluations with a final outcome; individualized treatment plans cannot be designed if no detailed movement information is provided [18].
In rehabilitation-oriented games, digital systems with sensors are often used to record patient movements.In [19], a wearable system with several embedded inertial measurement units (IMUs) was designed to detect complex trunk and shoulder movements and to visualize these movements on a smart device.In [13], a system with a motion tracking device (MTD) was designed to track movement trajectories and used by patients playing a VR game.In [20], a fullbody sensor system with 14 IMUs was used to capture position and orientation data for the body segments of four patients with stroke; these data were used to derive joint angles, which were in turn used to evaluate rehabilitation progress.Descriptive statistics were used to summarize the data collected using the aforementioned sensor system.In [14], two Nintendo Wiimotes (wireless movement sensors) were used to track the movements of patients with stroke who were undergoing rehabilitation; these movements were mapped onto a three-dimensional body model that could be viewed on a computer screen.A modified wearable inertial sensing system that comprised eight IMUs with triaxial accelerometers and gyroscopes was developed to compute spatiotemporal kinematic metrics for assessing patients' upperlimb movements after stroke [21].In [22], the associations between kinematic variables obtained from four sensors and clinical assessments were investigated.In [23], a seven-camera Vicon system was used to capture position data for body segments at a sampling rate of 120 Hz; these data were then used to calculate upper-body angles.The participants in [23] were asked to repeat a tasks multiple times, and the trial-totrial variability of each kinematic parameter was computed.The performance of the the system used was evaluated using only subjective scale tests, such as the Fugl-Meyer test.The data collected by the IMUs were used only for visualization or computing the maximum patient movement angles during the evaluation process.
The trajectories of accelerometer and gyroscope signals obtained by IMUs during tasks provide crucial information regarding the movement ability of patients.When a simple task, such as forward movement, is executed, the IMU trajectory should be smooth.The smoothness of a trajectory is a potential stability index for assessing rehabilitation effectiveness.However, accelerometer and gyroscope data are obtained from signals transmitted by multiple devices.When a server cannot receive and process data instantly, the data spikes, packet loss, and data gaps may occur [24].In [25], a functional data technique was used to remove noise, and IMU trajectory coefficients determined using this technique were useful for characterizing the motor impairment levels of patients with stroke.In the present study, a functional data technique was used to fit IMU data and obtain movement trajectories.Evaluation indices for movement stability were developed on the basis of these trajectories for objectively assessing the improvement in the motor abilities of patients with stroke after rehabilitation.The data collected in [13] were used to demonstrate the feasibility of these indices.The proposed indices are simple and can be calculated from IMU data.Assessing the improvement in motor ability after rehabilitation is crucial, and such assessment is currently performed using only subjective scale-based instruments.Future systems can use the indices proposed in this paper for objective and improved evaluations of the effectiveness of rehabilitation.

A. Modeling IMU Trajectories
A wearable inertial sensing system often contains an accelerometer, a gyroscope, or a magnetometer.During a task, signals from the accelerometer, gyroscope, or magnetometer are collected by a microprocessor.Under normal conditions, these signals have a smooth trajectory over time, and their rate of change is small.Thus, the rate of change of an IMU signal over the entire time interval for a task is a potential index for assessing the functional ability of a patient.Because wearable inertial measurement systems provides only discrete data, extreme raw data values can strongly affect the obtained rate of change of a signal.A functional model can be used to fit the raw signal and determine its rate of change.
Observed signals are a function of time, and the number of signals for a task depends on the sampling rate.A linear model for determining a parametric function for IMU signals is described in the following text.Let y i , i = 1, . . ., N , denote a sample trajectory collected from an IMU; let t i , i = 1, . . ., N , denote the corresponding recording time; and let N denote the number of recorded observations.A functional model for fitting the signal trajectory for a variable can be expressed as follows: where f (•) is a prespecified smoothing function and ϵ i denotes independent random variables with a mean of 0. Given a class of basis functions {φ j (•), j = 1, • • • , K }, the smoothing function can be expressed as follows: where K is a prespecified number of bases and d represents the degree of the polynomial function.To improve fit, the time interval can be divided into K + 1 disjoint intervals of τ k , with k = 1, . . ., K , where a < τ The terms a and b are the lower and upper time bounds, respectively; these parameters are known as inner knots.Moreover, boundary knots can be selected such that f (t) is differentiable near the boundary.Let the knots be expressed as follows: where The other knots (ξ 1 , . . ., ξ d and ξ d+K +3 , . . ., ξ 2d+K +2 ) are selected arbitrarily but are often assumed to be equal to the boundary knots.A B-spline basis function was selected in this study.For d > 0, a B-spline basis function can be defined recursively as follows: where k = 1, . . ., and The spline fits are robust with respect to d. Cubic polynomials (d = 3) are the common standard because the resulting curve appears perfectly smooth [26].According to (2), the rate of change of a signal over time is d f (t)/dt.Thus, we propose two metrics based on the rate of signal change over (a, b) for evaluating the stability of the trajectory; these metrics are expressed as and

B. Example of a VR Upper-Limb Rehabilitation System
Data regarding a VR rehabilitation game from the study of Lin et al. [13] were used to test whether the proposed method for IMU trajectory modeling could be used to evaluate rehabilitation.An experiment was conducted at Chang Gung Memorial Hospital in Taoyuan, Taiwan.Eighteen patients with stroke were enrolled in the study.Prior to the start of the experiment, written consent was obtained in accordance with the Declaration of Helsinki.The Institutional Review Board of Chang Gung Medical Foundation approved the experiment (IRB No. 102-3680B).
The considered system comprised an MTD, an electroencephalography (EEG) device, and a VR game that was designed by psychiatrists and therapists to increase patient interest and improve patient motion.A screenshot during gameplay is presented in Fig. 1.
The MTD required approximately 1 min for calibration, and each patient was asked to hold the MTD still during this time until the calibration process is complete.They then had 5 s to perform their best pronation and supination movements with the affected forearm, enabling the VR game to place game elements in accordance with the patients' maximum movement range.In each game, the patient first had 25 s to move the MTD backward and forward to load a shell into a cannon; the game proceeded to aiming when the cannon had been loaded.After loading, the cannon would fire automatically 10 s later.During this time, the patient must rotate the MTD to aim the cannon at a ship and maintain this position until the cannon fired.Finally, the patient was given 5 s to rest before the next round.If a patient's mean hit rate exceeded 80%, the difficulty of the game is automatically increased; that is, the angle at which the ship might appear and thus the maximum range of motion required for aiming increased.The difficulty would not decrease, regardless of patient accuracy.EEG data were used to enhance the patient's attention; if the patient's concentration waned, the system displayed a stimulus to increase patient focus.Additional details regarding the system are presented in [13].
The experiment comprised three equally sized groups: a control group (A) that participated in conventional rehabilitation, a group (B) that played the VR game, and a group (C) that played the VR game and used the EEG device.The EEG data were used only for monitoring and ensuring patient attention.Besides conventional rehabilitation, every week for 4 weeks, the patients in groups B and C performed three game-based rehabilitation experiments that lasted 35 min; each experiment comprised two 15-min play sessions that interrupted by a 5-min rest period.The patients participated in a total of 24 rehabilitation sessions, each of which comprised up to 22 games lasting a maximum of 40 s.The total score, hit rate, number of games, and maximum wrist turning angle were recorded for each patient during each session.The MTD could track forearm pronation and supination trajectories and angles on the basis of acceleration, angular velocity, and magnetic field strength measurements collected by embedded triaxial accelerometers, gyroscopes, and magnetometers, respectively.Data were recorded every 1/20 s in the x-direction, y-direction, and z-direction.Nine trajectories were obtained for each game.In this paper, ACX, ACY, and ACZ denote the accelerometer measurements; GRX, GRY, and GRZ denote the gyroscope measurements; and MAGX, MAGY, and MAGZ denote the magnetic field measurements.A sample acceleration trajectory in the z-direction for a patient playing the game is displayed in Fig. 2. Because the objective of the present study was to identify new metrics for IMU trajectory data, the patients in group A were not included in the current analysis.

C. Statistical Analysis
The duration for each game was not fixed; thus, the time interval of each game was standardized to [0, 1] as follow.A new time variable was defined as follows: To combine the data for each session, the MTD measurements were normalized as follows: Let y i , i = 1, . . ., N , denote the set of measurements for one game.The normalization equation is given in (9): where ȳ is the mean and s y is the standard deviation of the set.Fig. 2   basis function was selected in this study under the assumption that d = 3 if only three inner knots existed (τ 1 = 0.25, τ 2 = 0.50, and τ 3 = 0.75; that is, K = 3).When d = 3, the resulting fitting curve had seven estimated coefficients: ĉ j , j = 1, 2, . . ., 7.
Patients undergoing rehabilitation typically perform the same motion numerous times during each session and attend multiple rehabilitation sessions.Thus, the number of rehabilitation sessions was set to R, and the number of patients was set to n. N ir denoted the number of games for the r th session and ith patient, where r = 1, 2, . . ., R, and i = 1, 2, . . ., n.Because (4) is a recursive formula, the estimated curve had no closed form.To combine data for each subround, the time intervals (a, b) were partitioned into m equal subintervals of h = (b − a)/m, which were denoted as s 0 = a, s 1 = a + h, . .., s m = b.Here, firk (s t ) denotes the fitted curve for the kth game in the r th session for the ith patient, where t = 0, 1, 2, . . ., m, k = 1, 2, . . ., N ir , r = 1, 2, . . ., R, and i = 1, 2, . . ., n.This curve was derived through replacement of B d j (x), j = 1, 2, • • • , K + d + 1, and the estimated coefficients in (2).Because the number of games varied between sessions and patients, the fitted value for the r th rehabilitation session was defined as the average of all repeated motions during the session as follows: where t = 0, 1, 2, . . ., m, r = 1, 2, . . .R, and i = 1, 2, . . ., n.
In this study, we assumed that m = 100.The numbers of rehabilitation sessions and patients were R = 24 and n = 12, respectively.A fitted curve for the normalized z-direction acceleration trajectory for a game is displayed in Fig. 2 (solid line); the fitted z-direction acceleration for the games in a session is red, and the corresponding average curve for the session is black.Although the curves vary between games, the average fitted curve is smooth and follows the line y = 0. Fig. 4 presents the average fitted curves for all 24 sessions for Fig. 3. Trajectories for the z-direction acceleration for a session.Fitted curves for the games are in red, and the average session trajectory is in black.
groups B and C. The curves for group C have less variation than do those for group B.
The fitted curve in Fig. 2 trends toward 1; to avoid an excessively strong boundary effect, alternative estimates of the rates of signal change in ( 6) and ( 7) were determined as follows: and where r = 1, 2, . . ., 24 and i = 1, 2, . . ., 12.For simplicity, the four indices expressed in ( 11) -( 14) are denoted as I, II, III, and IV, respectively.Nine variables were recorded, resulting in 36 stability indices.

D. Evaluation
Three aspects of the feasibility of the aforementioned indices for rehabilitation evaluation were considered.
First, we evaluated whether the fitted curves of groups B and C differed by inputting the average of the estimated seven coefficients for each session into classification algorithms, which were used to classify the patients into groups B and C. For a given motion replication k in session r for the ith patient, the vector of estimated coefficients was denoted as ĉikr .The vector of the average of the estimated coefficients for all motions in a session was expressed as follows: ĉikr , r = 1, 2, . . ., 24, i = 1, . . ., 12. ( 15) Each session was associated with nine variables and thus 63 features.Moreover, each of the 12 patients participated in 24 sessions, which resulted in 288 observations.Five classifiers were used, namely, the k-nearest neighbor (KNN), support vector machine (SVM), decision tree (DT), random forest (RF), and naive Bayes (NB) classifiers.Regression splines were applied the spline package in R with the default hyperparameters.The Caret package was also used, and this package includes various classification methods and training functions.A five-fold cross-validation scheme was used for training.For the KNN classifier, the tuneGrid function in the Caret package was selected for automatically determining the key parameters for k values of 1-30.For the SVM classifier, a radial basis function kernel was used, and total number of unique combinations was set to ten for the grid search [27].For the DT classifier, the cp value in tuneGrid was determined by searching from 0 to 0.05 in steps of 0.005.For the NB classifier, the default turning parameters for rf, usekernel, and adjust were used.Classification performance was assessed in terms of accuracy, specificity, and sensitivity.In this paper, sensitivity indicates the proportion of patients in group C that were correctly classified as being in group C, and specificity refers to the proportion of patients in group B that were correctly classified as being in group B. The final performance in terms of the aforementioned indices was the average value over 30 repetitions.
Second, the overall stability for 24 sessions was evaluated as follows: ( The stability of groups B and C was compared, and the Wilcoxon test was used to determine whether the overall stability of the two groups was significantly different.Third, the progress of the rehabilitation (long-term effectiveness) was examined.Trajectories were expected to stabilize as the number of sessions increased.A linear mixed model was used to examine whether the stability declined as the number of sessions increased.This model included fixed effects and random effects.The fixed effects were those of group (GP), session (SR), and group-session (GP * SR) interactions, and the two random effects were the intercept and slope of the trajectories.The Wald test was performed to evaluate the significance of the effects.When group-session interactions were significant, the rate of change in signal trajectory stability over sessions was significantly different between groups B and C. If the corresponding estimated coefficient was negative, the stability for group C declined faster than that for group B. The four proposed indices and original variables, including the total score, hit rate, number of games, and maximum wrist angle, were subsequently used to assess long-term rehabilitation effectiveness.

A. Classification
A spline function comprises coefficients and a basis.The estimated coefficients provide information regarding trajectory characteristics.To demonstrate the effectiveness of the proposed stability indices, we examined whether the trajectory characteristics differed between groups B and C. Features were defined as the average of estimates of the coefficients over the N ir games in a session.The results of a two-independent-samples T -test (Table I) revealed the variables that differed significantly between groups B and C.Among the nine variables, ACZ and MAGZ had the highest significance, whereas ACY, GRY, and MAGX had the lowest significance.

TABLE I SIGNIFICANT DIFFERENCES BETWEEN GROUPS B AND C (IDENTIFIED ON THE BASIS OF A TWO-INDEPENDENT-SAMPLES T -TEST) TABLE II CLASSIFICATION PERFORMANCE TABLE III COMPARISON OF RESULTS BETWEEN GROUPS B AND C
Table II presents the classification results based on the average coefficients.The accuracy, specificity, and sensitivity of SVM and RF were higher than 90%; specifically, the sensitivity of SVM was 98%, and the specificity of the RF method was 93%.The KNN method had marginally lower accuracy than did the aforementioned two methods, and the DT and NB methods exhibited low performance.The accurate classification results obtained using the SVM and RF methods indicated that the characteristics of the fitted curves of groups B and C differed.

B. Overall Evaluations
The Wilcoxon test was used to compute whether the average values of the total score, hit rate, number of games, and maximum wrist angle over the 24 sessions for each patient differed between the groups (Table III).The number of games for group B was slightly higher than that for group C; this effect was marginally significant.However, although the total score and hit rate were slightly higher in group C, these effects were nonsignificant.
The p values obtained in the Wilcoxon text are displayed in a heat map in Fig. (5); the rows in this figure represent variables, the columns represent the four stability indices,

TABLE IV STATISTICS OF THE FOUR INDICES FOR ACZ AND GRX
and the colors indicate the significance level ( p < 0.05 was considered significant).ACZ had the most significant indices, followed by AC and GRX.In particular, all four indices derived from ACZ were significant.Moreover, all indices except for index II for AC and GRX were significant at the 0.05 level.Finally, the differences between groups B and C in the indices derived from ACX, GRZ, and MAGY were nonsignificant at p = 0.30.
Table IV provides the statistics of the four indices for ACZ, AC, and GRX.The median, mean, and standard deviation of these indices were significantly higher for the group B patients than for the group C patients.Moreover, the median, mean, and standard deviation of indices I and II were greater than those of indices III and IV.Although the median and mean of index II for group B were twice those for group C, the variability in index II due to GRX was larger; thus, this result was nonsignificant.

C. Progressive Evaluation
The total score over the 24 sessions for the patients in groups B and C are displayed in Fig. 6, in which the x-axis indicates the number of sessions.The scores increased as the number of sessions increased; however, the difference between groups B and C was small.
The total score estimates according to the linear mixed model are presented in Table V.In this table, GP is a binary variable for which 1 indicates group C and 0 indicates group

TABLE V ESTIMATES OF TOTAL SCORE FROM LINEAR MIXED MODEL
B, and SR denotes the number of sessions.The interaction between SR and GP (SR * GP) was not significant, which indicated that the total score over time was not significantly different between groups B and C.However, higher SR was significantly associated with higher scores.Similar results were observed for the hit rate and maximum wrist turning angle.The difference in the number of games for SR, GP, and SR * GP was not significant.
A heat map of the p values obtained for SR, GP, and SR * GP from the linear mixed model is displayed in Fig. (7); the rows in this figure represent variables, and the columns represent estimates.The labels I_SR, I_GP, and I_SR * GP indicate the estimates derived from SI i ; other elements are defined similarly.The colors indicate the significance level ( p < 0.05 was considered significant).The most significant estimates were observed for ACZ, followed by ACY and MAGZ.A significant SR * GP value indicated that the stability was affected by the group and the number of sessions.The obtained SI V i values indicated that SR * GP was significant for ACY, ACZ, and MAGZ.SR * GP was significant for ACZ, as indicated by the SI I I i values.Moreover, SR * GP was significant for MAGZ, as indicated by the SI i and SI I i values.SR was significant for all four stability indices for ACY and ACZ.After the first session, stability was not significantly different between the groups.
The most significant stability indices were derived from ACZ, and many indices derived from MAGZ were significant for SR * GP.Thus, detailed estimates for ACZ are provided in this paper.Table VI presents the model estimates for indices I and II.The average stability of index II for ACZ was not significantly different between the groups during the first session.SR * GP was significant, which indicated that the rate of change differed between groups B and C. The rate of  change for group B was 0.106, whereas that for group C was approximately 0.001; these values were derived by combining effects of SR and SR * GP.However, for index I, the rate of change in the average stability for group B significantly increased by 2.3%, and the difference in the rate of change between groups B and C was weakly significant ( p = 0.053).After the effects of SR and SR * GP were combined, the rate of change in the average stability derived from index I for group C was approximately -0.002; that is, the average stability remained constant as the number of sessions increased.
Table VII presents the model estimates for indices III and IV.For index IV, the rate of change in SR * GP differed significantly between groups B and C. The rate of change for group B was 0.071.After the effects of SR and SR * GP were combined, the rate of change for group C was approximately -0.004, which indicated that index IV was constant over the time interval.The average stability of index III for ACZ did not differ significantly during the first session.However, the rate of change in the average stability for group B increased significantly by 1.9%; the difference in the rate of change between groups B and C was nonsignificant.VIII and IX).The rate of change for group B with SR was not significantly different from 0; that is, the variability remained constant as SR increased.However, the significance of SR * GP indicated that the rate of change with SR for group C was different from 0; that is, as SR increased, the stability index and variability decreased.Finally, SR * GP was not significant for index II because of a large standard error.

IV. DISCUSSION
In this study, functional analysis was performed for effectively using sensor data to evaluate a stroke rehabilitation program.Groups in a stroke rehabilitation program were differentiated according to coefficients estimated using a fitted B-spline function.Four indices based on the variation of the B-spline function were adopted to evaluate the rehabilitation program.
Recruiting sufficient suitable participants is a major challenge in clinical research.Factors such as game score, number of games, hit rate, and maximum wrist turning angle in rehabilitation games can be easily measured but do not provide information on intermediate motions.Thus, these factors might be insufficient for evaluating the overall difference between experimental groups from a small sample, as in the study of Lin et al. [13].However, the indices proposed in this paper, which are based on intermediate data, reveal that, in the study by Lin et al., the patients in group C outperformed those in group B in terms of overall rehabilitation and progress during their rehabilitation program.
Rehabilitation programs include static actions, microactions, and intense actions.The accelerometer data collected by an IMU are most accurate for static actions.By contrast, the gyroscope data collected by an IMU enable the derivation of more accurate measures when microactions and intense actions are executed than when a static action is executed.Thus, filtering procedures are required for generating the IMU trajectory [21], [28], [29].The proposed spline smoothing techniques do not require data preprocessing to remove noise.The estimated spline function coefficients were found to be effective features for machine learning.In contrast to statistical parameters such as mean and standard deviation, the coefficients in functional data analysis are derived from a basis function that is fitted to the data, and these coefficients are less influenced by outliers than are such typical statistical parameters.A similar finding was reported by [25], who demonstrated that the coefficients derived from spline functions are excellent features for classifying patients with stroke by their level of motor impairment.Nevertheless, the maximum duration of each game in the study by Lin et al. was 10 s, and only three inner knots (K ) were selected in the present study to avoid data overfitting.When IMU trajectory tracking is longer, a higher value of K can be selected to increase modeling accuracy.
The trajectories derived from the acceleration and magnetic field strength in the z-direction were key features.The tilt angle of the MTD was equal to tan −1 ACY AC Z .A higher ACZ value resulted in a higher tilt angle, which indicated greater shaking of the MTD and inferior hand function.Similarly, a lower ACZ value indicated better hand function.
The trajectory displayed in Fig. 3 can be attributed to the data collection method.At the beginning of each game, the patient was focused and static; however, motor and concentration impairment might have resulted in the patient being unable to stop immediately when required.Therefore, indices III and IV proved slightly more effective for evaluating the performance of groups B and C than did indices I and II.Furthermore, because the rate of change can be negative, the absolute norm and squared norm were used to compute the indices.Among the four considered indices, index IV was the most effective at detecting differences between groups B and C; this index is commonly used because it has excellent mathematical properties.
Except for [13], most relevant studies have collected data using a cross-sectional method [21], [22], [23].In particular, in [23], four repetitions were performed, and standard deviation was used to examine trial-to-trail variability.Although follow-up research was conducted in [20], only descriptive statistics such as the average and standard deviation of joint range of motion were determined.The data used in the present study were collected using a sophisticated design; that is, patients were asked to play games in 24 sessions with similar conditions.The trajectories were expected to be similar after a patient became familiar with the game and achieved improved functioning.The aim of the present study was to devise a stability index for assessing improvement in motor ability during a rehabilitation program.Thus, the trajectories for each session were averaged to simplify the construction of the indices.In a future study, the difference between a trajectory and the averaged trajectory for all games in each session can be computed, and an instability index can be defined as the average of this difference.However, when the rehabilitation task is performed only once, such an instability index cannot be computed.
V. CONCLUSION Continual rehabilitation is required for the recovery of patients with upper-limb paralysis after stroke.An objective evaluation of rehabilitation progress must be performed to enable therapists to modify the program appropriately.Data collected by IMUs can indicate hand function but are discretized.Therefore, this paper proposes the use of B-splines for extracting the features of IMU trajectories to construct objective indices for evaluating rehabilitation effectiveness.
In the present study, the IMU trajectory data collected by Lin et al. using a VR upper-limb rehabilitation system were used with the proposed method to obtain not only feature information for classifiers but also stability indices for assessing the effectiveness of sensor-based rehabilitation systems.The best stability indices (indices III and IV) can be produced if the trajectory movement is only recorded for the target action.Compared with game-based or therapistevaluated scores, the proposed method is more objective and effective for evaluating rehabilitation progress.

Fig. 2 .
Fig.2.Acceleration trajectory in the z-direction for a game, where "•" denotes the raw trajectory and the solid line denotes the fitted trajectory.

Fig. 4 .
Fig. 4. Average z-direction acceleration trajectories for 24 sessions for (a) group B and (b) group C.

Fig. 5 .
Fig. 5. Heat map for the p values of the differences in the proposed indices between groups B and C in the Wilcoxon test.

Fig. 7 .
Fig. 7. Heat map of p values for indices obtained using a linear mixed model.