Learning Post-Stroke Gait Training Strategies by Modeling Patient-Therapist Interaction

For safe and effective robot-aided gait training, it is essential to incorporate the knowledge and expertise of physical therapists. Toward this goal, we directly learn from physical therapists’ demonstrations of manual gait assistance in stroke rehabilitation. Lower-limb kinematics of patients and assistive force applied by therapists to the patient’s leg are measured using a wearable sensing system which includes a custom-made force sensing array. The collected data is then used to characterize a therapist’s strategies in response to unique gait behaviors found within a patient’s gait. Preliminary analysis shows that knee extension and weight-shifting are the most important features that shape a therapist’s assistance strategies. These key features are then integrated into a virtual impedance model to predict the therapist’s assistive torque. This model benefits from a goal-directed attractor and representative features that allow intuitive characterization and estimation of a therapist’s assistance strategies. The resulting model is able to accurately capture high-level therapist behaviors over the course of a full training session (r2 = 0.92, RMSE = 0.23Nm) while still explaining some of the more nuanced behaviors contained in individual strides (r2 = 0.53, RMSE = 0.61Nm). This work provides a new approach to control wearable robotics in the sense of directly encoding the decision-making process of physical therapists into a safe human-robot interaction framework for gait rehabilitation.

mon gait dysfunctions in post-stroke individuals involve a combination of impaired muscle strength, coordination, and proprioception [2]. To help these patients regain their normal walking ability, manual gait therapy is currently provided by a physical therapist (PT) using motor learning principles [3]. This one-to-one physical therapy is effective but timeconsuming, costly, and labor-intensive.
To make gait therapy easily accessible and affordable for patients, lower-limb exoskeletons have been studied for post-stroke gait rehabilitation [4], [5]. Despite considerable progress in wearable exoskeletons, it remains a major challenge to design personalized robot control strategies for different patients. Previous studies have suggested that modeling and identifying human-human sensorimotor interactions can lead to the development of robots that physically interact and move with humans in an intuitive and efficient manner [6]. Physical rehabilitation is a form of human-human interaction in which the goal of the PT is to train patients to improve their motor performance. However, in the context of rehabilitation robotics, there is no widely accepted framework to learn from this human-human interaction [6]. This is due to a scarcity of studies that collect and model the interactions between PTs and patients to help identify the sensorimotor principles of such interactions. Galvez et al. [7] collected the interaction force and leg kinematics in treadmill-based training. They showed that different PTs applied significantly different forces, resulting in different leg kinematics. However, no specific strategy on how the forces contributed to the task performance or modeling of the assistance, was purposed. In [8], a backdrivable robot was used to measure foot kinematics, and assist in the manual stepping training of spinal cord injury patients. The robot's linear actuators were connected to the treadmill sidebars, and the apex of the robot was attached to the subject's ankle joint center through two linkages and a revolute joint, applying a 2-D force directly to the foot. The measured average trajectory of the apex in manual training was used as the desired trajectory of an adaptive impedance controller of the robot. Here, interaction forces and joint trajectories were not collected and the impedance gains were tuned on a step-by-step basis with an error-based learning law error. Therefore, it fails to capture the variation in the imposed trajectories by the PTs as well as the impedance gains within a gait cycle (different gait phases), which can lead to poor generalizability. In a more recent study, Fong et al. [9] proposed an impedance-based learning from demonstration (LfD) framework targeting foot-dropping assistance during treadmill-based gait therapy, using a robotic arm. Despite the This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ successful modeling and implementation, the experiments did not include the actual physical therapist-patient (PT-P) interaction to extract gait rehabilitation insights.
In this work, a custom-made wearable sensor system is developed to measure the interaction forces and leg kinematics during post-stroke gait training of hemiparetic patients with moderate weakness in their quadriceps muscles. The collected data is then used to characterize the patient's gait patterns and the corresponding assistive torques applied by the PTs. The results offer important insights into how gait rehabilitation principles, when used in tandem with clinical experience and patient gait observations, can lead to better timing and magnitude of assistance. To achieve this, an LfD-based approach to virtual impedance control is selected as the mechanism for reproducing PT assistance and incorporating these clinical insights.
Impedance controllers are a second-order dynamic model inspired by biological mechanisms for compliant interaction. It is a widely used approach for controlling robots that physically interact with humans due to its admittance of and stability in response to disturbances [10] which keeps interaction forces within a safe range. In robot-aided rehabilitation contexts, impedance control also provides assist-asneeded interaction which has been shown to promote motor learning and recovery [11]. Additionally, impedance admits disturbances (human interaction forces) instead of rejecting them, as in simpler direct position and velocity controllers, which can lead to discomfort and safety concerns in rehabilitation. Conversely, complex approaches for learning predictive torque control models (e.g., inverse reinforcement learning (IRL) and artificial neural network (ANN)) may suffer from issues of tractability, non-uniqueness, poor interpretability, or large amounts of data required for training. The latter issue is particularly relevant in rehabilitation settings since access to patients is often limited. Furthermore, learning impedance strategies over patient states allows us to learn an interpretable model with a physical intuition due to the correlation with a spring-mass damper system. Consequently, impedance provides a means for model verification and extraction of PT assistance insights in a way not immediately available in other approaches. Therefore, impedance affords modeling PT strategies while providing the aforementioned benefits for complexity, interpretability, physical intuition, safety, and accuracy of the learned model.
The contributions of this paper are summarized in Fig. 1 and are as follows: First, high-resolution data of actual over-ground gait training interactions is obtained through a custom-made wearable system, for identifying abnormal gait patterns and the therapists' assistance strategies. Next, an impedance learning algorithm with insightful feature selection and a goal-directed attractor definition is proposed as a means to reproduce therapist assistance in a way that integrates clinical insights into the control of lowerlimb exoskeletons. To the author's knowledge, no other work has collected such data in this setting to model PT decision-making for the reproduction of assistive torque in gait rehabilitation. Overview of the robot-aided gait rehabilitation framework presented in this paper. In (A), data was collected from gait training sessions and analyzed using insights from the clinical literature. This analysis informed a Gaussian mixture model (GMM) inspired impedance learning algorithm (B) that was used to reproduce observed PT torque behaviors by applying a Gaussian mixture regression (GMR) prediction algorithm (C). These reproduced torques were evaluated on a subset of patient states to test the ability of the algorithm to replicate PT behaviors when applied in the lower-limb exoskeleton during future work (D).

A. Data Collection System
To capture the interaction dynamics between the PT and the patient, we designed a wearable sensor system to record force exerted by the PT, knee joint kinematics, and ground reaction forces at the affected side.
During the gait therapy sessions, the PT touches and exerts force on different areas of the body. However, to stabilize the knee motion during weight-shifting and prevent knee buckling, the assistive forces are mostly focused on two areas: the upper and lower anterior knee (where the force sensors are placed in Fig. 2). Therefore, we designed two separate soft force sensor braces for each segment, intending to capture both the magnitude and distribution of the force in a non-intrusive manner. The distribution of the force is necessary to calculate the aggregate assistive torque. Each brace embodies a flexible pressure sensor array (3 × 6 and 2 × 5 for the upper and lower knee, respectively) consisting of air pressure sensors enclosed by air-filled silicon pads with an elastomeric pillar array ( Fig. 2-b), inspired by [12]. The force is measured by the increase in the air pressure caused by pillar compression. This design allows for a compact structure by introducing a high measured force-to-volume ratio and is shown to have a linear and repeatable pressure-force behavior [12]. Each sensor is calibrated separately with a linear model between measured force (N ) and sensor output (volts) (see supplementary materials-A).
The knee joint kinematics were captured using the BNO085 IMU sensor (SparkFun Electronics, CO). The Smart Shoes [13] were used to capture the ground reaction forces (GRFs) of the affected side in real-time (Fig. 2). All data was collected by a host microcontroller (Intel UP-Board, Intel Cop., CA), which along with a 12V battery and PCB boards was put  B. Experimental Protocol 1) Patient Recruitment: Post-stroke patients with hemiparetic gait, specifically with weakness in the quadriceps leading to knee instability, were recruited at Barrow Neurological Institute (BNI) in Phoenix, Arizona. Participants had to be able to ambulate with minimal assistance or less for up to 5 minutes with or without the use of a single-point cane and/or ankle foot orthosis (AFO). Participants with manual muscle testing (MMT) scores of knee flexion/extension less than 2+ were excluded from the study. Participants with modified Ashworth of hemiparetic lower extremity less than or equal to 1+, and with flexion contracture that cannot extend the knee beyond 10 • were also excluded from the study. These criteria was set to allow for recruiting patients with less impairment, so the PTs' assistance would focus on correcting certain gait abnormalities related to knee instability. Before enrollment, a series of physical assessments were performed by the PTs to ensure that all the criteria were met. The protocol of the study was reviewed and approved by the institutional review board at BNI (protocol number 19-500-271-70-19).
Initially seven patients were recruited, out of which four patients' data are used in this study. Data from two patients were discarded as the PT assistance was not significant due to their relatively functional and independent gait. Thus the inclusion/exclusion criteria was adjusted accordingly to recruit patients with higher impairment levels. Another patient's data collection was unsuccessful because of technical difficulties with the wearable sensor system during the gait training session. Anthropomorphic information and the knee and hip  Table I.
2) Experimental Procedure: The BNI team conducted participant enrollment after which a two-hour data collection session was scheduled with participants that consented. Each data collection session consisted of three sub-sessions: 1) no training (NT), 2) training by the first PT (PT1), and 3) training by the second PT (PT2). We collected two PTs' data in each session to study the similarity and differences in their gait training strategies.
At the beginning of each session, patients first put on the wearable sensors with the help of the PTs. If the patient wore an AFO, it was removed and replaced by an ace wrap to prevent foot drop. In the first NT sub-session, the patients walked by themselves for three minutes, with minimal assistance from the PT only when needed for safety. Minimal assistance in this context is defined as assisting with less than 25% of the weight support and balancing, through pelvis and upperbody assistance. After the first sub-session was completed, the patient rested for at least five minutes depending on patient readiness. Next, the training with the PT1 began, in which the PT, sitting on a wheeled stool, facilitated the upper knee motion of the paretic side to support weight-shifting and knee stability. These two sub-sessions also lasted for three minutes each and consisted of multiple 10-15 meter laps, at the end of which each patient would turn, and stay steady and straight for 10 seconds (to help with initializing IMUs and the force sensor). Real-time feedback for the quality of each gait cycle from PTs was collected (see Sec. III-A.5). After resting for at least five minutes, the other training sub-session was performed by PT2, similar to the previous sub-session. To assess if the data collection system interfered with patient kinematics or PT assistance, which would invalidate the data, both the patient and PT filled out a survey expressing their opinion on the training session and data collection system. None of the trials indicated significant interference from the system as seen by the responses found in supplementary materials-B

III. BIOMECHANICAL CHARACTERIZATION
In this section, using the collected data, including the forms and recorded videos, quantitative and qualitative analyses Fig. 3. The kinematics diagram shows the relevant kinematic variables for knee θ, shank α, and thigh β angle displacements. It also shows the sagittal force applied by the therapist F pt . The ground reaction force F g is applied at the foot (paretic side) and is used to calculate percent weight-shift w by scaling F g by the patient's weight F w .
are performed to characterize the patient's gait, along with facilitation strategies used by the PTs.

A. Initial Data Processing
Initially, all sensor data was collected on the host PC at different frequencies. In the post-processing, all data were synchronized using the common time frame in the host PC and re-sampled to 75 Hz. The following variables were extracted: 1) Gait Sub-Phases and Percentage: Using the four readings from the Smart Shoe, it is possible to segment the gait into six sub-phases: heel-strike (HS), loading-response (LR), midstance (MST), terminal-stance (TST) or heel-rise, pre-swing (PSW) or toe-off, and swing (SW). The first five sub-phases belong to the stance phase. This segmentation is based on insole sensor measurements, using a fuzzy logic algorithm developed in [14]. By incorporating these sub-phases, we can segment the data into individual gait cycles, and the stance time and stance phase percentage of each gait cycle can be extracted.
2) Lateral Weight-Shift: The sum of sensor readings obtained from the Smart Shoes gives the net force applied to the foot by the ground F g . Normalizing this value by the weight of each patient F w , the approximate percent of weight shifted on the paretic side w can be calculated at each time step.
3) Joint Kinematics: The thigh and shank kinematics are calculated using the net quaternion rotation q = qq −1 0 for each IMU where q is the current quaternion value and q 0 is the reference rotation established by having the PT place the patient's knee at full extension at the beginning of each lap. It was observed that a consistent reference q 0 was challenging to achieve and is estimated to produce an error of ±5 • . This may explain the lack of knee hyperextension in the observed joint trajectories as seen in Fig. 4. Sagittal thigh β and shank α angles are then recovered by applying q to a unit vector and projecting it onto the sagittal plane. With this information, the patient's knee angle θ can then be calculated as the difference between β and α as seen in Fig. 3. 4) Assistive Knee Torque: Although the PT might need to support the lower knee, in our data collection sessions there were very few instances where the PT was exerting force on the force sensor at the shank. Therefore, we only considered the sagittal forces on the thigh (F pt as shown in Fig. 3). This force was always applied in the depicted direction which results in positive torque. Using our force sensor, we were able to calculate both the magnitude and center of pressure (CoP) of the exerted force on the thigh. As shown in the supplementary materials-A, the effect of this force can be represented by a knee and ankle joint torque. These two torque components are coupled as they are the result of the same force F pt . For simplicity, we choose to learn the knee torque component τ = ℓ|F pt | only, as the ankle joint torque component can be calculated based on the learned torque τ , given kinematic parameters θ and α (Fig. 3) and shank length (see supplementary materials-A).
Note that the gripping and torsional forces can affect sensor readings but are treated as noise. Due to the low relative magnitude and different frequency, which is partially removed by filtering, gripping and torsional forces are considered to have a negligible impact on the final knee torque calculation. 5) Ratings: In each session, the PT facilitating the patient's lower limb rated the quality of their assistance during each gait cycle as "good," "ok" and "bad," while the other PT input the rating into the system by pushing the corresponding button on a detachable remote connected to the backpack. Only gait cycles rated as "good" (more than 70% of the cycles), were included in the data set for analysis and learning.

B. Manual Gait Training With Hemi-Paretic Patients
Patients with hemiparesis often present with a slow, asymmetrical gait pattern, which may be due to knee instability and inadequate weight-shifting. During the assessment, knee instability is often confirmed when weakness in hamstrings or quadriceps muscles and/or proprioceptive deficits are identified [15]. A hemiparetic gait includes knee hyperextension during initial loading due to the lack of control of the quadriceps muscles. At mid-stance, overall muscle weakness can cause increased hip and knee flexion, which may be observed as knee buckling. During terminal stance, knee hyperextension may occur if the hamstrings are too weak to counteract the quadriceps to slow down knee extension and the knee may quickly hyperextend.
Patients with hemiparesis may compensate for weakness at different joints during a gait cycle to improve stability. In the stance phase, two common strategies to prevent knee hyperextension or buckling are to limit stance time by decreasing the weight-shifting on the involved extremity or to maintain excessive knee flexion. These gait compensations often result in an asymmetrical gait pattern with decreased step length on the contralateral side and altered joint kinematics throughout the lower extremity [15], [16]. Over time, compensatory movement patterns may reinforce the abnormal tone and movement of the involved side. In summary, gait abnormalities related to knee instability on the paretic side can be categorized as: • Knee hyperflexion in mid-stance • Knee hyperextension in terminal stance • Late loading and early unloading (insufficient weightshift) from the paretic side to the non-paretic side (limiting stance time on the paretic side) These gait impairments often result in decreased step length on the non-paretic side, asymmetric gait, and altered lower-limb joint kinematics on both sides. It should be noted that a patient may not have all three of these features.
The stance phase is key to facilitating a symmetrical gait pattern since increased stance time on the involved side allows the patient time to take a normal step with the non-involved lower extremity. Functional stance is achieved when appropriate weight is shifted onto the involved lower extremity while stabilizing the knee, which may require manual support and facilitation of the quadriceps or hamstrings. Timing is critical, and facilitation is performed as needed to achieve proper joint kinematics during each phase of stance. Repetition of proper gait mechanics is needed for motor learning to occur, which is important as learning is required for neural adaptation [3]. On the other hand, active participation of the patients is essential to prevent over-reliance on the PT and automation [17].
Consequently, the main training strategies of the PTs would be to 1) facilitate the knee extension during mid-stance (specifically when reaching the maximum weight-acceptance) and 2) increase the stance time (weight-shifting period) on the paretic side to allow enough time for the non-paretic side to take a normal step. For the rest of this section, we aim to characterize the patient's gait patterns and gait training features described above through visualizations of the collected data.

C. Characterization of Patients' Gait
We use the MMT test results (in Table I) to characterize the impairment levels of the four patients. MMT scores range from 0 to 5, with 0 being no contractions (on the muscles), and 5 being full range of motion against significant resistance [18]. Therefore, we can conclude that P7 has the healthiest gait among other patients. P3 has the lowest scores on knee flexion and hip extension. Based on MMT results, we can group the patients into lower and higher muscular weakness (LMW and HMW) groups. As shown in Table I, P7 and P5 are in LMW, and P6 and P3 are in HMW. Figure 4 shows the knee angles and weight-shift of the patients, as well as the assistive torque for analyzing the PT strategies corresponding to abnormal gait patterns. This figure includes the data for P6 and P7, each representing one muscular weakness group. To provide a reference for comparison, We also included a knee angle pattern of over-ground walking of healthy individuals in their natural speed, by averaging over 50 subjects data using the public dataset in [19]. Overall, we did not observe any case of severe knee buckling and hyperextensions in the NT sessions, as our patients all had MMT scores >2+ and were evaluated by the PTs to be able to ambulate with minimal assistance or less. Nevertheless, for P7 we observe excessive knee flexion angle in early mid-stance and at maximum weight-acceptance (as labeled in the knee angle plot for P7 in Fig. 4), which can be associated with the weakness of knee extensors [16]. On the other hand, we observe small or no knee flexion when entering mid-stance for P6 (as labeled in the knee angle plot for P6 in Fig. 4), which is another compensation strategy to stabilize the knee with weakness of knee extensors at midstance [16]. Knee extension at heel-strike as a compensatory mechanism for the weakness of planter-flexor muscles [16] is also observed in P6.
All patients showed similar irregular GRF patterns. The first peak, which represents the total weight acceptance, happens relatively late at the mid-stance, and the second peak, which is expected to be observed at terminal-stance/pre-swing, is blended into the first one and decreased in magnitude, shown in Fig. 4 (second row). Similar patterns were also seen in previous studies of post-stroke patients [20]. This can be explained partly by the lack of proper heel-strike and push-off due to ankle dorsiflexion and plantarflexion weakness.

D. Gait Training (PT Strategy) Characterization
Here, we examine the outcome of each gait training session, highlighting the similarities and differences between the strategies employed by the PTs. According to the assistive torque plot in Fig. 4 (third row), generally lower magnitude torque is exerted for LMW patients, as expected. Peak torque is mostly located at 25% to 50% gait cycle, near where the maximum weight-acceptance in mid-stance takes place. This observation demonstrates an assistance strategy where the PT facilitates weight shifting onto the paretic side while stabilizing the knee to prevent buckling.
One goal of the PT is to increase stance time on the paretic side. As seen in Table II, the stance time is increased in the training sessions. This increase is less frequently observed for P7, which could be related to the low impairment level of this patient compared to the others. More specific outcomes can also be observed for other patients. For example, P7's excessive knee flexion in mid-stance is corrected in the training sessions. It should be noted that motor skill acquisition through manual gait training is gradual and requires multiple sessions for the patient's joint kinematics to have a significant shift towards healthier gait patterns.

IV. VIRTUAL IMPEDANCE LEARNING
The work thus far has provided a foundation for understanding and analytically evaluating how the PT provides assistance to address different types of gait behaviors. We will now present a low-dimensional model of PT decision-making driven by insight gained from the previous analysis. The intended result of the algorithm is to produce a predictive model, that follows principles of gait rehabilitation, faithfully captures high-level PT strategies, and provides a compliant and safe mode of control for lower-limb exoskeletons. The remainder of this section will first provide an overview of the insights and novel aspects incorporated in this algorithm. Then, a brief formulation of the algorithm will be given and followed by an analysis and results of its implementation.

A. Incorporating Insights
The analysis in Sec. III informs the proposed algorithm of rehabilitation insights by providing a quantitative description of impairment that aligns with the identified deterministic features of PT assistance; patient kinematics and weightshift. Instead of solely using knee angle as the descriptor of kinematics, we expand on this insight by first assuming that there are relatively small changes in ankle flexion and torso angle due to the ankle being constrained with an ace wrap and the torso being supported by a second therapist during data collection. Therefore, the shank α and thigh β angles are selected as the model's features that describe patient kinematics since they explicitly account for knee angle and contain partial information about hip and ankle kinematics. Weight-shift w is also selected as a feature due to the aforementioned significance in facilitating increased stance time, as seen in Table II, to prevent asymmetric gait patterns. Lastly, percent progression through the stance phase t is selected as the time-indexing feature to account for PT assistance being specific to different gait sub-phases and its invariance to different walking speeds. These features are selected due to their determinism in the observed assistance and they allow the proposed algorithm to identify classes of gait behaviors that enable characterization of unique assistive responses from the PT.
The next insight considered is that PT assistance is applied to intentionally move the patient into preferred states, as seen in Fig. 4, when comparing assisted and unassisted sessions. Also, we assert that the PT is not simply correcting the patient's current state, but the PT is instead facilitating the forward progression of the patient through the gait cycle. Given the fairly safe assumption that the PT's assistance objective was met during strides rated as "Good", we can define these preferred future states as the PT's goal state and as the attractor of the virtual impedance system.
One caveat of this definition for the attractor is that it prevents our model of PT behavior from describing torque components that are not directly determined by the displacement between the patient's current and future states. To address this, we consider that the PT may use strategies that generate forces that do not directly contribute to the patient's progress through the previously defined states. One such strategy can be described as proactively bracing the knee to prevent knee instability. These types of strategies are included in the algorithm by reproducing the unexplained assistance, described by the residual torque from our model of interaction, found during the optimization of the impedance gains.

B. Mathematical Formulation
In this paper, PT's assistive torques to the knee will be described by the following virtual impedance model: where K P t and K V t are the continuous time-varying stiffness and damping values, τ t is the torque applied by the PT around the patient's knee, x t = [w t , α t , β t ] T is the state of the virtual impedance system, and y t is the future-state attractor at a percent progression through the stance phase t.During reproduction, we add an auxiliary term r t to (1) as our estimate of the residual torque, which is discussed in Sec. IV-A.
The attractor y t is simply the state x t that will be observed in an additional t f percent stance phase in the future where y t = [w t+t f , α t+t f , β t+t f ] T . Here, t f is a design parameter manually tuned for each PT-P pair to optimize the performance objective of f (t f ) = 0.5(r 2 ) + 0.5(RMSPE) where r 2 and RMSPE are the coefficient of determination and root mean squared percent error, respectively, of the final torque prediction generated by this algorithm. Intuitively, the subscript t + t f describes the state that is t f percent gait phase in the future relative to the state at time t. The mean value for the time-offset was found to bet f = 28.25% with a standard deviation of σ t f = 10.9% Intuitively, this suggests that the attractive state of the virtual impedance system was found to be, on average, the patient state in the next sub-phase of the gait cycle.
An estimate of the observed assistance can be recovered once the time-varying prediction of residual torque r t , attractor y t , stiffness K P t and damping K V t gains are known. Therefore, the objectives of the learning step are to 1) create a gait pattern classifier such that a learned assistance strategy is uniquely defined in response to each class, 2) parameterize stiffness gains K P i , damping gains K V i , and residual torques K r i for each ith classification of gait patterns, and 3) learn a predictive model for y t . 1) Training: Given a set of demonstrations d = [t, x t , y t , τ t ], we provide two additional samples of previous states (x t−t p and x t−2t p ) uniformly spaced t p apart to improve the attractor prediction during reproduction. The training data then becomes ξ = [t, x t , x t−t p , x t−2t p , y t ] where we encode ξ into a Gaussian mixture model (GMM) as described in [21]. We use the Bayesian information criterion (BIC) [22] as the model selection criteria to specify the number of clusters k. The parameterized GMM, specified by the mixing coefficients π i , the mean vector µ i , and covariance matrix i , can then be used to estimate K P i , K V i and K r i by the following optimization: i and K V i are the stiffness and damping gains found for each i-th Gaussian component, µ y i is attractor's dimension along the GMM's center, p t,i is the posterior probability from the GMM encoding, and K P lb and K P ub are the lower and upper bounds of feasible stiffness values. During optimization, we find the mean residual torque for each cluster K r i by taking the weighted average of r t,i for every time-step: K r i = n t t=0 p t,i (r t,i ) where n t is the number of time steps.
2) Reproduction: Reproduction begins with an encoded GMM in the form N (µ i , i ) and K P i , K V i , and K r i vectors for i ∈ {1, 2, . . . , k}. Gaussian mixture regression (GMR) [23] is performed with the observed predictors ξ I t = [t, x t , x t−t p , x t−2t p ] to find the probabilityĥ t,i that a sample from ξ I t belongs to each Gaussian component. Then, µ y i , K P i , K V i and K r i are mixed according toĥ t,i by taking the weighted sum of each component as follows: whereŷ t is the predicted attractor and r t is the estimate of the unexplained assistance for each continuous time-step t.
The final estimate of the observed torqueτ t is recovered by substituting these parameters into (1) and then adding r t : C. Analysis This analysis will first evaluate the trained model on its ability to appropriately classify patient gait patterns and characterize PT impedance strategies. A hard clustering procedure, where samples are classified according to the Gaussian component with the highest posterior probability, will be performed on the testing data set. Qualitative analysis will then be used to determine if the GMM can adequately identify unique presentation of gait patterns. Stiffness gains are then compared between classes to see if the characterized impedance strategies align with what we would expect from a PT responding to different classes. For example, we would expect more aggressive (higher) stiffness gains during classes that characterize significant compensatory behaviors like knee hyperextension during mid-stance. More aggressive stiffness gains would then align with increased PT intervention to facilitate desired assistance outcomes. Here, we look at the unsigned magnitude of the impedance gains due to the ambiguity introduced by allowing K P i to be positive or negative such that K P i (y t − x t ) generates a positive torque. In place of analyzing all patient gait pattern classifications, which would be exceedingly lengthy, a more in-depth analysis of P6's session with PT2 will be provided as an example.
The algorithm's ability to characterize the observed assistance using the interaction model (1) is evaluated by analyzing the residual torque estimated during optimization K r i and the residual torque included in the reproduction r t . This will help identify the degree of reliance on the residual torque estimation during reproduction. To provide additional support for the validity of the trained model, the GMR's ability to predict y t will also be evaluated using the coefficient of determination r 2 and root mean squared error RMSE. Similarly, r 2 and RMSE are reported as a baseline for the predictive power ofτ t .
The overall performance of the algorithm is evaluated by how well the reproduced assistance matches key characteristics of PT behavior. Performance in replicating the magnitude of observed assistance is evaluated using the peak torque τ p and impulse J for every jth stride. J can be found by integrating τ t over the stance duration for each stride as J = where t 0 s and t 1 s are the time at 0% and 100% of each stride's stance phase.
The timing of peak assistive torque t p , in percent stance phase, is evaluated due to the relevance of reproduction accuracy during the most safety-critical time; mid-stance. Additionally, an approximation of cumulative timing accuracy is given using a dynamic time warping (DTW) algorithm [24]. This algorithm minimizes the Euclidean distance between the normalized signals by stretching t. The average delay over all strides D t is then calculated and rescaled for every timeindex t ∈ [0, 100], which gives an approximate delay of the reproduced assistance. For brevity, only the mean delayD is reported instead of the delay at every time step.
The PT may apply pseudo-random perturbations to gauge the current status of the unassisted gait, prevent overreliance on assistance, or automation of the task by the patient, which have all been shown to negatively impact rehabilitation outcomes [17]. Therefore, the standard deviation of both peak torque σ τ p and impulse σ J is analyzed to evaluate the ability to emulate how much the PT adapts or changes their behavior. and high (hyper) knee extension (Behavior 2) throughout the training. The GMM was able to successfully differentiate between these patient behaviors and assign appropriate stiffness gains to characterize PT2's assistance. Larger stiffness gains K P i were identified in response to the higher magnitude of assistance observed during Behavior 2. (b) Reproduction results showing the observed torque from the PT τ t (black) and the assistance reproduced by the impedance learning algorithmτ t (blue). The solid lines show the mean assistance while the shaded region show the ± one standard deviation of the data. The dashed red line shows torque modeled by predicting the residual term r t added to the interaction model.
Lastly, we will briefly look at the performance of the algorithm in a more general sense by looking at the high-level behaviors of the PT. This is done to investigate an emergent behavior of the algorithm that shows a loss of information describing stride-to-stride variance. This is partly to be expected due to the abstraction of trajectories with high variance into a finite set of components during GMM encoding. Therefore, increased performance when evaluating the general response of the PT is expected. Here, "general response" refers to the mean torque over all strides for each t within a PT-P session. To investigate this expectation, the same performance criteria mentioned above are computed for the general response of the PT.
For valid comparison between MW groups, which require different assistance behavior, metrics may be reported as the mean percent error of the reproduced assistance defined as RMSPE [25] when necessary.

D. Results
The algorithm was observed to accurately emulate highlevel PT assistance behaviors while still capturing some strideto-stride variation. For each PT-P session, the mean torque and standard deviation for observed and reproduced assistance can be found in Fig. 5(b). For brevity, the full reproduction results can be found in the supplementary materials-C where the performance metrics for each PT-P session can be found in supplementary materials Table SII and the data for between-group comparisons is given in supplementary materials Table SIII. The remainder of this section will use these results, in addition to the remaining analysis strategies in the previous section, to highlight relevant observations and evaluate the proposed algorithm from multiple perspectives.
1) Gait Pattern Classification: The algorithm successfully demonstrated the ability to classify different types of patient gait behaviors and characterize appropriate impedance strategies in response. Figure 5(a) shows the results of a hard clustering procedure on samples (N = 2100) from PT2-P6's training session. Here, the GMM can be seen differentiating between P6's knee flexion by classifying separate samples when the patient presented with moderate knee flexion (Behavior 1) and full knee extension (Behavior 2) during mid-stance. Consequently, unique impedance strategies were able to be characterized in response to Behavior 1 This suggests that PT2 provided more assistance during Behavior 2 which can be interpreted as a response to when the patient demonstrates less "appropriate knee-flexion" (i.e. hyperextended) than Behavior 1. Consequently, this aligns with the higher stiffness gains found for Behavior 2 indicating a more aggressive torque interaction when pursuing the intended future state of the patient.
2) Residual Estimation: The model was able to explain the majority of the observed torque with the modified interaction model during optimization. A mean residual ofr i = 0.08N .m was found for all sessions with the exception of P6, an outlier, that hadr i = 0.53N .m. This is still believed to be sufficiently accurate with 11% of the observed peak torque being unexplained. Fig. 5(a) shows the residual torque component r t (dashed, red line) included in the final reproduction ofτ t . Observations of r t in Fig. 5(a) support the assumption that it is reproducing unmodeled assistance strategies like knee bracing. When present, r t occurs most significantly during periods of maximum weight acceptance which requires a higher magnitude of assistance from the PT. The timing of increased r t aligns with what we would expect from these strategies since the higher requirement for assistance also applies to these unmodeled strategies. Further evidence supporting that r t represents these strategies is given by the absence of significant r t in some patients, particularly those in the LMW group. This suggests that r t is caused by a systemic perturbation from PT-P interaction and not by the sensing system, otherwise, we would likely observe this increased r t throughout all sessions. The predicted assistance had an average r 2 = 0.53 and R M S E = 0.81, 0.41 for HMW and LWM groups, respectively. Notable differences in the performance in r 2 were observed between MW groups (r 2 = 0.61 and r 2 = 0.45 for HMW and LMW, respectively), while little difference was observed for R M S P E for any group. 5) Magnitude of Assistance: The magnitude of the reproduced assistance achieved a RMSPE of 28.00% and 24.91% for τ p and J , respectively. Small differences in RMSPE were observed in supplementary materials Table SII for τ p between both MW and PT groups while differences in J were trivial. 6) Variation in Assistance: For all sessions, reproduction showed lower variation than the observed response in τ p and J with a mean percent error of approximately 15% for both. This is with the exception of the variation in τ p for the PT1-P6 and PT2-P5 sessions where the reproduced assistance showed 20% and 8% more variation than the observed assistance, respectively. Between PTs, the variation for τ p was more accurate for PT2's when compared to PT1's with 8% and 22% error, respectively. Less significant variance in τ p was observed between MW groups with 12% and 18% error for the LMW and HMW group, respectively. 7) Timing: The algorithm displayed accurate timing of assistance with a mean error of D = 2.49% and t p = 7.28% over all sessions. More accurate prediction of peak timing and delay was observed for the HMW group when compared to the LMW group. No differences in timing were observed between PTs. 8) High-Level Behaviors: The analysis thus far indicates the moderate ability for the algorithm to emulate PT behavior on a stride-to-stride basis. However, analysis of the general response of the PT, illustrated by the solid lines in Fig. 5(b), shows an increased performance of the algorithm when evaluated in this scope. An improvement, on average across all sessions, was seen in mean regression metrics (r 2 = 0.92 and R M S P E = 0.32%), mean timing (D = 2.6% and t p = 3.25% error) and in mean peak torque τ p = 10.1%. These results indicate that the algorithm is primarily able to capture highlevel PT behavior while being able to explain some of the variations between strides. Further explanation of the causes and implications of this is given in the next section.

V. DISCUSSION
In this paper, significant effort was invested into creating an insight-driven and data-informed foundation for learning PT assistance behaviors. The analysis in Sec. IV showed that a unique and diverse set of gait behaviors and assistive responses emerged from patients that would generally be considered to have the same level of impairment. Although the presentation of impairment was unique for each patient, most observations aligned with insights from previous post-stroke rehabilitation studies, such as excessive knee flexion during mid-stance or episodes of knee hyperextension. This observation shows the importance of learning assistance strategies from PT-P interactions.
The proposed approach includes the feature of weight-shift to emulate qualities of the PT. While weight-shift is generally an important and well-studied phenomenon in gait training of post-stroke patients [26], [27], it has received far less attention when developing wearable robots. We use weight-shift in conjunction with a future-state attractor and more common kinematic features (thigh and shank), to generate positive results which match the magnitude and timing of PT assistance. Timing of the assistance is especially critical for effective rehabilitation and reduction of metabolic cost [28]. Success in these areas implies that a lower-limb exoskeleton would be able to provide the appropriate assistance to the patient.
However, the algorithm showed some challenges in reproducing the nuances in stride-to-stride assistance variation. This observation is likely due to a combination of three factors: 1) the PT may be unable to assist in a perfectly consistent way due to stochasticity of human-human interaction and the associated challenges with manual gait training, 2) inherent limitations in the sensing system may cause variation in observed data, and 3) the abstraction of high variance torques to a cumulative model of behavior during GMM encoding causes loss of information describing variation between strides in an attempt to be robust against over-fitting and noise. Alternatively, the model may be missing the deterministic feature(s) that explain stride-to-stride variations such as adaptive PT assistance as the patient fatigues or improves. However, we must be cautious when adding additional features to balance the trade-off between 1) a low-dimensional and generalizable model, 2) adding cumbersome sensors that interfere with patient rehabilitation, and 3) possible introduction of noisy inputs.

A. Implications
The proposed algorithm functions as a high-level decision framework that generates a continuous control input (reproduced torque) in a way that emulates assistive PT behaviors. Any arbitrary low-level controller (i.e. PID) can then use this control input to actuate a 1-DoF lower-limb knee exoskeleton to provide assistive knee torques. Specifically, this work shows how additional expert PT knowledge and clinical insights can be identified and integrated into a high-level decision model. Done successfully, this implies that a more robust, effective, and personalized robotic assistance can be generated. Furthermore, the characterization of discrete impedance strategies in response to classes of gait behaviors reduces the learning problem from encoding the assistive response for every possible continuous-state combination to one that is tractable, computationally efficient, and more robust against over-fitting. Opportunity for generalizing to new patients is also present given the fairly safe assumption that assistance strategies do not significantly change between patients with similar gait behaviors. Therefore, once a sufficient amount of data is collected to characterize common gait patterns, the trained model will require little to no additional training when being used to assist a new patient.

B. Limitations and Future Work
A common limitation in studies like this is the limited access to patients, especially amid the COVID-19 pandemic. With a sample size of four patients over eight sessions, conclusions about statistical significance are difficult to make for both the biomechanical characterization and algorithm performance, and more trials are needed to be conducted in the future.
One significant limitation of the wearable sensor system is its ability to capture the information available to PT. A vast set of observations, such as verbal communication, context, and full-body patient kinematics are all exceedingly difficult to capture in full. Although the latter can be addressed with additional sensors, a trade-off between the resolution of data and interference in rehabilitation sessions must be considered such that it does not decrease the validity of the data or detract from the patient's rehabilitation outcomes. However, one addition that we do plan to capture is joint kinematics on the non-affected side to observe the symmetry of the patient's gait, which is an important gait training outcome. Similarly, estimating the center of mass trajectory could be advantageous for a more detailed description of how weight is shifted.
As for the algorithm, the primary goal of future work is to implement a more supervised approach to the GMM encoding. Although the GMM was successful in classifying gait behaviors, placement of cluster centers could more directly align with how a PT would interpret these classes. This would result in clearer boundaries between class definitions and enable a more accurate representation of the PT decision-making model. Finally, by dynamically adjusting the time offset t f of the future-state attractor we would likely help the model better align with true modes of PT decision-making.
For successful implementation of the presented model for robot-aided training, the validity of the model in reproducing PT behavior should be further tested with a large number of patients and PTs. Additionally, the reliability and safety of the proposed control model should be verified when presented with perturbations that resemble unseen but possible events during the gait training, such as tripping.

VI. CONCLUSION
This paper presented a predictive model for lower-limb exoskeletons that adopts an insight-driven and data-informed approach to modeling PT assistance. Data were collected during live gait rehabilitation sessions at a resolution not commonly found in PT behavioral modeling. Analysis of this data allowed for a nuanced description of patient impairment in terms of practical rehabilitation principles found in the surrounding literature. A novel LfD impedance algorithm was then implemented and observed to successfully characterize patient gait patterns and capture high-level PT response strategies. Implications of this research provide an opportunity for an insightful model-based approach to controlling rehabilitation robotics. Plans for implementing this modeling framework in a lower-limb exoskeleton is in place to evaluate if the proposed approach translates to a real rehabilitation setting.