SFDA: Domain Adaptation With Source Subject Fusion Based on Multi-Source and Single-Target Fall Risk Assessment

In cross-subject fall risk classification based on plantar pressure, a key challenge is that data from different subjects carry significant individual information. Models with insufficient generalization ability therefore perform poorly on new subjects, which limits their application in daily life. To address this problem, domain adaptation methods are applied to reduce the gap between the source and target domains. However, these methods focus on the distributions of the source and target domains and ignore the potential correlation among multiple source subjects, which degrades domain adaptation performance. In this paper, we propose a novel method named domain adaptation with subject fusion (SFDA) for fall risk assessment, which greatly improves cross-subject assessment ability. Specifically, SFDA simultaneously carries out source-target adaptation and multiple source subject fusion through a domain adversarial module, reducing both the source-target gap and the distribution distance among source subjects of the same class. Consequently, target samples can learn more task-specific features from source subjects, which improves generalization ability. Experimental results show that SFDA achieves mean accuracies of 79.17% and 73.66% with two backbones in cross-subject classification, outperforming the state-of-the-art methods on a continuous plantar pressure dataset. This study demonstrates the effectiveness of SFDA and provides a novel tool for cross-subject, few-gait fall risk assessment.


I. INTRODUCTION
IT IS commonly known that falls are a major public health issue, seriously affecting the quality of life of the elderly due to their high frequency, expensive treatment, and long recovery time [1], [2], [3]. Unlike fall detection [4], [5], fall risk assessment focuses on early prevention and risk diagnosis through the development of sensor-based facilities [6]. In the field of sensor-based health monitoring, inertial sensors are widely used and have shown good accuracy in fall risk assessment [11], [12], [13], [14], [15], [16], [17]. However, when it comes to ease of use for the elderly, inertial sensor-based facilities typically require deployment on multiple parts of the body to achieve reliable performance [11], [12], [18], [19]. In contrast, smart shoes with integrated pressure sensors offer both comfort and practicality [7], [8], [66]. Consequently, assessing fall risk through few-gait plantar pressure monitoring is suitable for the elderly in their daily lives [9], [10].
In the classification of plantar pressure data, each subject can be considered an independent domain because of their individual information, which leads to domain discrepancies. When dealing with unseen subjects, a well-trained supervised model struggles to perform well because of these discrepancies. Therefore, the objective of fall risk assessment is to address distribution mismatches and develop a model that generalizes well to novel subjects, enabling practical real-world applications and clinical diagnoses.
More recently, a new setting called multi-source has emerged. Since multiple subjects necessarily exist within the source domain in continuous plantar pressure classification, this can be defined as a multi-source and single-target (MSST) domain setting. Nevertheless, the above single-source domain adaptation (SDA) approaches focus only on adapting the source and target domains, whether through marginal [25], conditional [62], or joint distribution adaptation [26], and ignore the potential correlation among sources [36], [37], [38], [39], [40]. Specifically, since domain shift also exists among different sources, SDA methods that mix the samples of multiple training subjects into a combined source domain cannot guarantee that target samples will be effectively adapted to the distributions of the source subjects [36].
Similar to SDA, a straightforward approach for multi-source domain adaptation (MDA) is to merge all sources into one domain [39], which leads to insufficient variance elimination in MSST [36]. To fully exploit the data distributions of multiple subjects, some MDA methods explore feature representation approaches and the combination of pre-learned classifiers [39], [40], [41], [42], [43]. The former approaches align the latent spaces of different domains either by optimizing a discrepancy loss, such as Rényi divergence [48] or L2 distance [49], or through adversarial objectives, such as GAN loss [57] or Wasserstein distance [58], [69]. The latter approaches train on each source separately and align the target with each source in a pairwise manner [37], [41], [48], [69]. Another solution is to assign a weight to each pre-learned classifier according to the relationship between the source and target domains [39], [58].
However, many existing MDA approaches share a notable limitation: when individual source classifiers or unshared feature extractors are employed, the number of parameters and the computational complexity of the model increase sharply as more source subjects are involved [37]. In our MSST setting, this strategy would require over 30 source classifiers, which is impractical. Even in a shared feature space, the scattering of multi-source representations degrades the effect of task-specific training (MSST fall risk assessment in this paper) for MDA [68]. Thus, a better solution is to alleviate the distribution shifts among source subjects, with the aim of fully exploiting task-specific features from more source subjects.
Consequently, it is necessary to reduce the domain gap among source domains of the same class (referred to as source subject fusion, SF) to avoid neglecting the task-specific information of each source subject. SF, which is based on a domain adversarial module, helps extract task-specific features from as many source subjects as possible. To limit the complexity of the model, a shared feature extractor is used to map data into a common space, and two discriminators are employed to reduce the domain discrepancy among disparate source subjects and the distance between the source and target domains. The generalization ability and robustness of the model can then be enhanced by promoting SF and source-target domain adaptation (DA) together. Note that in MSST, as the information from single-target subjects is unknown, we regard them as a whole domain and use the terms "domain" and "subject" interchangeably in this paper. Our main contributions are summarized as follows:
• We propose an end-to-end MDA framework that promotes SF and DA synchronously, where multiple source subjects are fused by a domain adversarial module to reduce their distribution distance.
• We illustrate the relation between SF and DA, namely that reducing the discrepancy among source subjects helps reduce the upper bound of the discrepancy between the source domain and the target domain, and we verify the optimal weight of SF and DA in training by experiment.
• Empirically, the proposed SFDA method significantly outperforms the state-of-the-art methods on a continuous plantar pressure dataset.
This paper is organized as follows. Section I gives an overview of fall risk assessment. Section II reviews related work on fall risk classification, unsupervised domain adaptation, and multi-source domain adaptation. Section III presents the methodology, including the problem statement, the concrete modules of SFDA, the training procedure, and a theoretical analysis of subject fusion. Section IV presents and discusses all experimental details and results. Finally, Section V concludes this work.

II. RELATED WORK

A. Fall Risk Assessment
In terms of portability and wearing comfort, existing inertial datasets are not suitable for daily assessment, whereas plantar pressure can be used for long-term monitoring. Existing methods can be summarized as conventional machine learning and deep learning [6]. By constructing manual bipedal or weak-foot features, simple classifiers can output results at the cost of scalability [44], [45]. Dispensing with feature engineering, deep learning methods can also output results in real time from fewer raw gaits. Liang et al. [9] implemented a ConvLSTM trained on raw plantar data. Nevertheless, non-cross-subject validation results in inflated accuracy (information leakage) [14]. Tunca et al. [11] compared LSTM and traditional classifiers based on spatio-temporal gait parameters in hold-out validation. Meyer et al. [12] evaluated performance through LOSO validation. However, these works do not eliminate the domain shift between source and target data. With DG-DANN as the benchmark [46], Wu et al. [10] proposed a hierarchical framework to improve sensitivity and generalization ability. However, generalization methods require comprehensive samples, which cannot be guaranteed in real-world applications.

B. UDA
To generalize well when target samples are unseen, many shallow-layer approaches reduce the domain gap according to various discrepancy criteria, such as MK-MMD and JDD [25], [26], while some statistical approaches achieve alignment in a high-dimensional space [24], [29]. Long et al. [63] proposed a deep network for marginal distribution adaptation based on multi-layer and multi-kernel MMD. JAN was constructed in [26] to eliminate the joint distribution gap between the source and target domains. Wang et al. [65] evaluated the relative weight of marginal and conditional distribution adaptation. Deep adversarial approaches are another useful family of domain adaptation methods, which alleviate the distribution difference by confusing a domain discriminator. Ganin et al. [32] proposed DANN, which uses a gradient reversal layer to realize the adversarial function. Kumar et al. [34] proposed Co-DA to align global and local distributions at the same time. Fan et al. [70] adapted source models to the target domain in sleep staging tasks by modulating the domain-specific statistics of features in BN layers.

C. MDA
Unlike in SDA, it is necessary in MDA to make full use of every subject within the training set [36], which is also important in fall risk assessment. Zhang et al. [47] aligned each source subject and the target data in individual feature spaces for sEMG classification. Wang et al. [68] proposed clustering-embedded adversarial training with dependent source representations for multi-source sentiment analysis and digit classification. Zhu et al. [50] reduced the conditional distribution discrepancy of subdomains across different domains based on a local maximum mean discrepancy (LMMD) without adversarial training. Dai et al. [39] constructed extractors for each domain to separate shared and private features and assigned different weights to different domains. Karimpour et al. [43] employed the $\ell_{2,1}$ norm to reweight multiple source data and reduce the impact of unrelated source samples in image classification. Deng et al. [41] trained separate classifiers to align each source with the target data in ECG classification, together with a sample-imbalance-aware mixing strategy. Similarly, Liu et al. [71] trained DANN on each source domain and the specified target domain. Wei et al. [72] aligned the data distribution for each pair of subjects and produced the output by decision fusion.

III. METHODOLOGY
In this section, we will present the details on problem statement, concrete modules of SFDA, training procedure and theoretical analysis of subject fusion based on fall risk assessment.

A. Problem Statement
In the field of continuous plantar pressure classification, each participant's data space can be seen as an independent domain, defined as a joint distribution $P_d(x, y)$ on $\mathcal{X} \times \mathcal{Y}$, where $\mathcal{X}$ and $\mathcal{Y}$ are the input space and output space, respectively. In the MSST setting, as the cost of obtaining source-domain subject labels is very low, $d \in \mathcal{D} = \{1, \ldots, S\}$ denotes the individual subjects within the source domain. Accordingly, $P_{\tilde{d}}(x)$ is defined as the distribution of the target domain, where $\tilde{d} \in \tilde{\mathcal{D}} = \{1, \ldots, T\}$ and $\mathcal{D} \cap \tilde{\mathcal{D}} = \emptyset$. Here, the marginal probability distribution $P_d(x)$ can be obtained, as well as the conditional distribution $P_d(y|x)$, where $d \in \mathcal{D}$. In MSST, we assume that $P_i(x) \neq P_j(x)$ and $P_i(y|x) \approx P_j(y|x)$, $\forall i \neq j$, where $i, j \in \mathcal{D}$. The ill-posed problem is not a concern, because each subject is labelled as a single class, which supports the assumption that the conditional distributions remain stable.
Specifically, $\{(X_d, y_d) \sim P_d(x, y)\}_{d=1}^{S}$ are defined as training data, and each $X_d$ contains $N$ samples $x_m^d$, where $m \in \{1, \ldots, N\}$. Note that $y_m^d \in \{0, 1\}$, $\forall m \in \{1, \ldots, N\}$, where 0 represents "low risk" and 1 represents "high risk", and $u_m^d \in \{1, \ldots, S\}$ indicates the subject label, which can also be understood as the subject ID in the source domain. Un-seen data from T target domains {(X d )} T

B. Source Subject Fusion
In MSST DA, especially in plantar pressure classification, no relevant feature from any source subject should be ignored or downgraded in advance. It is also worth noting that, besides dataset shift [51], the shift between subjects may have more latent causes, such as walking habits, weight, mental health, disease history, and the devices or times of data collection [10], [14], all of which are reflected in the discrepancy among source subjects. Because of these unavoidable characteristics, target samples are difficult to transfer to all source domains of the same class. However, SDA may mix source subjects together, and MDA requires too many individual feature extractors. Consequently, we need target samples to be transferred to more source subjects of the same class to extract task-specific features, which improves the generalization ability of the model and its interpretability in clinical diagnosis.
To illustrate the relationship between internal SF and external overall DA in a statistical way, let $\mathcal{H}$ be a Reproducing Kernel Hilbert Space (RKHS) and $\varphi : \mathcal{X} \rightarrow \mathcal{H}$ a representation function that maps the instance set to the feature space. For simplicity, we assume that there are only two subjects in the source domain, and MMD is used to measure the distance or discrepancy between two distributions mapped into $\mathcal{H}$ [53]. The empirical estimate of the distance between $X \sim P(x)$ and $Y \sim P(y)$, as defined by MMD, is given in (1), where $X = \{x_1, \ldots, x_{m_1}\}$ and $Y = \{y_1, \ldots, y_{m_2}\}$ are random variable sets. Thus, considering the independent domain distributions in $\mathcal{H}$, two kinds of distance can be well estimated by MMD: 1) the distance between the source and target domains, and 2) the distance between each pair of subjects inside the source domain.
Assume that $X_1 \sim P_1(x)$ and $X_2 \sim P_2(x)$ in the source domain $\mathcal{D}$, where $P_1(x) \neq P_2(x)$, and that $\tilde{X}_{\tilde{d}}$ denotes the samples from the target domain, where $\tilde{d} \in \tilde{\mathcal{D}}$ and $\tilde{X}_{\tilde{d}} = \{\tilde{x}_1, \ldots, \tilde{x}_{m_{\tilde{d}}}\}$. The distance between the source and target domains can be approximately estimated as the sum of the distances between the two subjects and $\tilde{\mathcal{D}}$, as in (2). As shown in (1) and (2), $\mathrm{Dist}(\mathcal{D}, \tilde{\mathcal{D}})$ and the distance between the above two subjects can each be calculated empirically by the squared MMD and written as (3) and (4), where $C$ is the transformation matrix. One of the two source subjects must be closer to $\tilde{\mathcal{D}}$ in $\mathcal{H}$.
Assume that $X_1$ is closer to $\tilde{\mathcal{D}}$ than $X_2$ in $\mathcal{H}$, which gives (5). In practice, the gap between different subjects increases when data are collected after a long time interval or with different devices, under the assumption that $X_1$, $X_2$, and $\tilde{X}_{\tilde{d}}$ are subjects whose individual information, such as age, gender, height, weight, and health condition, does not differ significantly. Therefore, the distribution discrepancy between source samples collected at an early stage and the target samples to be transferred is likely to be greater than that between source subjects, which gives (6). The detailed reason for assuming equation (6) is listed in the Appendix. According to (3), (4), (5), and (6), and assuming that the representation distribution of $X_1$ remains stable in $\mathcal{H}$, we can conclude that, on a finite probability space, reducing the discrepancy between subjects in the source domain pushes $\mathrm{Dist}(X_2, \tilde{X}_{\tilde{d}})$ closer to $\mathrm{Dist}(X_1, \tilde{X}_{\tilde{d}})$. Naturally, the upper bound of the discrepancy between the source and target domains can be reduced, since $\mathrm{Dist}(X_2, \tilde{X}_{\tilde{d}})$ becomes smaller under subject fusion.
In detail, besides the classification task, another purpose of domain adaptation with subject fusion is to find a transform that reduces the distance between $\mathcal{D}$ and $\tilde{\mathcal{D}}$ and also confuses the distributions of the subjects in $\mathcal{D}$, which can be optimized as in (7), where $\lambda$ and $\gamma$ are the tradeoff parameters for the transformation penalty. In other words, when minimizing the distance between the source domain and the target domain, we require that the discrepancy between source subjects not be too large, so that adaptation is more efficient.
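As a minimal sketch of how such a distance can be estimated from samples, the snippet below computes the biased empirical estimate of the squared MMD using the kernel trick. The Gaussian kernel, the `gamma` value, the function names, and the toy two-dimensional features are illustrative assumptions and are not taken from the paper.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two sample sets."""
    return (mean_kernel(xs, xs, gamma)
            + mean_kernel(ys, ys, gamma)
            - 2.0 * mean_kernel(xs, ys, gamma))


# Toy features standing in for two source subjects (hypothetical values).
subject_1 = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
subject_2 = [[1.0, 1.1], [0.9, 1.2], [1.1, 0.9]]

same = mmd_squared(subject_1, subject_1)  # identical sets give zero
diff = mmd_squared(subject_1, subject_2)  # shifted sets give a positive value
```

The same function can be applied to a source subject and the target samples, which corresponds to the first kind of distance listed above, or to two source subjects, which corresponds to the second.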

C. Model Summary
According to the concept of subject fusion, the two-subject setting shown in equation (7) can be readily extended to the multi-subject setting. In this part, we concretely introduce the modules of the SFDA framework, as shown in Fig. 1.
Here, the binary classifiers L and D and the multi-class classifier F all use cross-entropy losses for optimization, denoted as $\mathcal{L}_L$, $\mathcal{L}_D$, and $\mathcal{L}_F$, respectively. Given a shared feature extractor $E(\cdot; \theta_e)$ with parameters $\theta_e$, a task classifier $L(\cdot; \theta_l)$ with parameters $\theta_l$, a domain discriminator $D(\cdot; \theta_d)$ with parameters $\theta_d$, and a subject fusion discriminator $F(\cdot; \theta_f)$ with parameters $\theta_f$, the goal of SFDA is achieved by the following modules.
1) The Shared Feature Extractor E: Like the RKHS, the shared feature extractor E aims to produce a common space that allows the feature transformation $C$ and extracts deep shared features from both the source and target domains. The tradeoff parameters $\lambda$ and $\gamma$ are realized by L2 regularization in all experiments.
2) The Label Classifier L: L is a standard classifier whose inputs $E(x_m^d)$ all come from the source domain. The formulated objective of L is given in (8), where $y_m^d$ refers to the class label of source sample $x_m^d$, $\forall m \in \{1, \ldots, N\}$, and $d$ refers to the subject label within the source domain, $\forall d \in \mathcal{D} = \{1, \ldots, S\}$. As equation (8) shows, this optimization pushes L to assess fall risk correctly on labelled source data.
3) The Subject Fusion Discriminator F: Here, the second optimization objective of equation (7) is achieved by F. It is complex and time-consuming to calculate the MMD between each pair of samples from two different subjects within the source domain. Hence, motivated by multi-source training methods [39], [46], we adopt an adversarial module to alleviate the domain gap between source subjects. Since MMD tends to 0 when the distributions become similar or confused, the MMD optimization can be replaced by distribution fusion. The formulated objective of F is given in (9), where $u_m^d$ is the subject ID label for source sample $x_m^d$, $\forall m \in \{1, \ldots, N\}$, and $G(\cdot)$ indicates the Gradient Reversal Layer (GRL), which reverses the gradient in backpropagation. In the training procedure, when $G(\cdot)$ is frozen, the goal of F is to distinguish exactly which subject each source sample comes from. After gradient reversal through the GRL, as equation (9) presents, F makes it hard for the parameters in E to extract source domain-specific features, which alleviates the subject distribution shift within the source domain.
Algorithm 1 (partially recoverable steps): feed $E(x)$ into L to get $L(E(x))$; feed $E(x)$ into F to get $F(E(x))$; feed $E(x)$ and $E(\tilde{x})$ into D to get $D(E(x))$ and $D(E(\tilde{x}))$; update E, L, F, D to minimize $\mathcal{L}$; repeat until convergence.
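The behaviour of the GRL described above can be sketched without any deep learning framework: the layer acts as the identity in the forward pass and negates the gradient flowing back to the feature extractor. The class name, the `scale` parameter, and the toy gradient values below are illustrative assumptions; in a PyTorch implementation the same idea is usually written as a custom autograd function.

```python
class GradientReversal:
    """Identity in the forward pass; multiplies the gradient by -scale in the backward pass."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def forward(self, features):
        # Features pass through unchanged, so the discriminator sees E(x) as-is.
        return list(features)

    def backward(self, grad_output):
        # The sign flip makes the extractor ascend the discriminator loss
        # while the discriminator itself still descends it.
        return [-self.scale * g for g in grad_output]


grl = GradientReversal(scale=1.0)

features = [0.5, -1.2, 3.0]                   # hypothetical output of E
out = grl.forward(features)                   # what F (or D) receives

grad_from_discriminator = [0.1, -0.4, 0.25]   # hypothetical dLoss/dFeatures
grad_to_extractor = grl.backward(grad_from_discriminator)
```

With this sign flip, a single backward pass trains the discriminator to separate subjects while pushing the shared extractor toward features that make subjects indistinguishable.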
4) The Domain Discriminator D: Here, the first optimization objective of equation (7) is achieved by D. Similar to F, D is a standard adversarial module (a binary classifier) that focuses on the source and target domains, and its formulated objective is given in (10), where $o_m^d$ and $o_n^{\tilde{d}}$ refer to the source and target domain labels and $\tilde{d} \in \tilde{\mathcal{D}} = \{1, \ldots, T\}$. As shown in equation (10), in the training procedure, when $G(\cdot)$ is frozen, the binary classifier D aims to identify precisely which domain each sample comes from. When the model parameters are updated in backpropagation, this optimization aligns the distributions of the source and target domains through the GRL. When the training of D has converged, the distribution gap between source and target data is reduced. In other words, the MMD between the source and target domains can also be reduced without complex computation. Note that the GRL is not used in the forward stage of training.
5) The Training Procedure: In general, the proposed end-to-end framework requires backpropagation to update the parameters of the four modules, as demonstrated in Algorithm 1.
Here, the realization of the proposed optimization problem is summarized. Source data and target data are input to E (which can be seen as $C$ in equation (7)) at the same time; the features of source data enter L, F, and D, while the features of target data enter only D. Concretely, as demonstrated in equation (7), optimizing $\mathcal{L}_D$ helps minimize $\mathrm{Dist}(\mathcal{D}, \tilde{\mathcal{D}})$ and optimizing $\mathcal{L}_F$ helps minimize $\mathrm{Dist}(X_1, X_2)$. Thus, the requirement in optimization (7) is satisfied by carrying out DA and SF synchronously. The total loss function of SFDA is formulated as a weighted sum in (11), where $\alpha$ is a non-trainable hyperparameter that represents the loss weight of SF and regulates the relation between SF and DA.
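The weighted summation can be sketched as follows. This is only an illustration under a stated assumption: it supposes that $\alpha$ scales the subject fusion term alone and that the label and domain losses enter with unit weight, which may differ from the exact form of (11). The function names and the probability values are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label given predicted class probabilities."""
    return -math.log(probs[label])


def total_loss(loss_label, loss_domain, loss_fusion, alpha):
    # ASSUMPTION: alpha weights only the subject fusion term; the paper's
    # equation (11) may distribute the weights differently.
    return loss_label + loss_domain + alpha * loss_fusion


# One source sample passed through the three heads (hypothetical outputs).
loss_l = cross_entropy([0.2, 0.8], 1)        # L: true class is "high risk"
loss_d = cross_entropy([0.6, 0.4], 0)        # D: true domain is "source"
loss_f = cross_entropy([0.1, 0.7, 0.2], 1)   # F: true subject ID among 3 subjects

without_sf = total_loss(loss_l, loss_d, loss_f, alpha=0.0)  # SF switched off
with_sf = total_loss(loss_l, loss_d, loss_f, alpha=0.8)     # SF active
```

Setting `alpha=0.0` removes the fusion term entirely, which matches the way the framework without SF is described in the ablation study.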

D. Theoretical Analysis
To better analyze the generalization ability of the proposed SFDA framework, we report a universal theoretical generalization bound analysis in the field of DA based on the theories from [54], [55], and [56].
Theorem 1: Let $\mathcal{H}$ be a hypothesis space, and let $S$ and $T$ denote the source domain and the target domain, respectively. If $h \in \mathcal{H}$ is a function to be tested on the unseen target domain, then for any $h \in \mathcal{H}$ the bound in (12) holds, where $d_{\mathcal{H}}(S, T)$ indicates the domain discrepancy between the source domain $S$ and the target domain $T$, and $C$ can be seen as the shared error of the ideal function $h^* \in \mathcal{H}$ on $S$ and $T$. In addition, $R_S(h)$ and $R_T(h)$ refer to the expected errors on $S$ and $T$, respectively. In fall risk assessment through SFDA in the MSST setting, we can analyze the generalization error according to (12). Focusing on $d_{\mathcal{H}}(S, T)$, and disregarding $C$ as in [25] and [32], we assume that the source domain involves $S$ subjects, and it is natural to extend (5) to the multi-subject situation in (13), where $P_i$ indicates the $i$-th global distribution in $S$, $P_d$ indicates the global distribution of $T$, and $P_f$ and $P_{f^*}$ denote the distributions of the source subjects that are the most different from and the most similar to $P_d$, respectively. Before training starts, the largest distribution gap among the multiple source subjects is defined in (14). According to equations (7) and (11), SFDA minimizes $\mathrm{Dist}(X_1, X_2)$ (the distance among source subjects) and $\mathrm{Dist}(\mathcal{D}, \tilde{\mathcal{D}})$ (the distance between the source and target domains) at the same time. When optimizing $\mathrm{Dist}(X_1, X_2)$ itself, the parameters of the SFDA modules map the input data into a feature space that narrows the distribution gap between disparate source subjects. The source subject distance is denoted as in (15). The result of equation (15) is reduced when subject fusion training converges. Ideally, the subjects in the source domain mix with each other, and every personal distribution $P_i$ can be regarded as $P_{f^*}$ in the source domain. Meanwhile, when optimizing $\mathrm{Dist}(\mathcal{D}, \tilde{\mathcal{D}})$ (source-target domain adaptation), the distance between source subjects and target subjects decreases. Thus, $d_{\mathcal{H}}(P_{f^*}, P_d)$ is reduced.

TABLE I DOMAIN DIVISION IN CONTINUOUS PLANTAR PRESSURE DATASET
At this time, $d_{\mathcal{H}}(S, T) = S \cdot d_{\mathcal{H}}(P_{f^*}, P_d)$ and $R_T(h)$ will be bounded. In conclusion, the synchronization of SF and DA can guarantee the generalization performance of the model.
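For reference, bounds of the kind invoked in Theorem 1 are usually stated in the following form. This is a sketch of the standard result from the domain adaptation literature written with the symbols defined in the theorem; the exact constants and discrepancy measure used in (12) may differ.

```latex
% Standard form of a domain adaptation generalization bound (sketch):
% the target risk is controlled by the source risk, the domain
% discrepancy, and the joint error C of the ideal hypothesis h^*.
\begin{equation*}
  R_T(h) \;\le\; R_S(h) + d_{\mathcal{H}}(S, T) + C,
  \qquad \forall h \in \mathcal{H}.
\end{equation*}
```

Read this way, the argument above targets the middle term: subject fusion and source-target adaptation both act on $d_{\mathcal{H}}(S, T)$, while $R_S(h)$ is handled by the label classifier.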

IV. EXPERIMENT AND DISCUSSION
In this section, we introduce the dataset and the experimental details for fall risk assessment so that an expert should be able to reproduce the main results. To assess SFDA, we empirically evaluate and compare the performance of the proposed method in terms of accuracy, sensitivity, box plot results, proxy A-distance, and t-SNE visualization.
A. Experimental Setup
1) Dataset: As in our previous work [10], [45], we train and evaluate the proposed model on the continuous plantar pressure dataset, which can be used comfortably for fall risk assessment in long-term monitoring [67]. This plantar pressure dataset contains 48 subjects and 7462 samples in total, including 23 high-risk subjects and 25 low-risk subjects. As shown in Table I, the dataset is divided into four different MSST settings in order to traverse the whole dataset and produce comprehensive experimental results, in the manner of 4-fold cross validation [79], [80]. Note that there is no information leakage between divisions when adopting leave-one-division-out.
2) Baselines: We compare the proposed SFDA with closely related state-of-the-art baselines. When comparing with unsupervised single-source domain methods, all subjects within the source domain are regarded as a single domain and their internal discrepancy is neglected. The baselines in the comparison are: DAN [63], which reduces global discrepancy through MMD; Deep CORAL [27], which reduces global distance through second-order statistics; DANN [32], which alleviates global distance through an adversarial module; CDAN [62], which alleviates discrepancy through a conditional adversary; DSAN [50], which reduces global and local discrepancy in multi-subdomain adaptation; and MhNet [10], which constructs a hierarchical network for fall risk assessment.
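The leave-one-division-out protocol can be sketched as a simple generator in which each division serves once as the target and the remaining divisions form the source. The subject IDs below are hypothetical placeholders; the actual assignment of subjects to P1 through P4 is given in Table I.

```python
def leave_one_division_out(divisions):
    """Yield (target_name, source_subjects, target_subjects) for each division in turn."""
    for target_name in divisions:
        source = [
            subject
            for name, subjects in divisions.items()
            if name != target_name
            for subject in subjects
        ]
        yield target_name, source, list(divisions[target_name])


# Hypothetical subject IDs; two per division only to keep the example short.
divisions = {
    "P1": ["s01", "s02"],
    "P2": ["s03", "s04"],
    "P3": ["s05", "s06"],
    "P4": ["s07", "s08"],
}

splits = list(leave_one_division_out(divisions))
for target_name, source, target in splits:
    # No subject may appear on both sides, otherwise information leaks.
    assert not set(source) & set(target)
```

Splitting by division rather than by sample is what keeps every target subject entirely unseen during training.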
3) Implementation Details: The above methods are reproduced in PyTorch on GeForce GTX 1080 Ti and GeForce 2080 Ti GPUs. In the pre-processing stage, raw data are split with an overlap of 2 gaits in the time sequence. Then, the 16-channel plantar pressure signals are arranged vertically in sequence to form 3-gait samples, which are 2D tensors of 16×69 (channel × time points), as in [10]. In the four MSST settings, each division Pi, $i \in \{1, 2, 3, 4\}$, refers to a multi-subject domain that includes m subjects (#). When evaluating the performance of the methods, each Pi is taken in turn as the target dataset while the remaining divisions are taken as the source training dataset. When testing Pi, $i \in \{1, 2\}$, the SGD optimizers use $10^{-3}$, $10^{-2}$, and $10^{-2}$ as the learning rates for pre-trained model fine-tuning, the classifier, and the adversarial module, respectively, while $10^{-4}$, $10^{-3}$, and $10^{-3}$ are used for Pi, $i \in \{3, 4\}$. $\alpha$ is selected from [0, 1].
For our framework, ResNet-50 [59] and VggNet-16 [60] are adopted as the backbone networks. A 2-layer fully-connected network is used as the classifier, and the adversarial module is set to 2 layers with 2048 nodes where needed. The batch size is set to 12 for ResNet-50 and 32 for VggNet-16, and the maximum number of training epochs is set to 100. The weight decay is tuned over $\{5\times10^{-4}, 1\times10^{-3}, 2\times10^{-3}\}$ and the momentum is set to 0.95. We choose the best performance under different configurations for each method. Specifically, the CORAL, MMD, and LMMD loss weights are set to 0.5, 0.35, and 0.2, respectively, for better convergence. Since MhNet benefits from its hierarchical structure and voting mechanism without a pre-trained model, it is fairer to report its average voting classification accuracy (Segment=7). Also, as MhNet does not use any domain adaptation method, it is not suitable for proxy A-distance analysis or t-SNE visualization.
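The overlapping split can be sketched as a sliding window over an ordered list of gaits. The sketch assumes that "overlapping of 2 gaits" means consecutive 3-gait samples share two gaits, i.e. the window advances by one gait at a time; the function name and the gait placeholders are hypothetical, and each placeholder would stand for a 16-channel recording in practice.

```python
def segment_gaits(gaits, window=3, overlap=2):
    """Split an ordered gait sequence into windows that share `overlap` gaits."""
    stride = window - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than window")
    return [
        gaits[start:start + window]
        for start in range(0, len(gaits) - window + 1, stride)
    ]


# Placeholders for five consecutive gaits of one subject.
gaits = ["g1", "g2", "g3", "g4", "g5"]
samples = segment_gaits(gaits)
# samples == [["g1", "g2", "g3"], ["g2", "g3", "g4"], ["g3", "g4", "g5"]]
```

Because neighbouring samples share gaits, the split into source and target has to be made per subject or per division before windowing, so that overlapping windows never straddle the two sides.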

B. Experimental Results and Analysis
The cross-subject classification results are shown in Table II and Table III.Additional experiment results from Inception-V3 [78], ResNet-34 [59] and VggNet-19 [60] are presented in Appendix.Each baseline is implemented in five replications, and SFDA outperforms all the compared baselines in terms of the average accuracy and MhNet votes and averages segment under five thresholds (10%, 20%, 30%, 40% and 50%).
When using ResNet-50 as pre-trained backbone, SFDA ranks first in 3 out of 4 MSST settings in Table II, achieving average accuracy of 79.17%.Particularly, SFDA outperforms the second-best baseline by nearly 3.3% in terms of accuracy when P4 is considered the target domain.In addition, SFDA demonstrates stable performance in all four MSST settings, while the performances of other baselines vary significantly across scenarios, highlighting the robustness of SFDA.Concretely, DeepCoral performs well when P1 is target domain but exhibits inferior accuracy in other settings, and even shows negative transfer when P3 is the target domain.MhNet achieves good performance when P1 and P2 are target domains but performs the worst when P3 is target domain.Especially in P2, MhNet outperform other methods by around 30%, which indicates the superiority of hierarchical structure and voting mechanism in certain fall risk assessment setting.In general, DSAN, which pays attention on multiple subdomains adaptation, performs relatively good in MSST settings.Both joint distribution adaptation and adversarial approaches are superior to methods that solely focus on marginal distribution alignment.When using VggNet-16 as pre-trained backbone, SFDA still performs stable and good in 4 MSST settings, as shown in Table III.MhNet performs well except for P3 and P4, which may result from the insufficient domain-variant feature extraction.Although SFDA achieves the best average accuracy, the overall performance is inferior to that of ResNet-50.This may be attributed to the shallower network's inability to extract enough task-specific features.Hence, compared to the baselines, SFDA significantly improves generalization  ability in cross-subject fall risk assessment.The combined framework of SF and DA could ensure stability across various MSST settings.
1) Sensitivity: Sensitivity is a key index in physiological classification and medical diagnosis, as it indicates the initial screening capacity for disease detection.In the case of ResNet-based results, as shown in Fig. 2 (a), although all the baselines exhibit poor sensitivities when P2 is target domain, the box plot indicates the superiority of SFDA for its high median and upper quartile.Moreover, SFDA and MhNet have relatively few sensitivities outliers (represent by dots outside the box), revealing their stability.As shown in Fig. 2 (b), sensitivity based on VggNet-16 perform similar to those based on ResNet-50 for all baselines.However, SFDA outperform other methods (except for MhNet) in terms of median, upper quartile, and the number of outliers, indicating its practicality in medical diagnosis compared to other baselines.Thus, voting a segment from multi-layer architectures may be a more sensitive way to identity high risk samples when compared to shallow pre-trained model, such as VggNet-16.To concretely record the sensitivity from each baseline, Table format of relevant results are provided in Appendix.
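For concreteness, sensitivity is the fraction of truly high-risk samples that the model flags as high risk, TP / (TP + FN). The sketch below uses the label convention from the problem statement (1 for "high risk", 0 for "low risk"); the function name and the toy predictions are illustrative assumptions.

```python
def sensitivity(y_true, y_pred, positive=1):
    """Return TP / (TP + FN) for the positive (high-risk) class."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive samples in y_true")
    return true_pos / (true_pos + false_neg)


# Hypothetical labels: four high-risk samples, three of them detected.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
score = sensitivity(y_true, y_pred)  # 3 / (3 + 1) = 0.75
```

A false alarm on a low-risk sample, such as the fifth one here, lowers specificity and accuracy but leaves sensitivity unchanged, which is why the two metrics are reported separately.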
2) Ablation Study on SF: We evaluate the efficiency and optimal weight of SF by testing in four MSST settings, as shown in Fig. 2 (c).The different-colored polylines represent the accuracy of the proposed method under different MSST settings with varying weights.As equation (11) demonstrated, α was set from 0, 0.2, 0.4, 0.6, 0.8, 1 to explore the optimal weight.We set the loss weight of SF as 0 to represent the framework without SF.As the weight of SF increases, the overall accuracy of SFDA is getting higher based on both ResNet-50 and VggNet-16, indicating the improvement according to subject fusion.The effectiveness of subject fusion is evident as all the model with SF perform better than that without SF.As for the optimal weight between SF and DA, SFDA generally performs better when weight of SF is larger than 0.6.
3) Distribution Discrepancy: The proxy A-distance (PAD) can be applied to measure the distribution distance between domains as $\mathrm{dist}_{\mathcal{A}} = 2(1 - 2\varepsilon)$, where $\varepsilon$ is the generalization error [55], [61]. We therefore evaluate the distance between the source and target domains in all MSST settings for all the domain adaptation related methods. In Fig. 3 (a), compared to the baselines, SFDA yields a smaller $\mathrm{dist}_{\mathcal{A}}$ for both ResNet-50 and VggNet-16, showing the effectiveness of SFDA in reducing the domain gap. Furthermore, the overall $\mathrm{dist}_{\mathcal{A}}$ based on VggNet-16 is smaller than that based on ResNet-50 due to the feature transferability of the shallow network [63]. When P2 is the target domain, the PAD of CDAN trained on VggNet-16 is the smallest according to Fig. 3 (b). However, it is worth noting that the accuracy of CDAN in Table III suggests that the low PAD value may result from poor performance in identifying risk samples rather than excellent discrepancy-reducing ability. In the other MSST settings, SFDA consistently achieves the smallest PAD according to Fig. 3 (c) and (d), demonstrating its effective alleviation of the discrepancy between the source and target domains through subject fusion and domain adaptation.
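The relation between the error ε and the proxy A-distance can be illustrated with a minimal Python sketch. It assumes that ε is the error rate of some binary classifier trained to tell source features (label 0) from target features (label 1); the function names `domain_error` and `proxy_a_distance` and the toy predictions are hypothetical and are not taken from the implementation used in this paper.

```python
from typing import Sequence


def domain_error(true_domains: Sequence[int], predicted_domains: Sequence[int]) -> float:
    """Fraction of samples whose domain label (0 = source, 1 = target) is mispredicted."""
    if len(true_domains) == 0 or len(true_domains) != len(predicted_domains):
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(t != p for t, p in zip(true_domains, predicted_domains))
    return wrong / len(true_domains)


def proxy_a_distance(error: float) -> float:
    """Proxy A-distance: dist_A = 2 * (1 - 2 * error), with error in [0, 1]."""
    if not 0.0 <= error <= 1.0:
        raise ValueError("error must lie in [0, 1]")
    return 2.0 * (1.0 - 2.0 * error)


# Toy example (assumed data): four source samples followed by four target samples.
true = [0, 0, 0, 0, 1, 1, 1, 1]
pred_separable = [0, 0, 0, 0, 1, 1, 1, 1]  # discriminator is always right
pred_confused = [0, 1, 0, 1, 0, 1, 0, 1]   # discriminator is at chance level

pad_separable = proxy_a_distance(domain_error(true, pred_separable))
pad_confused = proxy_a_distance(domain_error(true, pred_confused))
```

In this sketch, a value near 2 means the two domains are easily separated, while a value near 0 means the discriminator cannot distinguish them. A low value on its own does not show that the features are useful for classification, which is the caveat raised for CDAN above.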

4) Source Subject Fusion Visualization: We visualize the feature representations from the last layer of the shared feature extractor through t-SNE [64]. Among them, two ResNet-based case studies with P1 and P2 as target domains (Fig. 4 (a) and (b)) and two VggNet-based case studies with P3 and P4 as target domains (Fig. 4 (c) and (d)) are presented. In general, SFDA successfully distinguishes high and low risk samples in all MSST settings, for both ResNet-50 and VggNet-16. Additionally, SFDA efficiently aligns the distribution regions of the source and target domains, showcasing its superior generalization ability when facing novel samples. Specifically, in the VggNet-based clustering shown in Fig. 4 (c) and (d), the raw data distributions of low risk and high risk samples overlap severely without SFDA, resulting in poor task-specific feature extraction, which matches the low accuracy reported in Table III. Furthermore, as displayed in Fig. 4 (a) and (b), SFDA facilitates the fusion of the source subject distributions, thereby alleviating the shift among source subjects and enabling target samples to fully exploit source task-specific features.
To quantitatively evaluate the classification performance in feature clustering, the Average Euclidean Distance (AED) is computed from the feature matrix. Within the source or target domain, the Euclidean distance between each pair of high risk and low risk samples is calculated and stored in a distance matrix, where each element represents the distance between one such pair. Subsequently, dividing the sum of the distance matrix by the total number of samples yields the AED, where a higher AED indicates easier distinction between high risk and low risk samples in the source or target domain. Based on the AED presented in Fig. 4, employing SFDA results in an obviously higher AED, showing its efficient task-specific feature extraction and classification ability. Thus, a high AED is consistent with the high accuracy shown in Table II and Table III.
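The AED computation described above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the authors' code: the function name `average_euclidean_distance` and the toy feature vectors are hypothetical, and "total number of samples" is assumed to mean the number of high risk samples plus the number of low risk samples in the domain (dividing by the number of pairs would be another plausible normalization).

```python
import math
from typing import Sequence

Vector = Sequence[float]


def average_euclidean_distance(high_risk: Sequence[Vector], low_risk: Sequence[Vector]) -> float:
    """AED between the high risk and low risk feature vectors of one domain."""
    if len(high_risk) == 0 or len(low_risk) == 0:
        raise ValueError("both classes need at least one sample")
    # Distance matrix: one row per high risk sample, one column per low risk sample.
    distances = [[math.dist(h, l) for l in low_risk] for h in high_risk]
    total = sum(sum(row) for row in distances)
    # Assumption: normalize by the total number of samples in the domain.
    n_samples = len(high_risk) + len(low_risk)
    return total / n_samples


# Toy 2-D features (assumed data), e.g. t-SNE coordinates.
high = [(0.0, 0.0), (0.0, 1.0)]
low_far = [(3.0, 4.0), (3.0, 5.0)]   # well separated from the high risk samples
low_near = [(0.5, 0.5), (0.0, 0.5)]  # overlapping with the high risk samples

aed_separated = average_euclidean_distance(high, low_far)
aed_overlapping = average_euclidean_distance(high, low_near)
```

Under these assumptions, the well separated configuration gives the larger AED, which matches the interpretation that a higher AED indicates easier distinction between the two classes.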

C. Results Summary and Limitations
In summary, the superior performance of SFDA shown in Table II and Table III can also be explained by the t-SNE clustering and PAD results as follows: (i) The classification accuracy relies on task-specific features, which form the foundation of the proposed model. As depicted in Fig. 4, in certain MSST settings, the high risk and low risk subjects can be easily distinguished in feature space, which guarantees a lower bound for fall risk assessment accuracy. (ii) Since capturing task-specific features is essential in the MSST cross-subject setting, the source subject fusion module helps reduce the internal gap within the source domain and prompts target samples to fully adapt to the task-specific features. (iii) Although a lower PAD usually reflects better performance in reducing the domain gap (good at extracting domain-invariant features), as demonstrated in Table II, Table III, Fig. 3 and Fig. 4, it may also result from poor classification ability (bad at extracting task-specific features). For example, CDAN achieves the lowest PAD when P2 is the target domain and VggNet-16 is the backbone, but its accuracy in this setting is poor.
Despite the improvement achieved by SFDA, there are some limitations. Although SFDA outperforms all the baselines in average accuracy, there are a few MSST settings where SFDA fails to achieve the best performance. The outliers and larger standard deviation of the sensitivity of SFDA in Fig. 2 (a) and (b) may be attributed to the challenge of fusing source subjects, for two reasons. First, some subject shifts within the source domain are too difficult to eliminate. Second, the coverage or representativeness of the source subjects is limited, resulting in slight fluctuation. This deficiency reveals the need for further improvement, particularly in dealing with subjects that have large domain gaps. Moreover, the key to improving the bound on the generalization error is that the source subjects should be representative and comprehensive enough, allowing SF to extract more task-specific features.
Based on the summary and limitations, we plan to explore outlier identification procedures in data pre-processing to boost the overall performance. Additionally, we will investigate the combination of universal domain adaptation with subject fusion, as there are always patients with specific characteristics who are not represented in the source domain dataset. Furthermore, as gait features are closely connected to various diseases, it is essential to construct a new dataset containing plantar pressure data from patients with different diseases to establish the mapping between complex plantar pressure features and disease features. In physiological data classification, other types of data, such as ECG, EEG, PPG, EMG, and accelerometer data, could be used and evaluated with SFDA and related modified methods. We expect to propose a general method, based on the main idea of subject fusion and domain adaptation, that can be applied to various physiological data.

V. CONCLUSION
In this paper, we propose a novel method named SFDA for multi-source and single-target fall risk assessment. The proposed approach improves the generalization ability and efficiency of domain adaptation by alleviating source subject shifts. Subject fusion is achieved by adopting the adversarial module to mix the source subject distributions. By synchronously carrying out DA and SF, the robustness of the adversarial network is enhanced. Our proposed approach is shown to outperform the state-of-the-art methods on a continuous plantar pressure dataset, providing a method that could be adopted in real-world clinical applications.

VI. CONFLICT OF INTEREST
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

C. External Validation
To evaluate whether the good performance of SFDA is specific to the dataset [67], we added an external validation experiment on Parkinson's Disease (PD) classification based on a public PD dataset (Gait in Parkinson's Disease [52]), which is available at: https://www.physionet.org/content/gaitpdb/1.0.0. This dataset contains multichannel recordings from force sensors beneath the feet of 93 patients with Parkinson's Disease and 73 healthy controls (CO); details can be found on the above website. To apply the proposed SFDA to this dataset, we split the raw data into 16×200 samples, each containing one gait cycle from 16 sensors. As the fall risk assessment dataset was built through flat walking, we selected a treadmill walking study from the PD dataset. From this study, 29 CO and 35 PD patients were included, and we divided them into 4 MSST settings (4-fold cross validation) as before. The MSST divisions are summarized in Table IX. Due to GPU memory limitations, we adopt ResNet-34 and VggNet-16 as backbones. Also, the dimension of the fully connected layer is set to 128 to reduce computational complexity. The accuracy of SFDA based on the two backbones is shown in Table X. SFDA performed well in all MSST settings in Parkinson's disease classification, which indicates that SFDA is not specific to the fall risk assessment dataset.
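A subject-level 4-fold MSST division of the kind described above could be built as in the following Python sketch. It is only an assumption about how such a split may be organized: the actual division is the one given in Table IX, and the round-robin assignment, the function name `msst_settings`, and the subject identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


def msst_settings(
    subjects_by_class: Dict[str, List[str]], n_folds: int = 4
) -> List[Tuple[List[str], List[str]]]:
    """Return one (source_subjects, target_subjects) pair per MSST setting."""
    folds: List[List[str]] = [[] for _ in range(n_folds)]
    # Assign subjects round-robin within each class so every fold holds both classes.
    for class_name in sorted(subjects_by_class):
        for i, subject in enumerate(subjects_by_class[class_name]):
            folds[i % n_folds].append(subject)
    settings = []
    for k in range(n_folds):
        target = list(folds[k])  # held-out subjects form the single target domain
        source = [s for j, fold in enumerate(folds) if j != k for s in fold]
        settings.append((source, target))
    return settings


# Hypothetical subject identifiers: 29 controls (CO) and 35 PD patients.
subjects = {
    "CO": [f"CO{i:02d}" for i in range(1, 30)],
    "PD": [f"PD{i:02d}" for i in range(1, 36)],
}
settings = msst_settings(subjects)
```

Splitting by subject rather than by gait cycle keeps every sample of a target subject out of the source domain, which is what prevents information leakage from the target domain in the MSST protocol.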

D. Explanation of Equation (6)
In sensor-based human data classification, the changes in sensors' state during different experiment sessions and days [46] might significantly impact the distribution of raw data, leading to a deterioration in model prediction stability.
Hysteresis, response time, and cyclic stability are factors used to evaluate a sensor's performance under dynamic loading. Among them, the inconsistent sensing performance between loading and unloading is caused by hysteresis behavior. Cyclic stability is considered to ensure endurance against periodic stretching and releasing cycles. However, a significant challenge for piezoresistive materials lies in their poor stability and the presence of hysteresis effects when subjected to cyclic strain loading, which causes the collected data to change over time [73]. Also, the strength of interfacial interaction severely affects the performance of the flexible sensor in long-term measurements [74], [75]. Worse still, the conductive material easily aggregates and slides, or even falls off from the matrix, when subjected to long-term cyclic compressive strain, resulting in an unstable sensor signal [76], [77]. Hence, the characteristics of the sensors determine that the gap between subjects whose data were collected on different days is larger than the gap between subjects whose data were collected on the same day.
In particular, the piezoresistive flexible pressure sensors [8] used in the public dataset [67] in this paper exhibit similar characteristics. In [8], it is observed that the conductivity of the sensor increases slowly with the number of compressions, which is caused by the decreasing resistance due to the increasing number of conductive paths in the conductive rubber during compression. After further increasing the number of compressions to six thousand (after a device has been used for a period of time), the resistance output of the sensor drops significantly, which is determined by the characteristics of the composite sensor.
To sum up, when comparing $\mathrm{Dist}(X^2, X^d)$ (the distance between source data and target data collected at a long time interval) and $\mathrm{Dist}(X^1, X^2)$ (the distance between source data collected in the same batch), the relationship can be formulated as equation (6) in the manuscript most of the time.

... are defined as testing data. The data space with domain labels $U = \{(X^d, 0)\}_{d=1}^{S} \cup \{(X^d, 1)\}_{d=1}^{T}$ is constructed, where $d \in D$, $d \in D$ and the samples from $S$ and $T$ are labeled $o_m^d = 0$ and $o_n^d = 1$, respectively, where $n$ indicates the number of testing samples. Based on $S$, our goal is to train a model $f: X \rightarrow Y$ that can assess fall risk correctly when facing unseen subject samples $X^d$ whose distributions are unknown. Compared with traditional fall risk assessment protocols, MSST avoids any information leakage from the target domain, which is more in line with practical application.

Fig. 1. The framework of the proposed SFDA model. The shared feature extractor $E$ captures features $E(x_m^d)$ and $E(x_n^d)$ from labeled source data $x_m^d$ with {class label: $y^d$; subject label: $u^d$; domain label: $o^d$} and unlabeled target data $x_n^d$ with {domain label: $o^d$}. First, $E(x_m^d)$ with $y^d$ is fed into the label classifier to extract task-specific features. Next, $E(x_m^d)$ with $u^d$ is fed into the subject fusion discriminator $F$ to alleviate the internal subject shift within the source domain through the Gradient Reverse Layer (GRL) in backpropagation. Then, $E(x_m^d)$ and $E(x_n^d)$, along with their domain labels, are synchronously fed into the domain discriminator $D$ to reduce the distribution shift between source and target data. To investigate the interaction between SF and DA, different weights are adopted in $L_F$ and $L_D$. According to the optimal weight, the parameters in $F$ and $D$ are updated, which results in a well-generalized model.

Fig. 2. (a) Average sensitivity of different methods in 4 MSST settings based on ResNet-50; (b) Average sensitivity of different methods in 4 MSST settings based on VggNet-16; (c) Ablation study on SF in 4 MSST settings.

Fig. 4. Panels (a) and (b): ResNet-based case studies with P1 and P2 as target domains; panels (c) and (d): VggNet-based case studies with P3 and P4 as target domains.

TABLE II
ACCURACY (%) OF DIFFERENT METHODS ON FOUR MSST SETTINGS BASED ON RESNET-50

TABLE III
ACCURACY (%) OF DIFFERENT METHODS ON FOUR MSST SETTINGS BASED ON VGGNET-16

TABLE IV
ACCURACY (%) OF DIFFERENT METHODS ON FOUR MSST SETTINGS BASED ON INCEPTION-V3

TABLE V
ACCURACY (%) OF DIFFERENT METHODS ON FOUR MSST SETTINGS BASED ON RESNET-34

TABLE VI
ACCURACY (%) OF DIFFERENT METHODS ON FOUR MSST SETTINGS BASED ON VGGNET-19

TABLE VII
SENSITIVITY (%) OF DIFFERENT METHODS ON FOUR MSST SETTINGS BASED ON RESNET-50

TABLE VIII
SENSITIVITY (%) OF DIFFERENT METHODS ON FOUR MSST SETTINGS BASED ON VGGNET-16

TABLE IX
DOMAIN DIVISION IN PD DATASET

APPENDICES
In this section, we provide external validation results as well as supplementary experimental results.

A. Performance Evaluation Based on More Backbones
We additionally evaluate the robustness of SFDA based on ResNet-34, VggNet-19 and Inception-V3. As shown in Tables IV, V, and VI, the average classification accuracy over all MSST settings shows that SFDA outperforms the other baselines regardless of the backbone architecture (except for Inception-V3, where it is only 1.34% lower than the highest accuracy achieved). In Inception-based testing, SFDA generally achieves the second-best accuracy. With the ResNet-34 and VggNet-19 backbones, SFDA achieves the best average accuracy. However, SFDA performs slightly differently than with ResNet-50 and VggNet-16, mainly due to the depth of the network. When training with a deeper architecture, SFDA based on VggNet-19 performs better in almost all MSST settings (especially in P1, where the accuracy of SFDA based on VggNet-19 increases by 6% compared to VggNet-16). When training with a shallower architecture, SFDA based on ResNet-34 performs poorly in all MSST settings. Compared to ResNet-50, the accuracy of SFDA based on ResNet-34 decreases significantly in all divisions (except P3), with a maximum drop of 7.9%. Nevertheless, these results still support the robustness of SFDA, and it is natural to obtain different results from different backbones due to factors such as network architecture and number of parameters.

B. Concrete Value of Sensitivity From Fig. 2
We supplement the concrete sensitivity values from Fig. 2 in Table VII and Table VIII for a more specific understanding.

TABLE X
ACCURACY (%) AND SENSITIVITY (%) OF SFDA ON FOUR MSST SETTINGS BASED ON TWO BACKBONES