AI-Based Task Classification With Pressure Insoles for Occupational Safety

Pressure insoles allow for the collection of real time pressure data inside and outside a laboratory setting as they are non-intrusive and can be simply integrated into industrial environments for occupational health and safety monitoring purposes. Activity detection is important for the safety and wellbeing of workers, and the present study aims to employ pressure insoles to detect the type of industry-related task an individual is performing by using random forest, an artificial intelligence-based classification technique. Twenty subjects wore loadsol® pressure insoles and performed five specific tasks associated with a typical workflow: standing, walking, pick and place, assembly, and manual handling. For each activity, statistical and morphological features were extracted to create a training dataset. The classifier performed with an accuracy of 82%, and a re-analysis focusing on the five most influential features resulted in 83% accuracy. These accuracies are comparable to similar task classification studies but with the benefit of added explainability, which increases transparency and, thereby, trust in the classifier decisions. The combination of random forest and in-depth feature analysis (SHAP) provided insights into the importance of certain features and the impact of their value on the classification of each task. The insights obtained from these methods can aid in the design of pressure insoles that are optimized for the extraction of impactful features and the prevention of work-related musculoskeletal disorders in Industry 4.0 operators.


I. INTRODUCTION
Pressure insoles measure the plantar pressure experienced between the foot and the sole of the shoe, and they can be used to estimate total ground reaction forces and the center of pressure in ambulation [1], [2], [3]. While force plates are considered the 'gold standard' for load measurement in biomechanics, pressure insoles such as the loadsol® (Novel GmbH, Munich, Germany), F-Scan (Tekscan, Massachusetts, USA), and Moticon (Munich, Germany) allow the collection of real-time pressure data inside and outside a laboratory setting with good accuracy.
The associate editor coordinating the review of this manuscript and approving it for publication was Chan Hwang See.
Pressure insole research extends from sporting to rehabilitation contexts, with studies assessing injuries and patients' recovery [4], [5] and aiding in sports performance [6], [7], [8]. The detection of activities of daily living via plantar pressure mapping has been investigated in [9], showing the potential to monitor exercise technique and detect unhealthy posture. In clinical settings, the devices have been used to develop footwear to prevent ulcer recurrence in diabetic patients [10] and to detect freezing of gait in patients with Parkinson's disease [11].
In occupational health and safety research, pressure insoles have also been used to detect loss of balance events [12], indirectly monitor loads on the lower back during manual handling [13], and design insoles that redistribute and reduce pressure on the workers' feet [14]. Specifically in the construction industry, the devices have been used to reduce the possibility of back pain [15], quantify physical intensity [16] and workload with the assistance of computer vision [17], identify safety hazards [18], perform fall risk assessments [19] and activity recognition [20], and classify fatigue levels from gait [21].
Industry 5.0 builds on the automation and efficiency of Industry 4.0 and denotes a move towards a human-centric approach that enables humans and machines to collaborate. Among other things, the new collaboration involves minimizing the repetitive and physical workloads that can cause fatigue and, if not detected and monitored, can lead to work-related musculoskeletal disorders [22], which are a leading cause of injury and work absence [23]. Pressure insoles present a non-invasive means of collecting related worker data for Industry 5.0 without interfering in worker activities. Task detection is important for both the worker and the employer and can be used to achieve an appropriate trade-off between work targets and the worker's wellbeing. Pressure insoles can be used to detect the weight of a tool being held, the center of pressure (CoP), and the distribution of force throughout the foot while an activity is performed, all of which give an indication of the type of task and the overall workload. Additionally, task detection and the weight assessment of the manipulated object or tool are crucial elements in biomechanical models for human endurance, which are also related to fatigue prevention research. Pressure insoles can be included in the personal protective equipment of smart workers and be fully integrated within a wireless sensing framework of smart manufacturing, leading to comprehensive overview and monitoring of the safety status of the worker.
In terms of accompanying computational algorithms, machine learning (ML) has long been used in conjunction with wearable sensors to identify the state and activities of the user [24], including gait events [25], common daily activities [26], and even cooking [27]. Classification is important for the monitoring of ergonomic risks and physical activity for health and wellness purposes. De Pinho André [28] outlines three categories of activity classification using pressure insoles: gait phases and patterns, common daily activities, and specific activities. We have conducted a literature review on studies involving ML algorithms and pressure insoles in line with the three aforementioned categories, and we present in detail our search strategy and important elements extracted from each paper in the supplementary material (Supp. Mat.) of this publication. We additionally provide information on open access databases containing raw pressure insole data (Table 1). In summary (Tables 1 to 3, Supp. Mat.), it was observed that many research groups have used their own custom designed insoles [29], [30], where ML-based classification is often used for validation purposes. It was also evident that there is little consistency in the classification techniques employed, as well as in the validation and assessment methods [31], [32], [33]. A high degree of variability in terms of segmentation methods was also apparent, ranging from gait cycle [31] to step [34] to fractions of a second [28], while time windows of different lengths were often used for activity classification, sometimes with an overlapping or sliding window. The majority of activity recognition studies utilize pressure insoles with built-in IMUs [35], [36] and motion capture [37], [38], [39]. The use of additional sensors allows tasks to be distinguished easily, often with higher accuracy, for example [13]. However, in terms of real-world deployment of human activity recognition (HAR) systems, a task classification system based on force data alone has benefits in reducing cost and computing power and minimizing the impact of wearables on an individual's daily activities.
Along these lines, the present study aims to use a supervised ML classifier for the task classification of industry-related work activities, using ad-hoc features evaluated on three specific pressure sensor areas in each insole. The features extracted consider both statistical and morphological aspects of force data; force data alone has yet to be considered in HAR research with pressure insoles, as studies generally benefit from the assistance of accelerometers and/or additional sensors (see Supp. Mat., Tables 1 to 3). The explored method of classification, random forest (RF), and the feature analysis method, the SHAP (SHapley Additive exPlanations) analysis library [55], allow a thorough understanding of the impact each feature has on the overall classification. A within-class evaluation additionally provides insights into the impact that feature values have on individual classes, which cannot be achieved with typical feature analysis methods. References [56] and [57] provide detailed explanations of SHAP and its variations, including a survey of SHAP uses on health sensor data. Random forest is commonly used in human activity recognition studies, for example, [28], [40], and [41] (Supp. Mat., Table 2), but is utilized in only a few specific recognition studies, such as [12] and [42]. An explainable approach through RF and SHAP enables transparency, and therefore trust, in the decisions made by the classifier, which is essential in a health and safety context. Additionally, important features required for task identification can provide insights into how insole-based sensing systems can be optimized for real-time task classification. The sensors, the features extracted from these sensors and their analyses for enhanced explainability, and the creation of a new open-source database of pressure insole data for task classification present valuable contributions to the development of accurate, cost-effective and streamlined real-time HAR systems for smart manufacturing.
The collection of insole pressure data created in this study has been made openly accessible through the publication of the open access database on Zenodo [50].While the focus of the study is the industry sector, the task classification in the present research is relevant to all areas of occupational health and safety research since manual handling, assembly and pick and place tasks are required in almost all professions to varying extents; and as such, manual handling training is a requirement in most workplaces.
The contributions of this work include the use of explainable ML methods for the detection of five common occupational tasks; an analysis of the most significant features highlighted by SHAP, which gives rise to recommendations for the future design and optimization of pressure insoles for occupational health and safety specifically; and an open access dataset of the pressure insole data included in the study. Task recognition of common work activities, such as those presented in this study, can aid in the real-time detection and prevention of work-related musculoskeletal disorders. Different tasks are associated with different levels of fatigability, depending on factors such as load [23] and task complexity [58], and using the methods described in this paper, the operator's state can be monitored with information from pressure insoles such as task type, duration, and load.

II. METHODOLOGY
A. PARTICIPANTS
Twenty subjects (10 females) were recruited by word of mouth and email advertisement to take part in the study. The study had the ethical approval from the university's ethics

B. DATA COLLECTION
All subjects wore the same type of trainers containing loadsol® pressure insoles, which sample at 100 Hz, in a choice of three sizes (EU 39, 43 and 45). Subjects with a different shoe size did not qualify for participation in the study. Each insole samples force data from three areas: heel, midfoot and forefoot. The heel and midfoot sensors each occupy 30% of the insole length, and the forefoot sensor takes up the remaining 40%.
Each participant carried out five tasks for at least one minute each, of which 55 seconds were analyzed for data balancing purposes. The first was standing still in a static posture (Task 1); the second was level walking (Task 2), where, prior to the task, subjects were advised to change the direction and speed of their ambulation. The next three tasks were industry-related: assembly (Task 3), pick and place (Task 4) and manual handling (Task 5).
Fig. 1 shows 3 seconds of raw sensor data for all tasks. All subjects changed their walking direction and speed throughout Task 2, at a self-elected time and speed. The assembly task involved piecing together 3D-printed bolts and nuts and taking them apart repeatedly until 1 minute of data was acquired; the parts were laid out in a semicircular arrangement on the workbench within arm's reach (approximately 25 cm) of the participant (Fig. 2). The pick and place task involved moving light weights (0.5 kg and 1 kg) from one corner of the table to the other in a self-selected manner and speed; more than one weight could be moved at a time, and the weight(s) could be moved diagonally, side-to-side, or straight ahead. Finally, the manual handling task required the subject to lift a box containing a 10 kg weight between the floor, a chair (height of 48 cm) and a table (75 cm), again at a rate the subjects chose; the order of lifting was not prescribed (for example, from the floor to the table and finally to the chair). Participants were instructed to carry out the tasks continuously at a self-elected rate for at least one minute and in the order they chose (for manual handling, assembly and pick and place tasks), thus enabling the classifier to also take into account the individual differences that arise during the completion of tasks.

C. FEATURE EXTRACTION
Spatio-temporal and morphological information was extracted from the raw force data (in N) provided by the commercial software accompanying the pressure insoles (loadsol-s) to create the training and testing datasets for the ML classifier. Features from force data, CoP and morphology of the peaks were obtained over different time windows (2, 3, 4, 5, 6, 8, 10, 12 and 15 seconds). A total of twenty-four features were extracted from the data, of which eleven were related to force, six to CoP and seven to peak morphology. The total number of data points per feature was calculated as (total no. of samples ÷ no. of samples per window) × no. of tasks (Table 1). For example, for 55 seconds of data acquisition at 100 Hz and an observation window of two seconds, each task yields 5500 ÷ 200 = 27.5 windows; rounding down and multiplying by the five tasks gives 27 × 5 = 135 data points for the two-second window, and a total data size of 24 features × 135 data points.
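The data-point count in the worked example can be reproduced with a few lines of Python. This is a minimal sketch: the function name `data_points` and its parameters are illustrative and are not taken from the authors' repository.

```python
def data_points(duration_s, fs_hz, window_s, n_tasks):
    """Number of feature rows obtained from non-overlapping windows."""
    total_samples = duration_s * fs_hz        # e.g. 55 s * 100 Hz = 5500
    samples_per_window = window_s * fs_hz     # e.g. 2 s * 100 Hz = 200
    windows_per_task = total_samples // samples_per_window  # round down
    return windows_per_task * n_tasks


# Window lengths (in seconds) listed in the text.
for window_s in (2, 3, 4, 5, 6, 8, 10, 12, 15):
    print(window_s, data_points(55, 100, window_s, 5))
```

For the two-second window this returns 135, matching the example in the text; longer windows yield fewer data points because fewer whole windows fit into 55 seconds.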

1) Force-related values: Force-related features (all in N) were calculated for the total force of the insole and included mean value, standard deviation, range, median value, mode, interquartile range, skewness, covariance, and kurtosis, as well as the ratio of the forefoot force to the heel force (mean and standard deviation), which was calculated using only the forefoot and heel sensors.
2) Centre of pressure values: The CoP coordinates on the ground plane were evaluated considering fixed distances between the three pressure sensors of each insole during foot flexion/extension, and a constant distance between participants' feet. The X and Y positions of the CoP in cm, with the origin (0, 0) located between the feet, were evaluated as follows:
where t is the data sample that goes from 1 to 5500 (55 seconds of acquisition at 100 Hz), and D_f is the mean distance in cm between the feet, evaluated on the basis of the body segment lengths for the average height of the cohort (as per [59]), as D_f = 171 × 0.191. S_f is the weighted mean size in mm of the shoe used by the cohort, evaluated as
where the last value is the conversion from European shoe size to cm. F_L(t), F_R(t), F_F(t), F_M(t) and F_H(t) are, respectively, the contributions in Newton of the sensors of the left insole, the right insole, the front pressure sensors of both insoles, the middle sensors of both insoles and the heel pressure sensors of both insoles. From the value of the CoP, three statistical values were evaluated for each coordinate (X and Y): mean value, standard deviation, and range (in cm).
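As an illustration of how CoP coordinates and their window statistics can be obtained from three force regions per insole, the sketch below uses a generic force-weighted centroid. It is not the authors' exact expression: the sensor-centre positions (15%, 45% and 80% of the insole length, derived from the 30/30/40 split described earlier), the insole length of 27 cm, the use of the population standard deviation, and all function names are assumptions made for this example. Only D_f = 171 × 0.191 is taken from the text.

```python
import statistics

D_F_CM = 171 * 0.191  # mean distance between the feet (cm), as given in the text


def centre_of_pressure(left, right, insole_len_cm=27.0):
    """Force-weighted centroid for one sample.

    left/right: (heel, midfoot, forefoot) forces in N for each insole.
    Returns (x, y) in cm, or None when no load is measured.
    """
    f_left, f_right = sum(left), sum(right)
    total = f_left + f_right
    if total <= 0:
        return None
    # Medio-lateral position: feet assumed at -D_f/2 (left) and +D_f/2 (right).
    x = (D_F_CM / 2) * (f_right - f_left) / total
    # Antero-posterior position: assumed region centres along the insole length.
    heel = left[0] + right[0]
    mid = left[1] + right[1]
    fore = left[2] + right[2]
    y = insole_len_cm * (0.15 * heel + 0.45 * mid + 0.80 * fore) / total
    return x, y


def cop_window_features(values):
    """Mean, standard deviation and range of one CoP coordinate in a window."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # population std, an assumption
        "range": max(values) - min(values),
    }
```

Applying `cop_window_features` separately to the X and Y series of a window gives the six CoP features (mean, standard deviation and range per coordinate) described above.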
3) Peak morphology values: The dynamic behavior of the pressure values can vary according to the task being performed. A static task might produce uniform pressure on the overall insole over time, while activities where the body weight moves from one foot to the other or across different areas of the same foot produce peaks and valleys in the data. To capture such behavior, time-domain and frequency-domain features related to the morphology of the pressure values were evaluated for each window on the total insole force value:
a) Shape factor: gives an indication of the peak profile, where N is the number of data samples per observation time window and D is the sample data.
b) Peak-to-peak value: difference between the minimum and the maximum value of the pressure within the time window, in Newton.
c) Number of peaks per time window: number of peaks that exceed 60% of the maximum value of pressure within the time window.
d) Mean distance between peaks that exceed a 60% threshold of the maximum value of force within the time window, in seconds.
e) Mean amplitude of peaks within the time window, in Newton.
f) Standard deviation of the spectral power distribution (discrete Fourier transform) within the time window.
g) Main frequency within the spectral power distribution (discrete Fourier transform): excluding the peak at 0 Hz, the main frequency was extracted for each time window.
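The time-domain items a) to e) can be sketched in plain Python as follows. Several choices here are assumptions rather than the authors' implementation: the shape factor uses the common signal-processing definition (root mean square divided by mean absolute value), peaks are detected with a simple local-maximum rule above 60% of the window maximum, and the function names are hypothetical. The frequency-domain items f) and g) are not covered.

```python
import math


def shape_factor(d):
    """RMS divided by mean absolute value (assumed definition).

    Raises ZeroDivisionError for an all-zero window.
    """
    n = len(d)
    rms = math.sqrt(sum(v * v for v in d) / n)
    mean_abs = sum(abs(v) for v in d) / n
    return rms / mean_abs


def peak_indices(d, fraction=0.6):
    """Indices of local maxima that exceed `fraction` of the window maximum."""
    threshold = fraction * max(d)
    return [
        i for i in range(1, len(d) - 1)
        if d[i] > threshold and d[i] > d[i - 1] and d[i] >= d[i + 1]
    ]


def morphology_features(d, fs_hz=100):
    """Time-domain peak morphology features for one window of total force."""
    peaks = peak_indices(d)
    gaps = [(b - a) / fs_hz for a, b in zip(peaks, peaks[1:])]  # seconds
    return {
        "shape_factor": shape_factor(d),
        "peak_to_peak": max(d) - min(d),
        "n_peaks": len(peaks),
        "mean_peak_distance_s": sum(gaps) / len(gaps) if gaps else 0.0,
        "mean_peak_amplitude": (
            sum(d[i] for i in peaks) / len(peaks) if peaks else 0.0
        ),
    }
```

A production implementation would likely use a more robust peak detector (for example with a minimum peak spacing), since the simple rule here can count shoulders on a rising slope as peaks.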

D. TRAINING AND ASSESSMENT OF SUPERVISED MACHINE LEARNING CLASSIFIER
The features extracted for each time window were labelled according to the task being performed and were used for the training process of the RF classifier. We evaluated Support Vector Machine, k-Nearest Neighbor and Naïve Bayes; they all offered considerably worse performance. These classifiers are omitted for brevity; however, the code for all classifiers and the associated database of extracted features can be accessed online at https://github.com/patriciao-sullivan/PID4TC_Analysis. A 10-fold cross-validation was employed to train and test the data, where each fold contains all the measurements for two subjects. This split guarantees the model is learning patterns that can generalize to unseen subjects. For each fold and time window, accuracies were computed from the confusion matrices and the classifier was assessed using SHAP to evaluate feature importance. SHAP values are assigned to each feature, representing its influence on the predicted outcome relative to the other features in the dataset, where high SHAP values indicate a high influence. The analysis also provides insights into the influence of features on each task individually by indicating if a high or low feature value influences the classification of a particular task. Cross-validation and SHAP results for the folds were aggregated for each window.
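The subject-wise split and the accuracy computation described above can be illustrated without any ML library. The sketch below is an assumption-laden example rather than the repository code: `subject_folds` and `accuracy_from_confusion` are hypothetical helper names, and subjects are simply paired in sorted order. Libraries such as scikit-learn provide a comparable grouping mechanism (`GroupKFold`), although the text does not state which tool the authors used.

```python
def subject_folds(subject_ids, subjects_per_fold=2):
    """Yield (train_idx, test_idx) so that each fold holds out whole subjects.

    subject_ids: one subject label per feature row.
    """
    subjects = sorted(set(subject_ids))
    folds = []
    for start in range(0, len(subjects), subjects_per_fold):
        held_out = set(subjects[start:start + subjects_per_fold])
        test_idx = [i for i, s in enumerate(subject_ids) if s in held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s not in held_out]
        folds.append((train_idx, test_idx))
    return folds


def accuracy_from_confusion(cm):
    """Overall accuracy: correctly classified rows (diagonal) over all rows."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# 20 subjects with three feature rows each -> 10 folds of two subjects.
ids = [s for s in range(20) for _ in range(3)]
folds = subject_folds(ids)
print(len(folds), len(folds[0][1]))
```

Because no subject appears in both the training and test indices of a fold, the reported accuracy reflects performance on unseen subjects.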
The highest performing features, based on the SHAP analysis, were rerun through the classifier in the same way as described for the first iteration of the analysis to understand how using fewer features affects accuracy.
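One common way to rank features from SHAP output is by the mean absolute SHAP value per feature. The sketch below assumes the SHAP values are already available as a samples × features table for one class (or aggregated over classes); the function name `top_k_features` and the toy numbers are illustrative only and do not come from the study.

```python
def top_k_features(shap_values, feature_names, k=5):
    """Rank features by mean absolute SHAP value and keep the top k.

    shap_values: list of rows, one row of per-feature SHAP values per sample.
    """
    n_samples = len(shap_values)
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(len(feature_names))
    ]
    ranked = sorted(zip(feature_names, mean_abs),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy values for three hypothetical features.
toy_shap = [[0.2, -0.1, 0.0], [-0.4, 0.1, 0.05]]
print(top_k_features(toy_shap, ["cop_y_range", "force_std", "n_peaks"], k=2))
```

The selected feature names can then be used to subset the feature table before repeating the cross-validation.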

A. PRESSURE INSOLES DATA FOR TASK CLASSIFICATION (PID4TC)
The pressure insoles data used in this work are available at doi: 10.5281/zenodo.7755802 under a CC BY 4.0 license [50]. The dataset is composed of .csv files and is accompanied by relevant metadata (.txt files) for database description and navigation. PID4TC is organized by subjects, where each .csv file contains 55 seconds of acquisition at 100 Hz, organized by task.

B. INDUSTRY TASK CLASSIFICATION
The accuracy of the classifier ranged from 80% to 86%, with a gradual improvement as the observation window increased (Fig. 3). The highest accuracy (≈ 86%) was observed for windows larger than 10 seconds (1,000 frames), while a local peak of 82% was reached at 5 seconds before the value briefly declined. Task prediction that takes place within 5 seconds can be considered 'real-time' and offered good accuracy, thus prompting further analyses. The confusion matrix of the 5 second window iteration is presented in Fig. 4, with the highest accuracies for manual handling and walking (dark shaded values), closely followed by assembly and standing tasks. Pick and place was incorrectly predicted as either assembly or standing approximately 45% of the time. Overall, the features related to the CoP (impact SHAP values: range in y direction ≈ 0.25, range in x direction ≈ 0.21, mean in x direction ≈ 0.18, std in x direction ≈ 0.14, std in y direction ≈ 0.12), spectral entropy (≈ 0.10) and morphology of the peaks (distance between peaks in seconds ≈ 0.08) played a major role in the classification of all tasks. Furthermore, statistical features extracted from force data, such as covariance (≈ 0.09), standard deviation (≈ 0.09), and the ratio of the front of the foot to the heel (≈ 0.08), were also prominent. The SHAP analysis also gives a within-task breakdown of feature importance by calculating a score for all the input features for a given model, resulting in a range of SHAP values for the twenty highest scoring features in a class (colored regions for each feature, Fig. 5). Manual handling, walking and assembly tasks were misclassified the least (Fig. 4) due to the strong influence of the feature values on the classification of these tasks; hence, in Fig. 5, these tasks carry greater SHAP values (e.g., purple-shaded zone for the manual handling task in the CoP YRange feature).
Fig. 6-8 also provide insights into feature importance within the tasks with the overall highest SHAP scores: manual handling, walking and assembly. A clear contrast in feature values is present in the highest impacting features (high SHAP values), which is evident from the high color contrasts assigned to the ranges of SHAP values. This means that the large differences in the magnitude of feature values, especially in CoP, aided in the identification of tasks. High values of the CoP range resulted in high SHAP values (≈ 0.2 to 0.3) for manual handling (Fig. 6), while low CoP ranges were found to be impactful (up to ≈ 0.1) on the classification of walking (Fig. 7) and assembly (Fig. 8). High values of the CoP standard deviations and mean in the x direction (up to ≈ 0.20 for walking and up to ≈ 0.10 for assembly) helped their respective categorizations. In addition, we observed a decreased classification accuracy in the confusion matrix for tasks where the CoP is more confined (Fig. 4 and 9), for example, pick and place and standing. The natural shift of the weight along the sagittal and coronal axes (changes in CoP) varies depending on the task, to the greatest extent during manual handling and walking and to a lesser extent when manipulating objects while both feet are stationary; this results in a broader range of feature values with higher associated SHAP values, as it is easier to distinguish features that vary depending on the task.
The analysis was repeated using only the top five features from Fig. 5, all of which are CoP-related features: range and standard deviation in both directions and mean in the x direction. The second RF classification returned marginally better accuracies across almost all windows, including the 5 second window, which increased by 1% to 83% (Fig. 10). Fig. 11 focuses on the SHAP analysis for the 5 second window, with the highest SHAP values still associated with manual handling, walking and assembly. Interestingly, the mean CoP in the x direction became the most influential factor in the reanalysis, rising from third place (Fig. 5). Overall, higher SHAP values are present in the new analysis (Fig. 11) compared to the analysis containing all features (Fig. 5): the mean CoP in the x direction attained a mean SHAP value of approximately 0.45, and all other features achieved a score higher than ≈ 0.25, which was the highest SHAP value in the initial analysis. In addition, the confusion matrix in Fig. 12 shows fewer misclassifications for the updated analysis. The exclusion of lower impact features caused a small reduction in misclassifications, with no increased risk of overfitting.

IV. DISCUSSION
This study investigated the use of pressure insoles for human task classification in the context of occupational health and safety. The pressure insole data were collected from 20 subjects while performing five industry-related tasks. From the raw data, 24 features were evaluated and labelled to form the training dataset for the RF classifier. The best classification accuracy was 86%, obtained with more than 10 seconds of data observation. While the purpose of the study was to present a highly explainable classifier, and not necessarily to achieve the highest possible accuracy, the methods show that an accuracy of 82% was reached after 4 to 5 seconds, which is in alignment with the accuracy levels achieved in the classification of other occupation-specific tasks [60], [61]. The 4 to 5 second time window is aligned with studies concerning activity recognition classification (Table 3, Supp. Mat., e.g., [21], [62]) and indicates potential for real-time classification in edge-AI capable devices. In a real work environment, 5 seconds is a realistic time frame to collect sufficient samples for a classification that is robust to momentary changes in activity, for example, standing still for an instant before picking up an object and assembling. To further improve real-time predictions in working environments, a sliding or overlapped time window can be implemented in future applications, as was done in [62] and [63], to take the previous activity prediction into account in combination with new data.
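The sliding or overlapped window mentioned as future work can be sketched as below. This is an illustrative example only: the function name `sliding_windows` and the 50% overlap are assumptions, and the study itself used non-overlapping windows.

```python
def sliding_windows(n_samples, window, step):
    """Return (start, end) sample indices of windows advanced by `step`.

    step == window gives non-overlapping windows; step < window overlaps them.
    """
    return [
        (start, start + window)
        for start in range(0, n_samples - window + 1, step)
    ]


# 55 s at 100 Hz, 5 s windows.
non_overlapping = sliding_windows(5500, 500, 500)
half_overlap = sliding_windows(5500, 500, 250)   # 50% overlap (assumed)
print(len(non_overlapping), len(half_overlap))
```

With a 50% overlap, a new prediction becomes available every 2.5 seconds while each decision still draws on 5 seconds of data, at the cost of correlated consecutive windows.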
Occupation-specific studies generally feature a maximum of 10 subjects (Supp. Mat., Table 3), making the number of subjects in the present study higher than in other occupation-specific studies; however, it is on the middle to lower side when compared to all ML research involving pressure insoles (Supp. Mat., Tables 1-2) and therefore remains a limitation of the study. Additionally, the subjects may not be representative of the population as they do not have industry work experience. Other limitations include the completion of tasks in a laboratory as opposed to an industry setting, which typically features more noise, and the inclusion of only a subset of possible industry tasks. Tasks had low complexity and were performed in an isolated manner, and no transitions were considered. Only one type of pressure insole with a three-sensor configuration was utilized in the study.
The release of the raw data, database of extracted features, and classifier code encourages transparency in the analysis methods, and the use of 10-fold cross-validation and accuracy as the main measure of prediction success enables pressure insole and ML studies to be compared, since these are the most frequently occurring approaches in recent years (Supp. Mat.).
ML algorithms, such as the one employed in this study, use the labelled data to establish a function that can map the features' correlation to a specific label. Some ML techniques, for instance RF, allow a more direct evaluation of the importance of each feature in the classification process. Such characteristics, related to interpretability [64], are particularly relevant to industrial applications. In fact, for safety and regulatory reasons, it is preferable to implement AI-based techniques that take decisions based on models that can be fully understood in human terms, while black-box decision-making behavior can lead to a lack of trust. While deep learning (DL) models showed high classification accuracies in Supp. Mat., and a subset of DL models provide a certain degree of explainability, it was decided not to proceed with a DL approach for a number of reasons. Firstly, this study concerns a relatively small dataset, on which a DL model is more likely to overfit; secondly, DL models would be more computationally demanding, which limits deployability on cheap embedded devices. In addition, DL model structure is strongly dependent on the domain and the dataset, while our approach using standard ML techniques can be easily and quickly adapted to a different dataset and problem. Finally, while SHAP can be applied to any model, the fact that the Shapley values are computed using a TreeExplainer makes it more closely related to an RF classifier, making the provided explanation more reliable in this study. DL models will be explored in future iterations of this research. RF was employed in this study since it is a non-linear ML technique that provides a good balance between classification performance and interpretability. Due to the limited dataset, we decided not to proceed with hyperparameter tuning, as tuning against the cross-validation results would likely cause overfitting; instead, we kept the default parameters. A comparison with other models, where the parameters have also been kept as default, is available in the accompanying GitHub repository. In general, RF performed better, which can be verified from the analysis script stored in the repository.
SHAP analysis is advantageous over other feature performance analyses because it provides a within-class feature importance breakdown and the impact feature values have on the class prediction. This information is valuable from the perspective of studying misclassifications and improving classifier performance, as we gain an understanding of which features are causing misclassifications, and the most impactful features can be selected for modelling to reduce incorrect predictions and increase accuracy, as was the case in the present study. From the SHAP analysis, CoP features were identified as having the highest impact (Fig. 5-8), and included the ranges, standard deviations and the mean in the x direction. These features produced the highest accuracy (83%) and SHAP values (between ≈ 0.25 and 0.45) for the 5 second window when classified alone (Fig. 10-12). This indicates that the number and disposition of the pressure sensors can be optimized for CoP, encouraging enhanced task classification in comparable set-ups. For example, a two-quadrant (front-back) or four-quadrant sensing disposition (front-left, front-right, heel-left, heel-right) could offer more precise information in this regard. This will aid in keeping the number of sensors to a minimum and in optimizing wearability, data throughput and energy consumption, all of which are crucial factors in edge-AI devices, the internet of things and smart manufacturing environments as a whole. Application-focused insoles may also aid in reducing the number of task misclassifications, which, with the current pressure sensor number and configuration of the insoles, may affect productivity and pose health and safety risks. For instance, a pick and place task that is erroneously detected as assembly or rest may lead to a collaborative robot operating in the occupied workspace, or there may be implications for task rotations, work scheduling and the active work time registered.
VOLUME 12, 2024
To minimize computational requirements and power consumption, only force data from the pressure insoles were analyzed, and no additional sensors were utilized in the classification. This represents a challenge compared with similar works, where the classification mostly involved diverse tasks with dynamic characteristics that are more easily distinguished from one another, and in some cases, a variety of sensors were utilized [28], [32], [61], [65]. In this sense, the features selected for the classification played a crucial role. The general approach was to focus on describing the morphology of the pressure signal, especially the peaks' frequency, intensity, and distribution. The weak relevance of these features in the classification (peak to peak is only the tenth most significant feature) might indicate that different subjects have different strategies for performing the same task. Nonetheless, the spectral entropy and especially the behavior of the CoP (first 6 most significant features for classification, Fig. 5) demonstrated a good task dependency and were sufficiently agnostic towards the subjects. CoP was extracted from force data in only two occupation-specific studies involving ML and pressure insoles [13], [21] (Supp. Mat., Table 3), one of which conducted a feature importance analysis, where CoP-related features were among the ten most important features in the assessment of lower back loading during manual tasks [13].
The confusion matrices (Fig. 4 and 12) and the SHAP values in all SHAP analysis figures have highlighted that tasks with limited CoP variations throughout their execution, as shown in Fig. 9, are more likely to be misclassified. This calls for further investigation in terms of feature selection, device design and AI classification strategy. For example, the images generated from CoP in Fig. 9 may be utilized as inputs to the classifier instead of the averaged values across time windows. Exploring the behavior of the pressure values in the frequency domain has generated useful information to distinguish manual handling from the remaining tasks. Similarly, other time-frequency representations, such as the wavelet transform, may offer different indicators for task classification.
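One possible way to turn a CoP trajectory into an image-like classifier input is to rasterize it into a small occupancy grid. The sketch below is illustrative only: the grid size, the coordinate limits and the function name `cop_occupancy_image` are assumptions, and the study does not specify how such images would be constructed.

```python
def cop_occupancy_image(points, bins=8, x_lim=(-1.0, 1.0), y_lim=(-1.0, 1.0)):
    """Rasterize CoP points into a bins x bins grid of visit fractions.

    `points` is an iterable of (x, y) CoP coordinates. Points outside
    the limits are ignored. Each cell holds the fraction of retained
    points that fell inside it, so the grid sums to 1 when non-empty.
    """
    grid = [[0.0] * bins for _ in range(bins)]
    kept = 0
    x_span = x_lim[1] - x_lim[0]
    y_span = y_lim[1] - y_lim[0]
    for x, y in points:
        if not (x_lim[0] <= x <= x_lim[1] and y_lim[0] <= y <= y_lim[1]):
            continue
        # Clamp so that points on the upper limit fall in the last cell.
        col = min(int((x - x_lim[0]) / x_span * bins), bins - 1)
        row = min(int((y - y_lim[0]) / y_span * bins), bins - 1)
        grid[row][col] += 1.0
        kept += 1
    if kept:
        grid = [[cell / kept for cell in row] for row in grid]
    return grid
```

A grid of this kind could be flattened into a feature vector or passed to an image-based model, with the grid resolution chosen to balance detail against data throughput.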

V. CONCLUSION
The paper presents a transparent, non-invasive means of assessing worker tasks using wearable sensors. A lack of explainable methods was identified in the literature, and the present research exhibits accuracies comparable to those of such studies but with the benefit of increased explainability. This means that industry-specific pressure insoles with a sensor configuration optimized for the extraction of significant features, such as CoP, can be achieved. This, in turn, can reduce data throughput and energy consumption, presenting potential for real-time worker monitoring in smart manufacturing environments.

FIGURE 1. A subject's left foot raw data while carrying out all five tasks.

FIGURE 2. The set up of the components for the assembly task (left) and the assembled components (right).
committee (CREC Review Reference number: ECM 4 (p) 6/7/2021), and subjects provided written informed consent to participate. Participants were excluded if they reported any musculoskeletal disorders. The mean age of the subjects was 29 years (range 20 to 64 years), and their mean weight was 71.6 kg (range 53.5 to 102.7 kg). Participants completed a brief warm-up routine for five minutes and were given manual handling training prior to data capture.

Fig. 5 highlights the features that have the most impact on the classification of each task, quantified by SHAP values. Overall, the features related to the CoP (impact SHAP values: range in y direction ≈ 0.25, range in x direction ≈ 0.21, mean in x direction ≈ 0.18, std in x direction ≈ 0.14, std in y direction ≈ 0.12), spectral entropy (≈ 0.10) and morphology of the peaks (distance between peaks in seconds ≈ 0.08) played a major role in the classification of all tasks. Furthermore, statistical features extracted from force data, such as covariance (≈ 0.09), standard deviation (≈ 0.09), and the ratio of the front of the foot to the heel (≈ 0.08), were also prominent. The SHAP analysis also gives a within-task breakdown of feature importance by calculating a score for all the input features for a given model, resulting in a range of SHAP values for the twenty highest scoring features in a class (colored regions for each feature, Fig. 5). Manual handling, walking and assembly tasks were misclassified the least (Fig. 4) due to the strong influence of the feature values on the classification of these tasks; hence, in Fig. 5, these tasks carry greater SHAP values (e.g., purple-shaded zone for the manual handling task in the CoP YRange feature). Figs. 6-8 also provide insights into feature importance within the tasks with the overall highest SHAP scores: manual handling, walking and assembly. A clear contrast in feature values is present in the highest impacting features (high SHAP values), which is evident from the high color contrasts assigned to the ranges of SHAP values. This means that large differences in the magnitude of feature values, especially in CoP, aided in the identification of tasks. High values of the CoP range resulted in high SHAP values (≈ 0.2 to 0.3) for manual handling (Fig. 6), while low CoP ranges were found to be impactful (up to ≈ 0.1) on the classification of walking (Fig. 7) and assembly (Fig. 8). High values of CoP standard deviations and mean in the x direction (up to

FIGURE 4. Confusion matrix showing the prediction accuracy for the 5 second observation window.

FIGURE 5. Analysis of feature importance for the 5 second observation window.

FIGURE 6. SHAP analysis of feature importance for manual handling in the 5 second observation window.

FIGURE 7. SHAP analysis of feature importance for walking in the 5 second observation window.

FIGURE 8. SHAP analysis of feature importance for assembly in the 5 second observation window.

FIGURE 9. Example of the trajectory of the CoP for each of the five tasks for one subject.

FIGURE 10. Accuracy values across all windows after re-analysis with the five most impactful classification features.

FIGURE 11. Analysis of feature importance for the 5 second observation window and the highest performing features.

FIGURE 12. Confusion matrix showing the prediction accuracy for the top 5 features in the 5 second observation window.

TABLE 2. Total data size per window.