A Layered sEMG–FMG Hybrid Sensor for Hand Motion Recognition From Forearm Muscle Activities

The activities of muscles in the forearm have been widely investigated to develop human interfaces involving hand motions, especially in the fields of prosthetic hands and teleoperation. Although surface electromyography (sEMG) is considered an effective biological signal from which hand motions can be recognized, the availability and quality of sEMG data can limit the usability and intuitiveness of human interfaces. This article introduces force myography (FMG) as a supplementary signal and proposes a layered sEMG–FMG hybrid sensor that can measure both sEMG and FMG at the same skin surface location. In addition, a layer fusion convolutional neural network (LFC) is designed to extract multiscale features from sEMG and FMG. To evaluate the effectiveness of the hybrid sEMG–FMG sensor and LFC, a 22-hand-motion classification experiment was conducted on nine able-bodied subjects. The recognition results indicated a significantly improved classification accuracy (p < 0.001) of the hybrid sEMG–FMG modality with respect to the single sEMG or FMG modality. The classification accuracies (CAs) of LFC were compared with those of conventional machine learning methods, including support vector machine, random forest classifier, XGBoost, and k-nearest neighbor. Compared with the single-modality sEMG, the CAs of the dual-modality sEMG–FMG using conventional methods and LFC were improved by 21.31% and 16.71%, respectively. These results suggest that the layered sEMG–FMG sensing approach can effectively enhance the performance of human interfaces, which offers great potential in the clinical applications of sophisticated prosthetic hands and teleoperation.

Peiji Chen, Ziye Li, Shunta Togo, Member, IEEE, Hiroshi Yokoi, Member, IEEE, and Yinlai Jiang, Member, IEEE

Index Terms: Force myography (FMG), hybrid sensor, layer fusion convolutional neural network (LFC), surface electromyography (sEMG).

I. INTRODUCTION
The human hand plays an indispensable role in daily life as a tool for interacting with the environment and communicating with others. Because the muscles that move the fingers are mostly located in the forearm, hand motions can be estimated by measuring muscle activities on the forearm. This has been widely used to develop human interfaces, especially for prosthetic hands and teleoperation.
To date, muscle activity has mainly been investigated by measuring surface electromyography (sEMG) with electrodes attached to the skin surface. Many studies have confirmed the possibility of controlling myoelectric prosthetic hands [1] and teleoperation [2] by identifying hand motions from sEMG signals. sEMG signals have long been studied, and many effective signal processing methods have been proposed in the literature. However, in daily life, the human-machine interface based on sEMG is unstable and unreliable because sEMG signals are extremely susceptible to interference from factors such as sweat and muscle fatigue. Hence, the development of methods for measuring effective control signals has become a focal issue that requires urgent attention.
Previous studies have proposed solutions to improve the acquisition quality of sEMG signals, including targeted muscle reinnervation (TMR) surgery, high-density sEMG measurement, and the introduction of supplementary signals. Kuiken et al. [3] showed that amputees treated with TMR surgery achieved encouraging results in ten different forearm motion tasks. However, this method involves some clinical inconveniences, including high invasiveness and high cost. The preliminary experimental results reported by Zhang and Zhou [4] also demonstrated the feasibility of applying pattern-recognition technology to high-density sEMG signals to recognize the hand motions of stroke survivors. However, the complicated measurement system induces both physical and mental stress in subjects, which can degrade classification performance.
In contrast, the introduction of supplementary signals has been gradually attracting attention [5] in the field of human-machine interfacing. Force myography (FMG), a structural signal of muscle activity, is generated by muscles during contraction and relaxation. It is relatively stable and immune to sweat conditions. Some studies on prosthesis control indicated that FMG was more stable than sEMG when a pattern was held [6]. Therefore, the introduction of FMG could compensate for the limitations of sEMG. Previous studies [7], [8], [9], [10], [11] have demonstrated that the sEMG-FMG dual-modal approach can effectively improve the performance of hand gesture recognition tasks.
The aim of this study is to develop a hybrid sensor that can measure sEMG and FMG signals simultaneously, without causing any extra burden to the user. Previous studies [7], [8], [9] measured sEMG and FMG separately at different locations, which required extra space to accommodate the sensors. It is desirable to develop a hybrid sensor system that can record the sEMG and FMG signals at the same location. The measurement of sEMG and FMG at the same location has been investigated in previous studies [10], [11]. Both studies acquired FMG by connecting a deformable material directly to force-sensitive sensors, so that the sensors can detect the deformation due to force related to muscle contraction. When subjected to position drift due to movement, the force-sensor-based FMG units developed in previous studies [10], [11], [12] can easily produce errors owing to position-drift-induced horizontal forces. In contrast to these studies, we design an optical FMG measurement unit that uses reflected infrared light to detect deformation due to pressure changes, which has not yet been reported in the literature. In the FMG unit structure, the phototransistor embedded in the reflectance sensor is placed vertically on top of hollow silicone. Therefore, the output of the reflectance sensor in the proposed optical FMG unit is only affected by the vertical deformation of the silicone, which makes it less susceptible to horizontal deformation owing to position drift.
Furthermore, we designed a deep layer fusion convolutional neural network (LFC) to perform an end-to-end hand motion recognition task. The layer fusion structure was first proposed by Yang and Ramanan [13] in the computer vision field. The authors of [14] transformed the raw sEMG into signal images and proposed a 2D-CNN-based multiview network to perform late fusion and early fusion of sEMG data. Considering the number of parameters and the extra preprocessing required by a 2D-CNN, we develop a 1D-CNN-based multi-input layer fusion network, which reduces preprocessing and can perform an independent convolution for every signal channel. This network structure is developed by considering the characteristics of multichannel one-dimensional physiological time-series signals, as well as the feasibility for clinical applications. To the best of our knowledge, this is the first study to introduce a layer-fusion 1D-CNN-based network to perform end-to-end classification of sEMG or FMG signals.
The rest of this article is organized as follows. Section II describes the development of the layered sensor system. Section III describes the verification experiments we conducted, and Section IV introduces the detailed structure of the LFC. Sections V and VI report and discuss the experimental results, respectively. Finally, Section VII concludes this article.

II. SEMG-FMG HYBRID SENSOR
Muscle contraction is associated with bioelectrical activity and changes in muscle geometry. Bioelectrical activity can be captured using sEMG [15]. The change in muscle geometry can be captured via FMG [16] as the pressure distribution changes. In contrast to the sEMG signal, FMG is not susceptible to electromagnetic interference, muscle fatigue, or unreliable electrode-skin impedance caused by sweat. Therefore, considering the advantages of both sEMG and FMG, cost, comfort, and the available measurement area of the user, we combine the two measurement units into a layered sensor system and design a sensor band to measure three-channel sEMG and FMG signals simultaneously at the same location on the skin surface.

A. sEMG Measurement Unit
The sEMG measurement unit consists of a circuit board [Fig. 1(a)] and two conductive electrodes. The sEMG sensor measures biopotentials through electrodes placed on the skin. The conductive silicone electrode consisted of silicone (TSG-E30, Tanac Co., Ltd., Japan) and carbon black (EC600JD, Lion Specialty Chemicals Co., Ltd., Japan). This dry electrode was hypoallergenic and sufficiently flexible for long-term daily use. The electrode size was adjustable and set to 10 mm × 20 mm × 2.5 mm [Fig. 1(c)] for the experiment. The composite electrode was divided into two layers: the base electrode and the contact electrode, with carbon-to-silicone weight ratios of 4% and 2.6%, respectively. A previous study has shown that double-layer electrodes with such concentration combinations exhibit lower skin-electrode contact impedances and higher signal-to-noise ratios [17]. A layer of conductive nonwoven fabric was placed on the base electrode to prevent damage. The sEMG circuit board was connected to the electrodes via gold-coated copper wires and was isolated using an insulating sheet. The total size of the sEMG measurement unit was 32 mm × 20 mm × 6.5 mm [Fig. 1(d)].

B. FMG Measurement Unit
An optical FMG measurement unit that uses reflected infrared light to detect pressure changes is designed, as shown in Fig. 2. The unit is composed of a base layer, an isolation layer, an elastic silicone structure between the base and isolation layers, and a reflectance sensor to measure the interlayer distances. The reflectance sensor (QTR-1A, Pololu Co., Ltd., USA) carried a single infrared light-emitting diode and phototransistor pair in a 12.7 mm × 7.62 mm × 2.54 mm module. The phototransistor was connected to a pull-up resistor, which functioned as a voltage divider, that generated an analog voltage output between 0 and

The optical FMG measurement is based on the relationship between the reflectance sensor's output and the displacement caused by muscle contractions. As the muscle contracts, the elastic silicone deforms vertically due to the pressure change, resulting in a displacement between the reflectance sensor and the isolation layer surface. This displacement leads to a corresponding change in the intensity of the reflected infrared light. The phototransistor converts the light intensity into an electrical signal that reflects the pressure caused by muscle contraction.
To investigate the relationship between the FMG signal and pressure, we used a tension/compression testing machine (IMADA SL-6002) to perform a compression test with a load range from 0 to 30 N. Fig. 3 shows the results.
The trend line fitted to the FMG amplitude as a function of pressure is a linear approximation (approximation degree 99.5%), which can be expressed as follows:

$y = -0.12x + 3.17$ (1)

where $x$ is the pressure and $y$ is the FMG amplitude.
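As an illustration of how the calibration line in (1) could be used, the following minimal sketch evaluates the fit and inverts it to estimate load from a measured voltage. It assumes that x is the load in newtons and y is the FMG output in volts, following the units in the Fig. 3 caption; the function names are hypothetical and not part of the original work.

```python
# Linear calibration taken from Eq. (1): y = -0.12 * x + 3.17
# Assumption: x is the applied load in newtons, y is the FMG output in volts.
SLOPE_V_PER_N = -0.12
OFFSET_V = 3.17


def fmg_voltage(load_n: float) -> float:
    """Predicted FMG output voltage for a given load, per Eq. (1)."""
    return SLOPE_V_PER_N * load_n + OFFSET_V


def estimated_load(voltage_v: float) -> float:
    """Invert Eq. (1) to estimate the load from a measured voltage."""
    return (voltage_v - OFFSET_V) / SLOPE_V_PER_N


if __name__ == "__main__":
    for load in (0.0, 5.0, 10.0):
        print(f"{load:4.1f} N -> {fmg_voltage(load):.2f} V")
```

Because the fit is linear, the inversion is exact only within the range where the linear approximation holds.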

C. Layered Hybrid Sensor and Sensor Band
As shown in Fig. 4(a), the sEMG and FMG measurement units were vertically combined. The height of the combined sensor was ∼16.5 mm and the weight was ∼4.66 g (without wires). This vertical combination facilitates the simultaneous detection of sEMG and FMG signals on the same skin area. Three hybrid sensors were attached to a strap to realize the three-channel sEMG-FMG sensor band [Fig. 4(b)].

III. EXPERIMENT
To investigate the effects of the FMG signals, we conducted a 22-class hand gesture recognition experiment using a three-channel sensor band. The experimental results were used to compare the classification performances under three modality settings (sEMG, FMG, and sEMG-FMG).

A. Subjects
Nine able-bodied, right-handed male subjects (age: 27.1 ± 1.9 years, height: 175.2 ± 4.2 cm, weight: 72.2 ± 10.5 kg, numbered S1-S9) participated in the experiment. None of the participants had previously undergone similar sEMG recognition experiments. All experimental procedures were conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of The University of Electro-Communications, Tokyo, Japan. Each participant received an explanation of the experimental procedure and provided their informed consent.

B. Experimental Protocol
1) Sensor Attachment: The wearing positions of the sensors are shown in Fig. 5. All subjects used their right hands to perform gestures. The forearm skin was cleaned with alcohol before each experiment. Each participant was fitted with a sensor band on the right forearm, which was tightened to a comfortable but secure level. During the experiment, we used a pinch meter (K800, Biometrics Co., Ltd., U.K.) to monitor the skin-electrode contact characteristics. The initial pressure between the band and the forearm was set to 0.2 kg.
2) Hand Gestures: To meet daily human-computer interaction needs, we focused on gestures frequently used in daily life. Bullock et al. [18] investigated the occurrence frequencies of different human grasps in several daily activities and extracted the ten most frequent grasp types, which account for ∼80% of all grasping tasks. Furthermore, inspired by a previous study [19], we included eight hand gestures and four finger gestures. Detailed descriptions of the 22 hand gestures are shown in Fig. 6.
3) Experimental Paradigm: The participants were seated with their elbows resting on the table and their right forearm perpendicular to the table. To compensate for the difference in the initial states of each participant, ten seconds of relaxed-state signals were recorded as a baseline. Each measurement period lasted for ten seconds, comprising two 3-s action periods and two 2-s relaxation periods. Each subject repeated the trial five times. The participants were allowed to rest for several minutes between trials to prevent muscle fatigue. The sEMG sampling rate was set to 1000 Hz using a 16-bit-precision professional analog-to-digital converter (AIO-160802AY-USB, Contec Co., Ltd., Japan). The hybrid sEMG-FMG signal waveform during wrist flexion is shown in Fig. 7.

IV. LAYER FUSION CONVOLUTIONAL NEURAL NETWORK

A. Multiscale Representation
The output of each layer in the CNN can be regarded as a scale representation of the input data. As the layers deepen, the model can extract higher-level features. The shallow layers of the CNN are good at extracting low-level features, and the deep layers are good at extracting high-level features. The general CNN structures used to process physiological signals only consider high-level features, which can be sufficient for some simple tasks. However, when the differences between input categories are small and the input contains noise, high-level features cannot fully represent the information in the input data [12].

B. Layer Fusion CNN
Recent studies [20], [21] have demonstrated that deep learning methods outperform conventional machine learning (ML) methods in gesture recognition tasks; however, their low generalization ability makes them rarely used in real life. Compared with picture and video data, the acquisition of physiological signals is difficult, costly, and limited by the collection equipment and environment. The formats of physiological signals are usually different, making it difficult to collect enough data to train a model with a strong generalization ability. Most deep learning models have been proposed in the fields of computer vision and natural language processing, and are optimized to process image data and text data. When processing time-series signals with noise, some models cannot converge easily [22]. To improve the generalization ability of the CNN model, we took advantage of the multiscale input data and adopted LFC (Fig. 8) to process the sEMG and FMG signals.
The entire gesture-recognition network can be represented by the following decision function:

$y_{\mathrm{pred}} = F_w(x)$ (2)

where $x$ represents the data input to the deep learning model, $F_w(x)$ is the decision function, $w$ is the parameter to be learned for the entire network, and $y_{\mathrm{pred}}$ represents the output gesture category. The network consists of feature extraction and classification layers. The feature extraction layers contain multiple convolution blocks and a layer fusion structure. Each convolution block primarily consists of two 1-D convolution layers with strides 1 and 2, respectively. The convolution block can be defined as follows:

$\delta_{n+1} = H_{\lambda_n}(O_n)$ (3)

where $O_n$ is the output result of the $n$th convolution block, $\delta_{n+1}$ is the input of the subsequent convolution block, $H_{\lambda_n}(O_n)$ is the calculation performed in the $n$th convolution block, and $\lambda_n$ are the parameters to be learned.
The fusion layer is defined as follows:

$F = \mathrm{Concat}(f_1, f_2, \ldots, f_n)$ (4)

where $f_n$ denotes the results of the $n$th convolution block, $F$ denotes the fusion results of all layers, and $\mathrm{Concat}(\cdot)$ denotes the concatenation operation. This structure offers two advantages. First, during forward propagation, information from each layer is directly input to the decision layer, which makes the data propagation more efficient than in structures that use only the final layer's results. Second, during backpropagation, gradients are directly propagated to each layer through the layer fusion structure, which enables the parameters of each layer to be updated effectively. Both advantages contribute to the improvement of the generalization ability [13], [23].
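The layer fusion idea described above can be sketched in PyTorch as follows. This is a simplified, single-input illustration rather than the authors' implementation: the number of blocks, the channel widths, the kernel size, the use of batch normalization, and the global average pooling applied to each block output before concatenation are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Two 1-D convolutions with strides 1 and 2, as described in the text."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size, stride=1, padding=pad),
            nn.BatchNorm1d(out_ch),
            nn.ReLU(),
            nn.Conv1d(out_ch, out_ch, kernel_size, stride=2, padding=pad),
            nn.BatchNorm1d(out_ch),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)


class LayerFusionCNN(nn.Module):
    """Concatenates pooled features from every block before classification."""

    def __init__(self, in_ch=6, n_classes=22, widths=(16, 32, 64, 128)):
        super().__init__()
        chans = (in_ch,) + tuple(widths)
        self.blocks = nn.ModuleList(
            [ConvBlock(chans[i], chans[i + 1]) for i in range(len(widths))]
        )
        # Assumption: each scale is summarized by global average pooling.
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.classifier = nn.Linear(sum(widths), n_classes)

    def forward(self, x):
        feats = []
        for block in self.blocks:
            x = block(x)                           # output of the n-th block
            feats.append(self.pool(x).flatten(1))  # f_n, one vector per scale
        fused = torch.cat(feats, dim=1)            # F = Concat(f_1, ..., f_n)
        return self.classifier(fused)


if __name__ == "__main__":
    model = LayerFusionCNN(in_ch=6, n_classes=22).eval()
    with torch.no_grad():
        logits = model(torch.randn(2, 6, 1000))  # batch of 6 x 1000 windows
    print(logits.shape)
```

Because every block contributes directly to the classifier input, each block also receives a gradient path that does not pass through the deeper blocks, which is the second advantage noted in the text.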

C. Model Complexity
A good trade-off between improved performance and increased model complexity is required to make a model practically useful. Our designed LFC (sEMG/FMG) and LFC (sEMG-FMG) require almost the same number of FLOPs in a single forward pass for a 3×1000 and a 6×1000 input signal, respectively. In terms of parameters, the sEMG-FMG LFC model has only 0.3% more parameters than the single-modality LFC. The training and testing times are discussed in Section VI. The code defining the detailed structure is available in our GitHub repository.

D. Comparison With Conventional ML
Several conventional ML methods (support vector machine (SVM), random forest classifier, K-nearest neighbors (KNN), and XGBoost) were used as baselines for comparison. To preprocess the data for these methods, the sEMG and FMG signals were divided into a series of nonoverlapping 500-ms analysis windows, and features were extracted from each window. Five sEMG time-domain (TD) features were extracted, including the root mean square, mean absolute value, waveform length, zero crossings, and slope sign changes, all of which have been confirmed to be effective in previous studies [15], [24]. Many other effective sEMG features [25] have been proven to be capable of representing complex self-regulatory systems; however, they are not used in this study, and only the five common sEMG features were used. In addition, 12 FMG TD features were extracted, including the maximum (Max), minimum (Min), average value (Avg), standard deviation (SD), and degree of deformation in various states (1, 5, 10, 25, 50, 75, 90, and 99%) [26].
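The feature extraction for the conventional ML baselines can be sketched as follows, using only the Python standard library. The sketch assumes a 1000 Hz sampling rate (as stated for sEMG), a zero amplitude threshold for the zero-crossing and slope-sign-change counts, and that the "degree of deformation in various states" corresponds to percentiles of the FMG window; these choices and all function names are illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def windows(signal, fs_hz=1000, win_ms=500):
    """Split a 1-D sequence into non-overlapping windows of win_ms."""
    n = fs_hz * win_ms // 1000
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def semg_td_features(w, threshold=0.0):
    """RMS, MAV, WL, ZC, and SSC for one window of sEMG samples."""
    rms = math.sqrt(sum(v * v for v in w) / len(w))
    mav = sum(abs(v) for v in w) / len(w)
    diffs = [b - a for a, b in zip(w, w[1:])]
    wl = sum(abs(d) for d in diffs)
    # Zero crossing: consecutive samples with opposite signs.
    zc = sum(1 for a, b in zip(w, w[1:])
             if a * b < 0 and abs(a - b) > threshold)
    # Slope sign change: consecutive differences with opposite signs.
    ssc = sum(1 for d1, d2 in zip(diffs, diffs[1:])
              if d1 * d2 < 0 and (abs(d1) > threshold or abs(d2) > threshold))
    return {"rms": rms, "mav": mav, "wl": wl, "zc": zc, "ssc": ssc}


def percentile(sorted_vals, p):
    """Linearly interpolated p-th percentile (0-100) of pre-sorted values."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def fmg_td_features(w):
    """Max, Min, Avg, SD, and eight percentiles (12 features in total)."""
    s = sorted(w)
    feats = {"max": s[-1], "min": s[0], "avg": mean(w), "sd": pstdev(w)}
    for p in (1, 5, 10, 25, 50, 75, 90, 99):
        feats[f"p{p}"] = percentile(s, p)
    return feats
```

With three channels per modality, concatenating the per-channel dictionaries would give 15 sEMG features and 36 FMG features per window under these assumptions.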
Because LFC does not rely on prior knowledge, the raw signals are fed directly to the model after being extracted by a fixed-length window and normalized. The dataset was divided into training and testing sets of 80% and 20%, respectively. A five-fold cross-validation was conducted. We implemented our model based on the PyTorch framework. The proposed model was trained with an Adam optimizer with a learning rate of 0.001. The learning rate was halved every 20% of the total epochs.
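The learning-rate schedule described above can be written as a small function. The total number of epochs is not stated here, so `total_epochs` is a hypothetical parameter of this sketch.

```python
def learning_rate(epoch: int, total_epochs: int, base_lr: float = 1e-3) -> float:
    """Learning rate halved after every 20% of the total epochs.

    epoch is zero-based; total_epochs is a hypothetical setting.
    """
    step = max(1, total_epochs // 5)  # number of epochs in each 20% segment
    return base_lr * 0.5 ** (epoch // step)


if __name__ == "__main__":
    for e in (0, 19, 20, 40, 99):
        print(e, learning_rate(e, total_epochs=100))
```

In PyTorch, the same behavior is typically obtained with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=total_epochs // 5, gamma=0.5)`, stepping the scheduler once per epoch.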

V. RESULTS

A. Classification Results of Individual Subject
A comparison of the classification accuracies (CAs) of the nine subjects is presented in Table I. The highest CAs among the ML methods are shown in red, and the highest CAs among all methods are shown in blue. Compared with the sEMG modality, the CAs of the hybrid sEMG-FMG modality showed a significant improvement in every subject for both conventional ML and LFC (t-test, RFC: p = 2.8E-05 < 0.01, XGBoost: p = 1.2E-05 < 0.01, KNN: p = 3.5E-05 < 0.01, SVM: p = 7.2E-08 < 0.01, LFC: p = 8.9E-06 < 0.01). Specifically, SVM was the best-performing conventional ML method, whose average CAs of the sEMG and hybrid sEMG-FMG modality settings across all subjects were 65.34% ± 4.75% and 86.4% ± 4.83%, respectively. The corresponding average CAs of LFC were 78.41% ± 5.39% and 93.06% ± 3.26%. Notably, no significant difference (two-way analysis of variance (ANOVA), SVM: p = 0.93 > 0.05, LFC: p = 0.13 > 0.05) was observed in the CAs of the nine subjects with respect to the classifier, which suggests that the proposed sensor does not cause significant differences between individuals and offers high generalization ability. Meanwhile, the difference between the three modality settings was significant (two-way ANOVA, SVM: p = 1.17E-08 < 0.001, LFC: p = 4.58E-07 < 0.001).

B. Classification Results of All Subjects
To verify the generalization ability of the methods, we used different subjects for training and testing (e.g., S1-S7 for training, and S8 and S9 for testing). Considering the intra- and interindividual domain differences, the following baseline subtraction and normalization were performed before training.
1) Baseline subtraction: For each subject, the baseline of the sEMG and FMG signals for each channel was calculated by averaging the values during a period of relaxation without muscle contraction. These baseline values were then subtracted from the corresponding channel signals to focus on the relative signal changes resulting from muscle contractions.
2) Normalization: The mean and standard deviation of the sEMG and FMG signals for each channel were calculated on the training datasets (i.e., S1-S7). During the training and testing phases, the sEMG and FMG data for each channel were standardized using the mean and standard deviation calculated from the training datasets.
Results are shown in Fig. 9, where the hybrid sEMG-FMG outperformed the sEMG and FMG. As with the results for individual subjects, SVM performed the best among the conventional ML methods. The CAs of the hybrid sEMG-FMG, sEMG, and FMG using SVM were 82.32% ± 0.75%, 62.01% ± 1.14%, and 47.72% ± 4.25%, and the F1 scores were 0.82 ± 0.05, 0.61 ± 0.02, and 0.44 ± 0.14, respectively. Meanwhile, the CAs of the hybrid sEMG-FMG, sEMG, and FMG using the proposed LFC were 90.81% ± 0.8%, 74.1% ± 0.95%, and 65.13% ± 0.86%, and the F1 scores were 0.91 ± 0.02, 0.73 ± 0.03, and 0.64 ± 0.08, respectively. Compared with the sEMG modality, the CAs of the hybrid sEMG-FMG modality improved by 21.31% with SVM and 16.71% with LFC. There was no significant difference (p = 0.27 > 0.05) between the random training method and the group training method, implying that the trained LFC has a strong generalization ability and the potential to be used for new subjects.
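The baseline subtraction and normalization steps described above can be sketched as follows. The data layout (a trial is a list of channels, each a list of samples), the pooling of all training samples per channel, and the function names are assumptions made for this example.

```python
from statistics import mean, pstdev


def subtract_baseline(trial, rest):
    """Remove each channel's resting mean from that channel of a trial.

    trial, rest: lists of channels, each channel a list of samples.
    """
    out = []
    for channel, rest_channel in zip(trial, rest):
        offset = mean(rest_channel)
        out.append([v - offset for v in channel])
    return out


def fit_channel_stats(train_trials):
    """Per-channel mean and SD pooled over all training trials."""
    n_channels = len(train_trials[0])
    stats = []
    for c in range(n_channels):
        pooled = [v for trial in train_trials for v in trial[c]]
        stats.append((mean(pooled), pstdev(pooled)))
    return stats


def standardize(trial, stats, eps=1e-8):
    """Z-score each channel with statistics from the training set only."""
    return [[(v - mu) / (sd + eps) for v in channel]
            for channel, (mu, sd) in zip(trial, stats)]
```

Fitting the statistics on the training subjects only, and reusing them unchanged for the held-out subjects, keeps information from the test subjects out of the preprocessing.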
The confusion matrix for the LFC is illustrated in Fig. 10. The average recall was 90.93% ± 5.3%, and the average precision of the 22 hand motions was 90.74% ± 4.54%, which suggests that the LFC achieved a strong classification performance for each gesture. Combining precision and recall, the classification performance for wrist flexion was the highest (F1 score = 0.97), whereas that for pen pinches was the lowest (F1 score = 0.82). The ten grasp gestures (from tripod to lateral pinch) suffered more misclassifications than the other gestures, as shown by the blue dotted region in Fig. 10. The correlation between these gestures is relatively high because they require a strong force to hold the object, which may lead to similar FMG.
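For reference, per-gesture precision, recall, and F1 score can be derived from a confusion matrix as in the sketch below. It assumes that rows index the true class and columns the predicted class, which matches the layout described in the Fig. 10 caption (recall in the rightmost column, precision in the bottom row); the small matrix in the usage example is made up for illustration.

```python
def per_class_metrics(cm):
    """Precision, recall, and F1 per class from a square confusion matrix.

    cm[i][j] counts samples of true class i predicted as class j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[i][k] for i in range(n))  # column sum
        actual = sum(cm[k])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


if __name__ == "__main__":
    toy = [[8, 2], [1, 9]]  # made-up two-class example
    for k, (p, r, f1) in enumerate(per_class_metrics(toy)):
        print(f"class {k}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```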

VI. DISCUSSION
The advantages of combining sEMG and FMG signals at the same location on the skin surface are reflected not only in the significant improvement in CAs, but also in the classification stability. The SD of the average LFC-obtained dual-modality CAs for all subjects was smaller than that of either single signal (hybrid: 3.26%; sEMG: 5.39%; FMG: 6.22%). Because the CAs of the LFC exceeded those of conventional ML methods, we further discuss the performance and characteristics of the dual-modal approach based on the LFC.
The loss and classification accuracy curves of the three modality settings using LFC are shown in Fig. 11. The hybrid sEMG-FMG modality achieved the highest performance in terms of loss and CA. In the hybrid sEMG-FMG modality, the loss converged to the lowest value of the three modality settings (training: 0.22 ± 0.01; test: 0.4 ± 0.01), with a generalization gap of 0.18. Furthermore, it achieved a faster convergence speed during the first 100 epochs for both the training and testing sets. The performance of FMG was the poorest, and its final loss converged to 0.98 ± 0.01 (training set) and 1.27 ± 0.02 (test set), with a generalization gap of 0.29. Compared with sEMG, FMG had a smaller generalization gap between the final convergence stabilities for the training and testing sets, which indicates that, although sEMG achieved a higher CA, FMG might offer superior robustness in the dual-modal approach. Thus, the dual-modal approach can further improve hand gesture classification performance by combining sEMG and FMG.
To elucidate the effect of layer fusion, we trained multiple classifiers using multiscale representations of the input signals in the conventional CNN. The results of the different scales of the CNN are shown in Fig. 12.
A comparison of our research with previous studies that measured sEMG and FMG at the same location is shown in Table II. Jiang et al. [10], who used an armband consisting of eight colocated EMG-FMG sensing units, achieved a CA of 91.6% on a ten-hand gesture recognition task. Ke et al. [11] presented a modular EMG-FMG sensor unit and placed four units on an elastic belt to obtain a CA of up to 99.42% on a six-hand gesture recognition task. In our research, we developed a layered sEMG-FMG hybrid sensor and a layer-fusion convolutional neural network. An equivalent level of CA (93.06%) was obtained with fewer sensors for a more challenging hand gesture recognition task. The experiment and comparison indicate the effectiveness of both the hybrid sensor and the layer-fusion convolutional neural network.
The computation time of the proposed LFC (sEMG-FMG) was investigated on a GPU (GeForce RTX 3090, NVIDIA, USA). Table III displays the recognition accuracy and computation time using different window lengths on different subjects. In the forward propagation stage, the time for a short window sample was significantly reduced compared with that for a long window sample. However, due to the loss of some low-frequency information, the recognition accuracy was also reduced. The longest time required for the training stage using datasets of different window lengths was about 20 min for one subject.

VII. CONCLUSION
In this article, we present a layered hybrid sEMG-FMG sensor that can measure sEMG and FMG at the same location on the skin surface. The unique layered structure of the sensor enables the measurement of an additional signal without requiring additional space. An LFC was designed to process sEMG and FMG signals. The results of a 22-hand motion classification experiment on nine subjects showed that the hybrid sEMG-FMG approach can achieve high performance with good generalization ability. The introduction of FMG improved the robustness of the classification performance. The hybrid sensor and the LFC have promising potential applications in fields such as teleoperation and human-robot interfaces.
The limitations of this study include the small sample size and potential variations in muscle activation depending on demographics. Future studies will focus on increasing the sample size and involving a wider range of participant demographics. Additionally, the layered sensor system will be redesigned to reduce its thickness by shortening the effective range of the reflectance sensor, and deep learning models better suited for processing multimodal physiological signals will be developed.
The circuit board of the sEMG measurement unit [Fig. 1(a)], which is paired with two conductive electrodes, includes amplifier and filter function modules, as shown in Fig. 1(b). Differential input signals are preamplified by an instrumentation amplifier, AD620 (Analog Devices, USA), with a gain of 40 in the first-stage amplification. The power-line interference is removed with a notch filter (50 Hz), and a bandpass filter (bandwidth 1-1000 Hz) is used to limit the frequency range. Finally, the processed sEMG signals are converted into digital signals by the second amplifier, LM324 (Texas Instruments, USA), with a gain of 125 in the second-stage amplification. The output voltage ranges from 0 to 5 V.
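As a quick check on the amplifier chain, the two stage gains quoted above can be combined as shown below. This worked calculation assumes ideal cascaded stages and ignores any gain or attenuation contributed by the notch and bandpass filters as well as any dc offset; the symbols $G_1$, $G_2$, and $G_{\text{total}}$ are introduced here for illustration.

```latex
G_{\text{total}} = G_1 \times G_2 = 40 \times 125 = 5000,
\qquad
20\log_{10}(5000) \approx 74\ \text{dB},
\qquad
\frac{5\ \text{V}}{5000} = 1\ \text{mV}
```

Under these assumptions, the full 5 V output span corresponds to an input span of about 1 mV at the electrodes.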

Fig. 2. Implementation of FMG measurement unit. (a) Reflectance sensor. (b) Base layer. (c) Isolation layer. (d) Elastic silicone. The base and isolation layers were made by a 3D printer using PLA material.

Fig. 3. FMG sensor calibration curve. The yellow line represents the measured data, the green line represents the approximate result, and the red dot represents the maximum pressure of 26.63 N at the maximum displacement of 3 mm, where the FMG is 0.74 V.
The sensor band [Fig. 4(b)] was used for verification. Fig. 4(c) illustrates the mechanism of simultaneous detection of sEMG and FMG. During muscle contraction, sEMG signals are measured as the amplified potential changes on both electrodes. FMG signals are measured via the deformation of the silicone caused by the pressure change between the elastic band and the skin.

Fig. 5. Sensor placement on forearm. (a) Sectional view of the forearm and sensor positions. (b) CH1 and CH2 are attached near the flexor carpi ulnaris and flexor digitorum superficialis muscles, respectively; CH3 is placed near the extensor carpi radialis longus muscle; and a reference electrode is set on the elbow. (c) Photograph of a subject wearing the sensor band.

Fig. 7. Time-domain waveform of sEMG and FMG signals during wrist flexion. The signals are from CH1 in Fig. 5.

Fig. 9. CAs (left) and F1 scores (right) of conventional ML and our proposed LFC under three modality settings.
In Fig. 12, the results of the individual CNN scales are compared with those of the LFC, which uses all scales. The heatmap shows that not all gestures obtain the best classification performance with the scale representation of the last layer (scale8), which means that the optimal scale representation differs between gestures. When all the layer results from the CNN with layer fusion are used, almost all gestures achieve the best classification performance. The traditional CNN structure can only extract the single-scale representation of the last layer from the input signals, whereas the addition of the layer-fusion structure allows the extraction of different-scale representations from the input signals.

Fig. 10. Confusion matrix of proposed LFC using sEMG-FMG signals. The rightmost column and the bottom row of the matrix represent the recall and precision of each gesture, respectively. The diagonal of the matrix represents the number of correct classifications under both conditions of precision and recall. The black numbers on the edges denote the total number of samples for each gesture, the red numbers denote the number of misclassifications, and the green numbers denote the percentage of correct classification.

Fig. 12. Heatmap of results (F1 score) for gesture recognition using different scale representations from CNN and LFC. The vertical axis represents the different scale representations (scale1 to scale8) from the results of different CNN layers. Results of all scales using LFC are shown at the bottom for comparison.

1 [Online]. Available: https://github.com/peijichen0324/a-layered-sensorunit

TABLE II COMPARISON WITH PREVIOUS STUDIES

This training time (about 20 min for one subject) is acceptable in clinical applications to train a specified model for a new subject.

TABLE III RECOGNITION ACCURACY AND COMPUTATION TIME WITH DIFFERENT WINDOW LENGTH