A Marked Point Process Filtering Approach for Tracking Sympathetic Arousal From Skin Conductance

Human emotion represents a complex neural process within the brain. The ability to automatically recognize emotions from physiological signals has the potential to impact humanity in multiple ways through applications in human-machine interaction, remote health monitoring, smart living environments and entertainment. We present a marked point process-based Bayesian filtering approach to track sympathetic arousal from skin conductance features. The rate at which individual skin conductance responses (SCRs) occur and their respective amplitudes encode important information regarding an individual’s psychological arousal level. We develop a state-space model relating a latent neuropsychological arousal state to the rate at which neural impulses underlying SCRs occur and the impulse amplitudes. We simultaneously estimate both arousal and the state-space model parameters within an expectation-maximization framework. We evaluate our method on both simulated and experimental data. Results on simulated data indicate the method’s ability to accurately estimate an unobserved state from marked point process observations. The experimental data include studies involving mental stress artificially-induced in laboratory environments, real-world driver stress and Pavlovian fear conditioning. Results on experimental data outperform previous Bayesian filtering approaches in terms of a lower sensitivity to small impulses and avoiding overfitting. The filter is thus able to estimate emotional arousal from skin conductance features and is a promising approach for everyday emotion recognition applications.


I. INTRODUCTION
Human emotions are an essential part of life [1], [2]. They play an integral role in self-expression, decision making, survival (e.g. fear response) and inter-personal communication. The ability for intelligent agents to automatically recognize emotions could have wide-reaching applications in human-computer interaction, smart living spaces, remote health monitoring and entertainment (e.g. multimedia content retrieval and recommendation) [3]- [7]. Consequently, much attention has been devoted to automated emotion recognition using physiological signals, facial expressions and speech. Early psychological studies in the field tended to view human emotion in terms of distinct categories. The six The associate editor coordinating the review of this manuscript and approving it for publication was Mohammad Zia Ur Rahman . universal emotions (happiness, sadness, surprise, fear, anger and disgust) belonging to the well-known model by Ekman and Friesen [8] was one such early effort in the era. Subsequent studies highlighted the continuous (or dimensional) nature of emotion recognizing it as a richer, more complex phenomenon with blended, rather than clear-cut, boundaries between different categories. The circumplex model of affect developed by Russell [9] was an important contribution in this regard. It accounted for human emotion using two different orthogonal axes-valence and arousal. Valence captured the pleasant-unpleasant nature of an emotion while arousal denoted its corresponding excitement or activation [10]. The typical 2D valence-arousal plane is occasionally supplemented with a third axis known as dominance [11]. Dominance accounts for the degree of control felt. For instance, a word such as ''confident'' or ''cruel'' may reflect greater VOLUME 8, 2020 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ emotional dominance compared to the semantics that ''terrified'' convey. Much work has been done in recognizing patterns in physiological signals, facial expressions and speech waveforms that capture emotion-related information in the field of affective computing (e.g. [12]- [20]). Emotion is a complex phenomenon within the human nervous system. Emotional arousal, for instance, is related to the extent to which the sympathetic nervous system is activated [21]. Studies investigating the neural correlates of emotion have helped enrich our understanding of the complex circuitry underlying it [22]. Of particular importance in processing emotions in the brain is the amygdala, although other regions such as the orbitofrontal cortex and anterior cingulate cortex have also been found to play a key role [23]- [25]. A comprehensive understanding of how emotion-related neural processes in the brain manifest themselves through variations in physiological signals (e.g. changes in heart rate, skin conductance, blood pressure), facial muscle movements and subtle tones of voice is currently lacking.
Here we present a state-space approach for tracking arousal from skin conductance features. State-space methods are frequently encountered in controls engineering. Not all internal state variables are directly accessible in a typical control system. However, variations in the states do give rise to measurable changes in physical phenomena. Sensors can be used to measure these physical quantities, and mathematical relationships derived thereafter between the unobserved state variables and the measurements. The internal states can then be tracked continuously based on these relationships using methods such as Kalman filtering. Here we model the neural system governing emotion and the physiological changes that co-occur in such a manner-emotion is not directly accessible, but the physiological changes it causes are. By relating the internal neural quantity to specific biomarkers, we use a Bayesian filtering approach to track it in time. Here we present a filtering approach for tracking arousal based on marked point process (MPP) skin conductance features. We first review some of the prior work in neural state estimation in the following section. We then describe our methodology. Thereafter we show our results on simulated and experimental data and provide a discussion of the results. We finally conclude with a summary of our approach and note additional directions of future research.

II. RELATED RESEARCH
State-space models have been developed to estimate neural processes underlying physiological and behavioral observations. Here, the specific neural process is treated as the latent state variable of interest, and measurements such as neural spiking, reaction times and binary correctincorrect responses are taken to be the observations through which the unknown state is estimated. These state-space methods have found applications in behavioral learning [26]- [29], sleep [30], movement decoding in brain-computer interfaces [31], [32] and in estimating latent states underlying neural spiking [33], [34]. Due to the inherently binary or point process nature of some of the observations that are related to neural processes (e.g. on-off neuron firing, correct-incorrect responses), these state estimation procedures often rely on a type of Bayesian filtering that accounts for Bernoulli-distributed random variables. In this work, we consider estimating arousal from skin conductance, which also has an inherently ''spikey'' nature to it due to the neural activity that underlies its generation (Fig. 1). This makes it amenable to the point process state-space methodology just referred to.
The human autonomic nervous system comprises of both the sympathetic and parasympathetic branches. Sympathetic nerve fibers innervate the sweat glands [35]. Consequently, a skin conductance signal, generated by tiny variations in sweat secretions, provides an index of sympathetic arousal [36], [37]. Skin conductance, therefore, has been widely used in emotion recognition research as a means of capturing arousal information [38]. A skin conductance signal comprises of a slow-varying tonic part and a fast-varying phasic part [39]. The phasic component consists of individual skin conductance responses (SCRs), each of which is typically modeled as being generated by a burst of neuroelectric stimuli to the sweat glands [40]. The rate at which SCRs occur, their amplitudes and the tonic level are three of the most common skin conductance-related measures of autonomic arousal used in the literature [41]. Our previous work in [42], [43] and [44] presented some of the first methods to estimate sympathetic arousal from skin conductance explicitly making use of the point process state-space approach. In our first approach described in [42], [43], we related an unobserved arousal state to the occurrence of neural impulses underlying SCR events. We binned the impulses to form a sequence of 0's and 1's based on whether or not they occurred at a given time instance. The method was successfully able to track arousal in subjects across periods of relaxation, and cognitive and emotional stress. Our later method in [44] sought to incorporate features from both the phasic and tonic components into the estimation scheme as well. Here, estimation was based on two additional continuous-valued features-the raw tonic values and a set of phasic-derived values that were interpolations over the log-transformed SCR peak amplitudes [44]. Fig. 1 depicts a skin conductance signal and the neural stimuli underlying it. Note that each impulse has an amplitude (the height of each impulse captures the synchronized discharge effect from multiple neurons). In the figure, the underlying bursts of sudomotor nerve activity were extracted using our deconvolution approach in [43] ( [43] presents both a deconvolution strategy and an arousal state estimation method). Our state estimation approach in [42], [43]-which we label binary observation-based state estimation (BOBSE)-ignored the amplitudes of the neural stimuli when detecting arousal. Instead, it treated all the impulses as 1's. The amplitudes of the SCRs (which are directly related to the amplitudes of the stimuli that generate them) capture sympathetic arousal information [45]. Larger amplitudes are typically reflective of higher levels of arousal [46], [47]. BOBSE therefore results in an important loss of information. Our subsequent method [44]-which we call binary and continuous observation-based state estimation (BCOBSE)-sought to improve on the earlier work by incorporating additional continuous-valued information from the tonic levels and phasic SCR amplitudes. As noted earlier, we derived two continuous-valued signals from these quantities. Unfortunately, the method could result in overfitting to one of them [44]. Hence, BCOBSE required that the noise variance estimates of the continuous-valued variables be monitored continuously during estimation. Part of the parameter updates were halted once overfitting began to occur (a form of early-stopping) [44]. Here we present a method to estimate arousal from skin conductance explicitly taking into account the binary nature of the underlying neural impulses and their amplitudes, but which avoids the problem of overfitting. Our new method performs estimation based on the MPP nature of the observations (i.e., each stimuli event is associated with an amplitude, but these amplitudes only exist at discrete time instances) and avoids continuous-valued signals altogether. It is these continuous-valued signals, which have values at each point in time, that usually cause overfitting [44].

A. DATA
The methods in [42]- [44] primarily used three different data sets. Here, we employ the same data sets and use representative cases from either one to illustrate how our current model performs in comparison to the previous methods.
We first use the Non-EEG Biosignals Data Set for Assessment and Visualization of Neurological Status (which we abbreviate as the Neurological Status Assessment Data Set) [48]. This contains skin conductance recordings from a group of college students who were exposed to several conditions meant to elicit stress. The experiment included physical, cognitive and emotional stressors. Subjects were required to stand, walk and then walk/jog during the physical stress period. This lasted for approximately 5 min. The cognitive stress period consisted of two different tasks. In the first, the subjects had to count backwards in 7's beginning at 2485. This lasted for approximately 3 min. The Stroop test was the second cognitive stressor. This is a color-word association task where the names of certain colors are displayed on a screen in font colors that do not necessarily match their meaning. The need to perform correct identification requires concentration on the part of the subject and is known to generate psychological stress [49]. The Stroop test lasted for approximately 2 min. Subjects were shown a horror movie clip for 5 min as an emotional stressor. The physical, cognitive and emotional stress periods were interspersed by 5 min intervals of relaxation. The authors of the data set also categorized the 40 s just prior to the cognitive stressors as a stress period as they noted the subjects visibly displaying signs of stress while even being given the instructions [48]. Similar to our prior work in [42]- [44], we discard the physical stress period and focus only on the psychological aspects of the experiment. We use the data from subject 1 as a representative case.
We secondly use the Stress Recognition in Automobile Drivers Data Set (we abbreviate this as the Driver Stress Data Set) [50]. This contains skin conductance recordings from subjects who drove along a set route in Boston as part of the experiment. The route consisted of toll roads, highways and city driving. As we earlier pointed out in [42], the precise timings of the different road conditions are unavailable and we used the data from a single subject for whom the road condition timings had to be approximately matched to a figure in [50] which originally introduced the data set.
We finally use the PsPM-HRA1: Skin Conductance Responses in Fear Conditioning with Visual CS and Electrical US Data Set [51] (we call this the Fear Conditioning Data Set). Additional details regarding this data set are found in [52], [53]. A fear conditioning paradigm consists of a conditioned stimulus (CS) and an unpleasant unconditioned stimulus (US) which could be in the form of an electric shock. Through repeated pairing (i.e., multiple trials), a subject begins to show a response typically seen for the US for the CS as well. For instance, if a subject is shown a blinking light prior to an electric shock being given, repeated exposure would cause the subject to gradually develop a fear response to the blinking light alone. In this data set, there were two types of CS in the form of orange and blue circles (CS+ and CS−) shown on a computer screen. The CS+ stimuli were accompanied by a 0.5 s electric shock 50% of the time after a delay of 3.5 s. The CS− stimuli were never accompanied by the shock. Similar to the first data set, we select the data from subject 1 as a representative case.

B. PRE-PROCESSING AND DECONVOLUTION
We pre-process the data similar to [42]- [44]. The data is first lowpass filtered at 0.5 Hz prior to tonic-phasic separation using cvxEDA [54]. We thereafter deconvolve the skin conductance signals to estimate the sparse train of neural stimuli responsible for the phasic variations using our previous method outlined in [43]. Deconvolution is based on a two-step coordinate descent strategy [55] that solves for the neural stimuli and the model parameters governing sweat diffusion and evaporation in turn until convergence. The deconvolution algorithm relies on the coupled differential equation formulation originally described in [56]. The final output yields the sparse train of neural stimuli and the two model parameters that determine the rise and decay times of the bi-exponentially-shaped phasic SCRs.

C. STATE-SPACE MODEL
Previous work in neural state estimation has assumed random walk-type models for the state equation (e.g. [26]- [28], [30], [33]). Here too we assume that sympathetic arousal x j evolves with time as where ε j ∼ N (0, σ 2 ε ) is process noise. The occurrence of a neural impulse n j is a Bernoulli-distributed random variable with probability mass function p n j j (1 − p j ) 1−n j , where p j = P(n j = 1). We relate p j to x j via a logit transformation based on the theory of generalized linear models [57].
Here, β is a constant to be determined. According to the sigmoid relationship resulting from this transformation, the probability that a neural impulse occurs rises with an increase in sympathetic arousal (i.e., the higher the arousal, the more likely it is that impulses occur). Other studies have also used the rate at which SCRs occur (equivalent to the rate at which the neural impulses underlying the SCRs occur) in a similar manner as an index of sympathetic arousal [58]- [60]. We determine β empirically similar to [42]- [44] assuming that x j ≈ 0 at the very beginning of the random walk. Therefore, and we approximate p 0 by the average probability of an impulse occurring during the entire experiment [42]- [44].
In the absence of multi-day data to estimate a person's normal physiological baseline, we assume that this approximation captures the natural SCR occurrence probability in daily life averaged over rest periods, stressors and other day-to-day activities that cause emotional arousal to fluctuate. We next consider the continuous-valued amplitude r j of each neural impulse. As noted earlier, SCR amplitudes capture sympathetic arousal information [45], and they have likewise been used in several studies [46], [47]. Higher levels of arousal are reflected in higher SCR amplitudes (and by implication, higher underlying neural impulse amplitudes that generate them). Prior neural state estimation work involving continuous-valued variables has made use of linear or log-linear models for developing mathematical relationships with the state variable [27], [28]. Here we assume a linear relationship.
where v j ∼ N (0, σ 2 v ) represents sensor noise and modeling error, and γ 0 and γ 1 are coefficients to be determined. We assumed similar linear relationships between the state variable and the interpolated SCR amplitudes and tonic values in our earlier skin conductance work as well [44]. The joint density function for the observed neural stimuli is therefore, Accordingly, an amplitude is absent if there is no impulse and the amplitude is Gaussian-distributed when an impulse is present. Both the arousal states x j and the model parameters γ 0 , γ 1 , σ 2 v and σ 2 ε are unknown. We estimate arousal and recover the model parameters using an expectation-maximization (EM) framework. Our algorithm iterates between the E-step and the M-step until convergence [61].

D. E-STEP
Given the sequence of observations Y J = {(n 1 , r 1 ), (n 2 , r 2 ), . . . , (n J , r J )}, we first wish to estimate x j . We do so using Bayesian filtering (we derive the filter equations in Appendix A). We utilize both a forward filter and a backward smoother. We make a Gaussian approximation to the posterior density p(x j |y j ) at each point in time to obtain the following recursive equations for the forward filter for j = 2 : J . Predict: Update: It is of interest to note that the filter equations switch between those in [26] (estimation based solely on one binary variable) and [27] (estimation based on one binary variable and one continuous variable) depending on whether n j = 0 or n j = 1. In [26], Smith et al. developed a state-space method to estimate an unobserved cognitive learning state from binary correct-incorrect responses in behavioral experiments involving animal models. Prerau et al. [27] extended the model to incorporate reaction times (i.e., a continuous variable) as well so that the estimated learning state was based on both a binary variable and a continuous variable. Also, note that x j|j appears on both sides of (8) and (11) and thus the equations have to be solved numerically using the Newton-Raphson method.
After proceeding in the forward direction, we obtain a smoother estimate by reversing direction, thereby making use of all the observations to determine an improved x j at each point. The backward smoother equations are [62]: the M-step updates (the derivations are provided in Appendix B) for the (n + 1) th iteration are: The algorithm iterates between the E-step and the M-step until convergence. We use a criteria similar to [27] and consider the model parameters to have converged once their mean absolute does not exceed a tolerance level in the order of 10 −8 in consecutive iterations.

A. SIMULATED DATA
We simulated two sets of data and tested the ability of our algorithm to estimate an unobserved sympathetic arousal state and recover model parameters from a set of MPP observations. Recall that we calculate β as shown in (3) based on approximating p 0 by the average probability of an impulse occurring in the data (i.e.,p 0 ) similar to [42]- [44]. We simulated two MPPs having a true value of p 0 = 0.05, but where the approximationp 0 was slightly higher than and slightly lower than 0.05 in the data.
The model parameters and their estimates for both cases are shown in Table 1. Fig. 2 shows the state estimation results. In general, there is a good fit to the simulated data although the state estimate can deviate from the true value in regions where there is no impulse for a long period of time. The estimates are closer to the true state in regions where more impulses tend to occur. We discuss the implications of this in the next section.

B. EXPERIMENTAL DATA
We compare BOBSE, BCOBSE and MPP-based estimation on the representative cases of the data sets described above. We used deconvolved skin conductance data for BOBSE following [43] while peak detection was used for BCOBSE as described in [44]. In each case where sympathetic arousal is tracked based on experimental data, we pay particular attention to the state estimate x j and the high arousal index (HAI) calculated as p(x j > x threshold ). The HAI is inspired by the term known as the ideal observer certainty level in [26]. It expresses the probability that an impulse (binary event)  05 (right -smaller approximation). The sub-panels within each figure respectively depict: (a) the MPP observations (blue) and the estimated r j (red); (b) the impulse occurrence probabilities p j (blue) and their estimate (red); (c) the arousal state x j (blue) and its estimate (red); (d) the quantitle-quantile (QQ) plot for the residual error of x j . occurs more than just by chance in a behavioral learning experiment. Since p j is related to x j according to (2), the same quantity can be calculated based on the probability that x j exceeds some threshold. The threshold would capture the physiological baseline arousal state value and would in turn be related to a certain baseline impulse occurrence probability. Similar to [43], we set x threshold to the median state value considering that the experiments in the Neurological Status Assessment Data Set and the Driver Stress Data Set contain both relaxation and stress periods. Fig. 3 shows the arousal estimation results from subject 1 in the Neurological Status Assessment Data Set. The subject's HAI is above or very close to the 90% threshold during the cognitive stressors according to all three methods. There are marked differences thereafter. The region of particular interest is soon after the emotional stress period (horror movie clip) began. Here, the deconvolution algorithm detects two small neural impulses at about 700 s which cause a very slight increase in skin conductance. The BOBSE method [42], [43] shows a considerable increase in the HAI due to these small impulses. The BCOBSE method [44] also has a considerable increase in HAI at this point (it even exceeds the 90%) threshold. Moreover, the HAI for BCOBSE extends into a considerable portion of the relaxation period as well. In contrast, the MPP-based estimate does not extend considerably into the relaxation period but rather declines shortly after the cognitive stressors conclude. The decline in HAI is also more gradual. Notably, unlike in the case of the other two methods, it only provides a slight increase in HAI and x j due to the two small impulses that occur at the beginning of the horror movie. This is also in keeping with the increase in the skin conductance level at this time which is very minor.
Results from the Driver Stress Data Set are shown in Fig. 4. Both the BOBSE [42], [43] and MPP-based estimates are similar. However, as in the previous case, the MPP estimate is not as sensitive to an increase in HAI due to small neural impulses. In the region shortly before 1000 s, there are a few such small impulses. BOBSE yields a greater increase in HAI compared to the MPP estimate. Moreover, in the vicinity of 3000 s, the BOBSE method shows a moderate HAI increase followed by a second smaller HAI increase. However, on closer examination of the skin conductance signal and the detected impulses, the second increase is actually larger. The MPP estimate correctly places the larger HAI increase second. The BCOBSE method provides a more jagged emotional arousal state x j . The HAI also has sharp unintuitive rises and falls indicating an almost binary-like transition. Fig. 5 shows results from the Fear Conditioning Data Set. In typical fear conditioning studies, it is generally the (c) the state estimate and its 95% confidence limits; (d) the probability of impulse occurrence and its 95% confidence limits; (e) HAI. For the top right figure, the sub-panels respectively depict: (a) the skin conductance signal; the green and black dots above it depict the detected peak locations; (b) the phasic-derived value (solid green line) and its fit (dotted line); (c) the tonic level (solid light mauve line) and its fit (dotted line); (d) the state estimate and its 95% confidence limits; (e) the probability of impulse occurrence and its 95% confidence limits; (f) HAI. The background colors in turn denote the instruction period, the counting task, the Stroop test, relaxation and emotional stress. The regions above 90% and below 10% are shaded in red and green respectively in the lowest sub-panels of each figure.
average differences of a particular variable over the different trial types that is of interest. The averaging across trials is similar to the analysis of event-related potentials (ERPs) with electroencephalography signals. Here too, we provide ERP-like responses averaged across the three types of trials. For all three estimation schemes, on average, the highest (c) the state estimate and its 95% confidence limits; (d) the probability of impulse occurrence and its 95% confidence limits; (e) HAI. For the top right figure, the sub-panels respectively depict: (a) the skin conductance signal; the green and black dots above it depict the detected peak locations; (b) the phasic-derived value (solid green line) and its fit (dotted line); (c) the tonic level (solid light mauve line) and its fit (dotted line); (d) the state estimate and its 95% confidence limits; (e) the probability of impulse occurrence and its 95% confidence limits; (f) HAI. The background colors in turn denote rest, city driving, toll road, highway, toll road, city driving, toll road, highway, toll road, city driving and rest. The regions above 90% and below 10% are shaded in red and green respectively in the lowest sub-panels of each figure.
x j and skin conductance levels occur in the CS+ trials with the US (electric shock) followed by the CS+ trials without US. The CS− trials have the lowest response. Notably, the arousal state x j shows considerable variation (c) the state estimate and its 95% confidence limits; (d) the probability of impulse occurrence and its 95% confidence limits; (e) the 10 s ERP-like plot for skin conductance for the CS− (green), CS+ without a shock (mauve) and CS+ with the shock (red) trials; (f) the 10 s ERP-like plot for x j along with the confidence intervals. For the top right figure, the sub-panels respectively depict: (a) the skin conductance signal; the green and black dots above it depict the detected peak locations; (b) the phasic-derived value (solid green line) and its fit (dotted line); (c) the tonic level (solid light mauve line) and its fit (dotted line); (d) the state estimate and its 95% confidence limits; (e) the probability of impulse occurrence and its 95% confidence limits; (f) the 10 s ERP-like plot for skin conductance for the CS− (green), CS+ without a shock (mauve) and CS+ with the shock (red) trials; (g) the 10 s ERP-like plot for x j along with the confidence intervals.
with the BCOBSE method while there is much less variation with the BOBSE method. The MPP estimate shows a level of variation in x j that is in-between that of the other two.

A. SIMULATED DATA
With both sets of simulated data, there is a good agreement between the true state x j and its smoothed estimate. VOLUME 8, 2020 The quantile-quantile (QQ) plots of the residual error between the true and estimated states lie close to the diagonal indicating the ability of our algorithm to estimate x j from a set of MPP observations accurately. However, we do note that in regions where an impulse is absent (due to x j being low, and consequently the impulse occurrence probability being small), the estimate can fail to capture small deviations in the sympathetic arousal state. Since estimation is performed by only observing n j = 0 when neural impulses are absent, the algorithm has limited information to work with at these locations. Consequently, if an impulse is absent for a long period of time, then the EM estimate is unable to capture small variations in x j during that time. In the real-world, this would translate to scenarios where there are small variations in an individual's emotional arousal, but none so large as to cause noticeable SCRs. If no SCRs occur in this manner (or no underlying neural impulses occur), then our algorithm is unable to capture the tiny emotional variations during that time. This issue would, however, be common to skin conductance-based approaches for emotion recognition-if no SCRs occur, then the algorithms have little to perform estimation on.
Moreover, note that our EM algorithm does not place any constraints on the sign of the impulse amplitudes (Fig. 2). Therefore, the algorithm could be used for other, more general MPP-based estimation problems as well. For instance, assume that a particular financial time series is in the form of losses and gains (both negative and positive) that occur at discrete time instances. Our algorithm would be able to estimate a latent state underlying the observations (e.g. an unobserved market volatility state). Skin conductance represents a special case where the amplitudes are positive.
B. EXPERIMENTAL DATA SCR amplitudes contain sympathetic arousal information [45]. The BOBSE method in [42], [43] did not incorporate the amplitudes of the SCRs (or equivalently the amplitudes of the underlying neural impulse bursts) for estimating sympathetic arousal from skin conductance data. While this method provided good initial results, the lack of the amplitude information could result in an estimate that had an exaggerated response to small neural impulses. This is evident from the results for BOBSE in Figs. 3 and 4. In Fig. 3, there is an unusually large response at the beginning of the horror movie and in Fig. 4, the over-emphasis on the small impulses is also seen on occasions.
The BCOBSE method makes use of continuous-valued variables in addition to the binary SCR occurrences. However, there is a tendency to overfit since the location in the EM parameter space where the state fits to one of the continuous variables has a greater extrema [44]. Therefore, the M-step updates related to the continuous variables need to be monitored and their updating halted when overfitting begins to occur. The method can also result in a very low σ 2 ε estimate, a more jagged state estimate and sharp transitions in HAI. These effects are seen in Figs. 3 and 4. Fig. 5 also help point out some of the differences between the three estimation schemes. While the state estimate with BOBSE shows very little variation, the estimate with BCOBSE has considerable variation. The variation of the MPP estimate is intermediate. However, all three are consistent with the general skin conductance trend for the three classes of trials: CS+ with the US, CS+ without the US and CS−. The highest responses are for CS+ with the US and the lowest are for CS−. The CS+ without the US has an intermediate response.

Results from the Fear Conditioning Data Set in
Recall that neural impulse amplitudes only exist at distinct points in time and not everywhere. Estimation based on treating the observations as an MPP overcomes the sensitivity to small impulses and the tendency to overfit to a continuous variable. Consequently, only a small rise is seen at the beginning of the emotional stress period in Fig. 3 and a smoother state estimate is seen in Fig. 4. In both these regards, the MPP estimation outperforms the previous methods in [42]- [44]. Noise is a reality in any physiological signal. A noisy skin conductance signal can give rise to several small noisy impulses being detected during the deconvolution process and the MPP-based estimation scheme would be less sensitive to them in such scenarios.
We finally point out an additional factor regarding the use of the MPP estimation scheme with experimental data. Occasionally, the EM algorithm can have convergence issues. This can particularly occur when the signal is very long. In these cases, the parameters γ 0 , γ 1 and σ 2 v tend not to settle down, while σ 2 ε keeps being driven down at the M-step in successive iterations. This can result in convergence at a location where the process noise variance is very low and there is hardly any variation in x j . We recommend monitoring the model parameters in this case, particularly when the skin conductance signal is long, and possibly relaxing the convergence criteria. We relaxed the convergence criteria to 10 −5 for the Fear Conditioning Data Set due to this convergence issue.

C. APPLICATION TO REAL-WORLD SCENARIOS
Our current state-space method functions in an offline manner as it incorporates both forward filtering and backward smoothing. In a real-world application, the forward filter could be run on a wearable device to give a user an estimate of his/her sympathetic arousal level continuously. The full EM algorithm could be executed in the background (possibly on the cloud) from time to time so that the model parameters can update periodically to account for any stochasticity. A block diagram of how this would work is shown in Fig. 6.
Skin conductance can be measured conveniently using sensors attached to the fingers or palms, or even with wrist-worn devices. Our current results indicate the suitability of using skin conductance to estimate sympathetic arousal for wearable healthcare applications. Further validation and testing on targeted research populations would, however, be required prior to real-world deployment. Moreover, this would also require that the algorithm be embedded on an electronic device that processes skin conductance data in real-time according to Fig. 6. Future work would include developing this skin conductance-based wearable monitor. We anticipate that such a device would be helpful in monitoring patients diagnosed with certain types of neuropsychiatric disorders, and this is our primary target population. Post-traumatic stress disorder (PTSD), for instance, is associated with symptoms of hyperarousal, while depression is typically associated with low levels of arousal [43], [44]. Validating our method on a patient population diagnosed with PTSD or depression with appropriate clinical collaboration is one of our future directions of research as well.

VI. CONCLUSION
Bayesian state-space methods have been used for neural state estimation in a number of applications including behavioral learning and movement decoding, and for monitoring sleep, unconsciousness and the comatose state. State estimation based on observing spike train-like signals or other binary variables often underlies these methods. We have recently adapted these methods to detect sympathetic arousal based on the neural impulses that are responsible for generating ''spikey'' skin conductance signals. However, our previous methods in [42]- [44] had a few inherent shortcomings. These include the lack of accounting for the amplitudes of the neural stimuli that gave rise to the skin conductance variations and a tendency to overfit to continuous-valued skin conductance variables yielding a more jagged estimate. Here we overcome these challenges by developing a model that explicitly takes into account the MPP nature of the observations. The present method provides good results on simulated data, is less sensitive to small impulses that may occur due to minor increases in emotional arousal and also does not overfit to the continuous-valued variables. Future work would include extending the state-space model to incorporate additional physiological features that capture emotional changes including heart rate [63]- [65], electromyography [66], [67], cerebral blood flow [68] and respiration. Moreover, the inclusion of additional skin conductance channels can improve the deconvolution result [69], [70], and deconvolution can also improve tonic-phasic separation [71]. A better estimate of the underlying neural stimuli would help improve the detection of sympathetic arousal as well.

APPENDIX A DERIVATION OF THE E-STEP FILTER UPDATE EQUATIONS A. PROBABILITY OF IMPULSE OCCURRENCE
We make use of the following identities for the positive quantities p j and (1 − p j ) when deriving the filter equations.

B. MEAN AND VARIANCE OF THE POSTERIOR DENSITY FUNCTION
We wish to calculate the mean and variance of the following posterior to obtain the filter updates for x j|j and σ 2 j|j .
p(x j |y j ) = p(x j |y j−1 )p(y j |x j ) p(y j |y j−1 ) = p(x j |y j−1 )p(n j ∩ r j |x j ) p(y j |y j−1 ) 1) CASE WHEN n j = 0 Consider the case when n j = 0 where p(n j ∩ r j |x j ) = (1 − p j ). (26) The partial derivative of the logarithm term above is set to 0 to solve for the mean.
x j = x j|j−1 + σ 2 j|j−1 (n j − p j ) since n j = 0 (30) VOLUME 8, 2020 Equation (30) provides the filter update for x j|j . Taking the second partial derivative we have, The filter update for σ 2 j|j is given by, 2) CASE WHEN n j = 1 Next consider the case when n j = 1 where a neural impulse amplitude exists.
Setting the partial derivative of this logarithm term to 0 we have, Solving for x j yields, + γ 1 (r j − γ 0 − γ 1 x j|j−1 ) since n j = 1. (38) This provides the update for x j|j when an impulse exists. We next take the second partial derivative to obtain the variance σ 2 j|j .
Similar to the case when n j = 0, the variance update is given by, The expected log-likelihood is, Similar to [26], we take B. M-STEP UPDATES FOR γ 0 AND γ 1 Take Q 1 as the term in Q containing both γ 0 and γ 1 .
Taking the partial derivatives with respect to γ 0 and γ 1 and setting them to 0 yield Solving the two simultaneous equations above provide the M-step updates for γ 0 and γ 1 .
C. M-STEP UPDATES FOR σ 2 v Take Q 2 as the term in Q containing σ 2 v .
Taking the partial derivative with respect to σ 2 v and setting it to 0 yields 2 = 0 (53) D. M-STEP UPDATE FOR σ 2 ε Take Q 3 as the term in Q containing σ 2 ε . We set x 0 = x 1 as a boundary condition allowing for some bias at the beginning [26].
Taking the partial derivative with respect to σ 2 ε and setting it to 0 yields