Touching Virtual Humans: Haptic Responses Reveal the Emotional Impact of Affective Agents

Interpersonal touch is critical for social-emotional development and presents a powerful modality for communicating emotions. Virtual agents of the future could capitalize on touch to establish social bonds with humans and facilitate cooperation in virtual reality (VR). We studied whether the emotional expression of a virtual agent would affect the way humans touch the agent. Participants were asked to hold a pressure-sensing tube presented as the agent’s arm in VR. Upon seeing the agent’s emotional expression change, participants briefly squeezed the arm. The effect of emotional expressions on affective state was measured using self-reported valence and arousal as well as physiology-based indices. Onset, duration, and intensity of the squeeze were recorded to examine the haptic responses. Emotional expression of agents affected squeeze intensity and duration through changes in emotional perception and experience. Haptic responses may thus provide an implicit measure of persons’ experience towards their virtual companion.


INTRODUCTION
The future society is predicted to increasingly involve interaction between humans and artificial social agents [16]. While virtual reality (VR) provides an extraordinary platform for such interactions to take place, the usefulness of VR remains limited due to the lack of tangible interactive elements. By allowing tactile sensation of surrounding virtual objects, haptic technology increases a user's sense of being part of the projected environment and its events. The role of haptics is similarly crucial for social interactions, as non-verbal communication can strongly influence our feelings about others [4], [38] and our ability to cooperate and solve problems together [58]. It is therefore vital for social VR to integrate multimodal technology and take full advantage of haptic communication [33], [58].
Haptic interfaces have proven capable of communicating emotions and promoting interpersonal trust [3], [37], [52], [57]. The social effects of mediated touch are likely due to the resemblance between haptic interfaces and natural touch, thus transferring the social-emotional effects of skin contact to computer-mediated interaction [24]. Indeed, both receiving a simulated touch from an artificial agent in VR and receiving a mediated touch from another human via a haptic link can result in a more positive social impression and an enhanced likelihood of cooperation [20], [24]. However, the social effects of virtual touch strongly depend on the cultural and relational context of the interaction [51], [53]. Recent findings also suggest that not only are the emotional outcomes of virtual touch shaped by the accompanying context, but the sensory experience of virtual touch itself is affected by contextual information, such as the emotional facial expressions of the sender [45].
So far, the social implications of virtual touch have been studied by observing and measuring the emotions and actions of touch recipients. However, studies investigating touch in non-social interaction scenarios, such as typing a message or handling a joystick in a driving simulator, show that the production of touch, manifested as a haptic response, is also sensitive to a person's emotional state [19]. That is, people's tactile behavior towards interaction devices is emotionally loaded. For example, studies show that emotional information can be extracted from gestures and finger strokes on mobile devices [21], [40], [49], or from keystrokes [41]. However, whether emotional states implicitly influence haptic expressions in mediated social interactions remains unknown [10].
The objective of the present study was therefore to investigate whether perceiving a virtual agent's facial emotional expression in a social VR environment influences haptic responses when touching the agent. Specifically, we asked the following research question: RQ1: How do emotions expressed by virtual agents affect a human's haptic response towards them? To answer the question, we presented an affective agent displaying facial expressions of seven basic emotions [12] in VR and measured the degree to which the user's emotional states were affected by the agent's emotional expressions. We then requested users to touch the agent by physically squeezing a pressure meter that was virtually presented as the arm of the agent. We measured the onset, duration, and intensity of their touch along with self-reports and physiological responses that quantified the user's affective state.
Furthermore, we examined whether emotions affect haptic responses implicitly, that is, without a user's conscious control, by requesting that touch towards the agent be exercised in a consistent manner irrespective of the situation. To capture users' emotional states in context, we used a wide range of physiological measures including electrodermal activity (EDA), facial electromyography (fEMG), and electrocardiography (ECG). Furthermore, we attempted to determine whether the haptic expression varied as a function of how users themselves felt while touching, or how they perceived the agent to feel. To this end, we asked the following research question: RQ2: To what extent are human-agent haptic responses predicted by a user's experienced affective states on the one hand, and their perception of the agent's emotions on the other? The perceived and experienced affective states were independently measured by requesting users to rate either their own or the agent's perceived affective state. This was done using self-reports along two commonly used affective dimensions: valence (positive versus negative) and arousal (tense versus relaxed) [47], [48].
The study filled a substantial gap in the affective haptics literature by demonstrating the involvement of emotions in haptic responses when touching a virtual agent in a social VR setting. In summary, the contributions of the present research and the findings of the experimental study were the following:
- We conducted the first study of emotional perception and haptic responses during human-virtual agent touch, showing that emotional expressions displayed by agents affect a user's affective state, as indicated both by self-reports and by physiological responses.
- We showed that perceiving the agents' emotional expressions affected haptic responses towards the agent, altering both the duration and the intensity of touch. Negative, high-arousal states evoked by the agents predicted longer response duration and higher maximum pressure of the haptic touch than positive or low-arousal states.
- We showed that both the users' own emotional valence and arousal and the valence and arousal attributed to the virtual agent predicted the duration and intensity of users' haptic responses. Within-subject (trial-to-trial) variation in haptic responses was better explained by the agent's perceived affective state, whereas between-subjects (user-to-user) variation was better explained by users' own affective state, indicating the need for personalised models when predicting individual users' affective states from their haptic responses.

RELATED WORK
Previous studies in human-computer interaction and psychology have used a variety of paradigms to investigate the communication of emotions via touch (Section 2.1). As a result, it has become clear that touch has an intrinsic hedonic tone, as it evokes positive emotions in the receiver, promoting cooperation (Section 2.2). Moreover, both perceived and experienced emotional states have been shown to affect the perception and production of touch (Section 2.3). In the following sections, we review these distinct bodies of literature to present the knowledge gap targeted by the research questions of the present study.

Touch is Effective in Communicating Emotions
Although any sensory modality can in principle be used in communication, both research and engineering have long centered on the visual and auditory modalities [39]. However, it is now becoming clear that substantial benefit can be gained from haptic communication interfaces, for instance by supplementing communication in other modalities [31], [32], [55], [57]. In a study by Hoggan and colleagues [31], for example, users sent haptic messages by squeezing their mobile phone during a call. The pressure levels were mapped onto vibrations on the recipient's phone to supplement the verbal communication. The haptic link was found to be spontaneously used for greetings or for catching the recipient's attention, as well as to influence the receiver's emotions. In another study [43], users encoded touch gestures into vibrations presented on another person's hand. When asked to convey four types of emotional states, users employed intense touch gestures to convey negative, high-arousal states and soft finger touches to communicate positive, relaxed emotions. Furthermore, in a study by Bailenson et al. [5], a two-degrees-of-freedom force-feedback joystick was used to enable pairs of users to convey a range of emotions to each other. By exerting varying amounts of force and accelerating or decelerating the joystick movements, users were able to encode disgust, anger, sadness, fear, and joy such that their co-participant recognised them at above chance level.
Since touch and physical contact have long been recognized as critical for human development and social bonding [29], haptic devices have been designed to target a wide range of complex emotional messages. Thus, HCI research was conducted to determine the utility of tactile communication as a means to establish emotional connectedness through mediated physical contact [30], [54], [55], [56]. For instance, Tsetserukou et al.'s "iFeel_IM" [56] was designed to intensify the sender's feelings and to simulate the resulting feeling in the receiver during text-based communication. The system recognized nine emotions (anger, disgust, fear, guilt, interest, joy, sadness, shame, and surprise) from text messages and mapped these onto different forms of touch (for example, hug, shiver, tickle) produced by different types of tactile devices such as "HaptiHug", "HaptiShiver", and "HaptiTickler". These devices were found effective in inducing and sharing emotional states between users.

Touch Evokes Positive Emotions and Cooperation
While the importance of touch for emotional communication is well recognised, the Midas Touch effect suggests a more direct role for touch, in that receiving it elicits immediate emotional and behavioural consequences. For example, Fisher et al. [15] conducted a study in which a library clerk briefly touched the hand of a student while returning a library card. The authors found that participants felt more positive towards the clerk and the library when touched, despite not noticing the touch. Likewise, a brief casual touch was found to improve evaluations of car salesmen [14], increase the likelihood of people helping dog owners look after their dog [22], and result in bus drivers granting passengers free rides [23]. If the increased affinity and interpersonal cooperation were due to the touch itself rather than the physical proximity of the interactants, one would predict the Midas Touch effect to occur even when communicated over distance by a technological device. Indeed, Haans et al. [25] reported that participants were more likely to help a confederate if they had been touched by them via a haptic device during a preceding mediated interaction. Furthermore, in an Ultimatum decision-making game, Spapé et al. [51] found that mediated touch between remotely located participants increased the likelihood of economic offers being accepted. A more recent study replicated this effect with a virtual reality paradigm, in which a virtual agent's touch was delivered through a haptic glove and found to promote compliance with the agent's unfair economic offer [28].
However, the compliance-promoting effect of mediated touch is less robust than previously thought, or strongly depends on context and personality [28]. For example, even though Haans and colleagues [25] demonstrated a virtual Midas Touch effect as strong as in the original field experiment, it was statistically nonsignificant. Similarly, while [51] found mediated touch to enhance compliance in the Ultimatum game, the effect size was small and not significantly different from that of receiving an auditory cue from the co-player. Further failures to replicate were found with larger samples and more controlled, social VR designs [50], and even with more realistic types of touch [46]. Only persons who were less motivated by profit or were not worried about unfair treatment were found to be sensitive to the virtual Midas Touch effect [28].
In summary, while there is consensus that touch has a central role in emotional processes, the claim of an automatic, immediate effect of mediated touch on a receiver's prosocial behaviour remains controversial. Even if a virtual Midas Touch effect exists, it is unlikely to reliably translate into conformity unless supported by the right circumstances as defined by personality factors and cultural context [28], [53]. And yet, while touch may not have as mechanistic social consequences as originally assumed, it is clear that people are able to use touch to communicate emotions via technological tools. Consequently, a potentially more fruitful line of research could be to investigate the way people touch each other when interacting in social VR. In the next section, we therefore summarise recent developments in research on how experiencing and perceiving emotions in others affect the perception and production of touch.

Emotions Affect Touch Perception and Production
While most research on affective touch approaches emotions as the consequence of touch, recent studies suggest emotions may also determine touch perception [2], [27], [45]. For example, Ahmed et al. [2] designed an interactive VR scenario in which a virtual agent facially expressed emotions while reaching out to touch the user. Touch displayed by agents expressing anger was reported as more intense than touch delivered by sad agents, even though the touch intensity was held constant. A neural mechanism encoding touch as emotional feeling may underlie this effect, as the immediate somatosensory brain responses to the agent's virtual touch were found to be affected by the agent's touch-preceding emotional facial expression [45]. Thus, seeing emotional expressions may alter expectations towards touch, causing top-down effects on tactile processing and perception. In a complementary analysis, the emotional modulation of touch perception was related to individual differences, such as behavioural inhibition and gender [27], suggesting that the connection between emotional and tactile perceptual processes was pronounced in individuals with high responsiveness to negative emotions.

Thus far, we have discussed touch mainly in its passive sensing capacity, which is commonly termed tactile perception. However, the active motor capacity of touch in producing a haptic response is equally relevant for understanding the involvement of emotions in social haptics. In general, studies show that emotions can be detected even from non-social haptic responses, such as finger gestures on mobile devices [21], [40], [49]. For example, Gao and colleagues [21] demonstrated that data extracted from finger strokes during mobile game play can be used to detect four distinct affective states (excited, relaxed, frustrated, bored) and two levels of arousal and valence with 69-89 percent accuracy. Similarly, Shah and colleagues [49] used several distinct motor tasks on a touch screen device to extract a series of finger stroke features and detect positive, neutral, and negative emotional states with 90 percent accuracy. Going beyond finger strokes on mobile devices, Gaffary et al. [19] designed a study in which users engaged in a stressful arcade car driving game while their haptic responses were recorded with a Geomagic Touch device equipped with force and kinematic sensors. The induced stress resulted in spontaneous haptic responses that could be reliably detected from the pressure data.
The aforementioned studies have shed much light on the relationship between affective states and haptic expressions. However, the literature remains uninformative regarding two critical aspects. First, it remains unknown to what extent emotions directly affect the production of touch, or whether the findings of prior studies are due to the task itself affecting both emotions and haptics. For example, controlling a car or operating a mobile device in challenging circumstances may induce altered motor responses, but this may be due either to the situation causing the challenge or to the emotional impact of the challenge. To validly determine the existence of a direct, implicit effect of emotion on haptic responses, the task itself should be experimentally independent of the type of touch. Thus, in the present study, participants were asked to produce the same type of touch regardless of the emotion expressed by the virtual agent. Second, effects of emotion on haptic responses have not been studied in mediated social interaction, such as when touching another human or an artificial agent. Our objective was thus to test whether even subtle social-emotional cues, such as a virtual companion's emotional expressions, give rise to consistent features of haptic expression when touching the virtual agent.

METHODS

Participants
Thirty-six university students (21 female, 15 male) with an average age of 29 years (SD = 4.37) took part in the experiment. All participants were right-handed and had either normal or corrected-to-normal vision. Participation was voluntary and each participant received a movie ticket as compensation for their time. They had the right to withdraw their participation or interrupt the experiment at any time without negative consequences. Participants went through the written instructions and information about the experiment and signed an informed consent form before starting the experiment. The study was approved by the research ethics review board of the University of Helsinki.

Procedure
Following the instructions, the participants were assisted with mounting the physiological sensors and putting on a head-mounted display (HMD) and headphones. The participants were seated at a desk and instructed to hold a haptic input device in front of them with their right hand. The participant's left hand rested on the arrow keys of a regular PC keyboard. The virtual reality was then presented through the HMD, where the participants could see their point of gaze projected as a green dot in the visual field. Gaze was tracked using an eye-tracking system integrated into the headset. After calibration of the eye-tracking system, information about the participant's age and gender was collected. Then, before each trial, a tube of the haptic input device was inflated to 103 kPa to ensure a constant level of air pressure in every trial. Each trial started with a view of the participant's right hand holding the arm of a virtual agent. If the participant was looking away from the face of the agent, a green dot was presented moving from the participant's point of focus towards the agent's face (see Fig. 1, Panels a and b). After ensuring the participant was looking at the agent's face, an emotional expression animation of 1,700 ms was initiated (Fig. 1, Panel c). At 1,000 ms, a 440 Hz sine-wave beep of 500 ms was played as a cue to start squeezing the input device (Fig. 1, Panel d). Participants were instructed to squeeze the device in response to the tone and maintain the pressure for 1 s. Participants were asked to apply the same amount of force as when touching a real person and to use the same type of touch regardless of the emotion expressed by the virtual agent. After the pressure was released, the VR environment was replaced by a blank screen for 500 ms (Fig. 1, Panel e). At the end of each trial, a questionnaire was shown (see Section 3.4). The questionnaire was filled out using the arrow keys: the left and right arrow keys were used to move along a 5-point Likert scale, and the down arrow was used to confirm an answer and move to the next question.

Test Setup
The setup (Fig. 2) comprised an MSI gaming laptop (nVidia GTX 1080 graphics card) running Windows 10, an HTC Vive head-mounted display (resolution: 1080 x 1200 pixels per eye, 2160 x 1200 pixels combined; refresh rate: 90 Hz; field of view: 110°), a gaze tracker, and a haptic input device that is described below.
Emotional Expressions. The virtual agents' facial emotional expressions were created using Unity 3D 4.5.4 software (Unity Technologies, San Francisco, CA). Following Ekman and Friesen's facial action coding system [11], we manipulated the facial action units of the agents' face models to generate a set of expression animations, with three different dynamic parameterizations for each of the seven emotion categories (happiness, anger, fear, disgust, surprise, sadness, and neutral). The same animations were projected onto the faces of four virtual agents (two male and two female, with black and white skin textures; see Fig. 3) developed by our group. The emotional expressions were validated within the current study as part of the self-report measures (see Section 3.4).
Fig. 1. Trial procedure. A dynamic gaze cue (a) directed the participant to look at the agent's face. The agent's expression was neutral for 500 ms (b) before the 1,700 ms facial expression animation started, its peak expressiveness being reached at 1,000 ms (c). At that point, a 440 Hz, 500 ms auditory cue was played to indicate that the participant should squeeze the agent's arm (d). A blank screen of 500 ms was presented after the touch ended (e), followed by the questionnaire items (f).

Haptic Input Device. An air tube, pump, pressure sensor, and escape valve obtained from a regular blood pressure measuring device were used to build the haptic input device (Fig. 4). For sensing the pressure, we used the MPX4250A manifold absolute pressure (MAP) sensor, which is designed to sense absolute air pressure within an intake manifold. The pressure was recorded in kilopascals and converted into microvolts. The pressure sensor, pump, and escape valve were installed inside a cardboard tube (b), and the air tube was attached onto its outer surface. The escape valve was used to prevent an accidental rise of pressure that would break the tube. A thin cover of cloth (c) was wrapped around the prototype to protect the tube and other components. An Arduino Uno (arduino.cc) microcontroller was used to control the device, to communicate with the computer, and to deliver continuous pressure data to an amplifier designed for physiological sensor data.
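For illustration, the following minimal Python sketch converts between the sensor's output voltage and pressure using the nominal MPX4250-family transfer function from the manufacturer's datasheet (Vout = Vs x (0.004 x P - 0.04), with P in kPa). The supply voltage constant and function names are our assumptions for illustration, not part of the device firmware described above.

```python
# Hypothetical conversion between MPX4250A output voltage and absolute pressure,
# assuming the nominal datasheet transfer function; per-device calibration may differ.
V_SUPPLY = 5.1  # nominal supply voltage in volts (an assumption)

def kpa_to_voltage(p_kpa: float) -> float:
    """Nominal sensor output (V) for a given absolute pressure (kPa)."""
    return V_SUPPLY * (0.004 * p_kpa - 0.04)

def voltage_to_kpa(v_out: float) -> float:
    """Invert the nominal transfer function to recover pressure in kPa."""
    return (v_out / V_SUPPLY + 0.04) / 0.004

# Example: the tube's resting inflation level of 103 kPa used in each trial
print(voltage_to_kpa(kpa_to_voltage(103.0)))  # ~103.0
```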
Touch Animation. To enhance the feeling of touching another person instead of a device, we projected a 3D model of the user's hand in VR holding the agent's arm. A squeezing animation was created and played on the hand when the pressure level of the input device rose above 105 kPa. The combination of animated body contact and the haptic sensation from the input device was expected to create a strong illusion of touching the agent.
Eye-Tracking. The users' gaze was tracked to make sure they paid attention to the agent's emotional expression. To track the user's gaze, we integrated binocular 200 Hz eye-tracking cameras provided by Pupil Labs (https://pupillabs.com/) into the HTC Vive HMD. As described in Section 3.2 and shown in Fig. 1, the agent's emotional expression started only after the participant fixated on the agent's face.

Measures and Design
Participants undertook 168 trials divided into two blocks of 84 trials each. Within each block, all combinations of 4 agents (two male, two female), 7 emotional expressions (anger, disgust, fear, happiness, neutral, sadness, surprise), and 3 expression variations were presented in randomised order. At the end of each trial (see Fig. 1), a selection of self-report items was presented. In one block the questionnaire items concerned the user's (experienced) affect, and in the other the agent's (perceived) affect, so as to prevent confusing the object of the ratings (user versus agent). The order of the blocks was counterbalanced between users. Self-report measures and physiological recordings of users' affective reactions to the agent's expression, as well as metrics of their haptic responses, were obtained from each trial.
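As a sketch of the design, the fully crossed, randomised trial list for one block (4 agents x 7 expressions x 3 variations = 84 trials) could be generated as follows. The agent labels and the random seed are hypothetical placeholders.

```python
import itertools
import random

AGENTS = ["male_1", "male_2", "female_1", "female_2"]  # hypothetical labels
EMOTIONS = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]
VARIATIONS = [1, 2, 3]  # three dynamic parameterizations per emotion

def make_block(rng: random.Random) -> list:
    """One block: every agent x emotion x variation combination, shuffled (84 trials)."""
    trials = list(itertools.product(AGENTS, EMOTIONS, VARIATIONS))
    rng.shuffle(trials)
    return trials

rng = random.Random(42)               # seed chosen arbitrarily for reproducibility
session = make_block(rng) + make_block(rng)  # two blocks, 168 trials in total
assert len(session) == 168
```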
Self-Report Measures. User affect was measured after the agent's emotional expression using two 5-point Likert scale items: "How positive did you feel while touching?" (1: not at all, 5: very much) and "How emotionally aroused did you feel while touching?" (1: not at all, 5: very much), indicating the dimensions of affective valence and arousal, respectively. The perceived affect of the agent, that is, agent affect, was measured using similar 5-point Likert scale items, but now with the agent as the object of the ratings: "How positive did you think the agent felt?", "How emotionally aroused did you think the agent was?". Additionally, to validate that the animations of emotional expressions were correctly recognised, we asked users to classify them ("Was the agent ...") using a seven-alternative forced-choice scale (... angry / disgusted / afraid / happy / neutral / sad / surprised?).
Physiological Measures. The physiological measures included facial electromyography (fEMG), electrodermal activity (EDA), and electrocardiography (ECG), recorded using the auxiliary inputs of an EEG amplifier (QuickAmp USB, Brain Products) at a sample rate of 1,000 Hz. EMG was recorded from three pairs of bipolar Ag/AgCl electrodes placed over the zygomaticus major (ZM), orbicularis oculi (OO), and corrugator supercilii (CS) muscles in accordance with established guidelines for facial EMG [17]. The ZM electrodes were placed halfway between the right corner of the mouth and the tragus, measuring muscle activity related to the mouth (e.g., smiling). The OO electrodes were situated ca. 2 cm below the outer canthus of the right eye, measuring activity related to closing the eyelids (a secondary, 'Duchenne' response to smiling). Finally, the CS electrodes were placed superior and medial to the right eye, near the eyebrow, measuring activity related to frowning (e.g., during anger). Preprocessing included low-cut filtering at 5 Hz, application of the Hilbert transform for half-wave rectification, and smoothing of the signal in 15 ms intervals. The data were then time-locked to the start of the emotional expression animation and segmented into epochs of 7,000 ms, including 1,000 ms of baseline activity (i.e., before the emotional expression started), the mean of which served as the baseline and was subtracted from the post-stimulus data. The analysed data were baseline-corrected mean voltages in temporal bins of 0-1000 ms (anticipatory interval before touch-cue onset), 1000-1700 ms (interval including the touch and the animated emotional expression), 1700-2700 ms (post-touch interval), and 2700-4000 ms (remaining activity), averaged across epochs of the same condition.
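A minimal Python/SciPy sketch of this fEMG pipeline is shown below, assuming a 1,000 Hz signal; the filter order and the moving-average implementation of the 15 ms smoothing are our assumptions, while the cut-off, envelope extraction, epoch length, baseline handling, and bin boundaries follow the description above.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

FS = 1000  # sampling rate in Hz, as reported

def preprocess_emg(raw: np.ndarray) -> np.ndarray:
    """Low-cut (high-pass) filter at 5 Hz, Hilbert-envelope rectification,
    and 15 ms moving-average smoothing (filter order is an assumption)."""
    b, a = butter(4, 5 / (FS / 2), btype="highpass")
    filtered = filtfilt(b, a, raw)
    envelope = np.abs(hilbert(filtered))   # amplitude envelope of the EMG burst
    kernel = np.ones(15) / 15              # 15 samples = 15 ms at 1,000 Hz
    return np.convolve(envelope, kernel, mode="same")

def epoch_and_baseline(signal: np.ndarray, onsets, pre=1000, post=6000) -> np.ndarray:
    """Cut 7,000 ms epochs (1,000 ms baseline + 6,000 ms post-stimulus) around
    expression-onset samples and subtract each epoch's mean baseline."""
    epochs = np.stack([signal[t - pre : t + post] for t in onsets])
    baseline = epochs[:, :pre].mean(axis=1, keepdims=True)
    return epochs - baseline

def bin_means(epochs: np.ndarray, pre=1000) -> dict:
    """Mean voltage in the analysis bins (ms relative to expression onset)."""
    bins = [(0, 1000), (1000, 1700), (1700, 2700), (2700, 4000)]
    return {f"{a}-{b} ms": epochs[:, pre + a : pre + b].mean(axis=1) for a, b in bins}
```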
EDA data provide measures of phasic and tonic arousal due to the effect of the autonomic nervous system on the skin's sweat glands, which increases conductivity. EDA was recorded with electrodes placed on the middle phalanxes of the left hand's index and middle fingers and preprocessed similarly to the EMG, except using a 1 Hz high-cut filter and no rectification or smoothing. Finally, ECG data, likewise related to arousal and the orienting response, were obtained from electrodes placed over the manubrium of the sternum and over the second-lowest left rib. Processing involved low-cut filtering at 2 Hz, detection of the R peaks of the QRS complexes, and calculation of an inter-beat-interval time series indicating temporal fluctuations in the intervals between successive heart beats in ms. The inter-beat-interval data were then epoched and entered into the analysis in the same way as the EMG and EDA data.
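The R-peak detection and inter-beat-interval derivation could be sketched as follows; the peak-detection parameters (minimum spacing and prominence) are our assumptions, as the text only specifies the 2 Hz low-cut filter and the R-peak/IBI logic.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

FS = 1000  # Hz

def ecg_to_ibi(raw_ecg: np.ndarray):
    """Detect R peaks after a 2 Hz low-cut (high-pass) filter and derive the
    inter-beat-interval (IBI) series in ms."""
    b, a = butter(4, 2 / (FS / 2), btype="highpass")
    filtered = filtfilt(b, a, raw_ecg)
    # R peaks: prominent maxima at physiologically plausible spacing (>400 ms apart);
    # both constraints are assumptions for this sketch.
    peaks, _ = find_peaks(filtered, distance=int(0.4 * FS),
                          prominence=2 * filtered.std())
    ibi_ms = np.diff(peaks) / FS * 1000.0
    return peaks, ibi_ms
```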
Haptic Responses. Haptic responses were recorded with the amplifier used for acquisition of the EMG, EDA, and ECG signals. Raw voltages (in microvolts) were epoched into segments of 6 s, with 0 being the start of the emotional expression. The average voltage of the first 300 ms was subtracted from the data to control for trial-to-trial variation in the initial pressure level. The analyses focused on three metrics (response onset, pressure duration, and maximum pressure) extracted from the epoched time series of pressure data. Response onset was calculated as the first time point after 1,000 ms at which the average pressure level, measured in consecutive 50 ms time windows, was three standard deviations higher than the average baseline pressure level between 0 and 500 ms. Pressure duration was defined as the temporal difference (in ms) between the response onset and the moment when the pressure level dropped back below the level at response onset. Pressure maximum was calculated as the maximum amount of pressure in microvolts within the pressure duration window.
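The extraction of the three metrics from one epoch can be sketched as follows, assuming a baseline-corrected 6 s epoch sampled at 1,000 Hz. Using the baseline window's own standard deviation for the threshold, and the pressure value at the onset sample as the offset criterion, are our readings of the procedure rather than confirmed implementation details.

```python
import numpy as np

def haptic_metrics(pressure: np.ndarray):
    """Response onset, pressure duration, and maximum pressure for one epoch
    (1,000 Hz; index 0 = expression onset), following the reported rules."""
    baseline = pressure[0:500]                          # 0-500 ms reference window
    threshold = baseline.mean() + 3 * baseline.std()    # SD source is an assumption
    onset = None
    for start in range(1000, len(pressure) - 50, 50):   # consecutive 50 ms windows
        if pressure[start:start + 50].mean() > threshold:
            onset = start
            break
    if onset is None:
        return None                                     # no squeeze detected
    onset_level = pressure[onset]
    # Duration: time until pressure falls back below the level at response onset
    below = np.where(pressure[onset + 1:] < onset_level)[0]
    offset = onset + 1 + below[0] if below.size else len(pressure) - 1
    return {
        "onset_ms": onset,
        "duration_ms": offset - onset,
        "max_pressure": pressure[onset:offset + 1].max(),
    }
```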

Analysis
Separate two-way ANOVAs were conducted on the valence and arousal ratings, with emotional expression and the object of the rating (user versus agent) as factors. To correct for multiple comparisons and reduce the risk of Type I error, a Bonferroni correction was applied to the p-values (i.e., p x 2). The Greenhouse-Geisser correction was likewise applied when the sphericity assumption was violated. Similar ANOVAs were used to examine users' physiological responses to the agents' emotional expressions, with minor adjustments. To investigate users' facial muscle responses to the agents' emotional expressions, we calculated three repeated-measures ANOVAs, one for each muscle area (CS, ZM, and OO), with mean voltage as the dependent variable and emotional expression (anger, disgust, fear, happiness, neutral, sadness, surprise) and time (0-1000, 1000-1700, 1700-2700, and 2700-3900 ms) as factors. The alpha level was Bonferroni-corrected by multiplying the obtained p-values by three to avoid Type I error arising from the three statistical tests conducted on the three muscle areas. The same two-way repeated-measures ANOVA with emotional expression and time as factors was used for the EDA and ECG data.
To examine the influence of the agent's emotional expression on the haptic response, a repeated-measures ANOVA with emotional expression (anger, disgust, fear, happiness, neutral, sadness, surprise) and time (1-4 s) as factors and average pressure level as the dependent variable was first conducted. After this initial analysis, three metrics (response onset, response duration, and maximum pressure) were extracted from the pressure data, and a one-way repeated-measures ANOVA with emotional expression (anger, disgust, fear, happiness, neutral, sadness, surprise) as factor was calculated for each metric separately.
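For illustration, a one-way repeated-measures ANOVA of this kind could be run in Python with the pingouin library, as sketched below on synthetic data; the column names and values are hypothetical, and `correction=True` requests the Greenhouse-Geisser corrected p-value used when sphericity is violated.

```python
import numpy as np
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one row per subject x expression cell, with the
# per-condition mean of a haptic metric (here response duration) as the outcome.
rng = np.random.default_rng(0)
expressions = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]
df = pd.DataFrame([
    {"subject": s, "expression": e, "duration": rng.normal(800, 80)}
    for s in range(36) for e in expressions
])

# One-way repeated-measures ANOVA with Greenhouse-Geisser correction; the output
# includes F, uncorrected and GG-corrected p-values, epsilon, and an effect size.
aov = pg.rm_anova(data=df, dv="duration", within="expression",
                  subject="subject", correction=True)
print(aov)
```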
Finally, a series of six multilevel linear models was fitted to determine whether the haptic expression was predicted by user affect (the user's valence and arousal), agent affect (the perceived valence and arousal of the agent), and the users' physiological responses. The user valence ratings were found to correlate highly with the agent valence ratings (r = .71, p < .001), as did user arousal with agent arousal (r = .67, p < .001). As mixed linear models are extremely sensitive to multicollinearity, we included agent and user ratings as predictors in separate mixed models. In each model, the subject ID was assigned a random intercept and no random slopes were added. The models were estimated using the restricted maximum likelihood method, and the effects were tested using Wald χ² tests with Type III sums of squares.
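A sketch of one such model (in the spirit of Model 1.2: user ratings predicting maximum pressure, random intercept per subject, REML estimation) using statsmodels' MixedLM is given below. The data frame, variable names, and generated values are hypothetical stand-ins for the trial-level data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical trial-level data: 36 subjects x 168 trials, Likert ratings 1-5.
rng = np.random.default_rng(1)
n = 36 * 168
df = pd.DataFrame({
    "subject": np.repeat(np.arange(36), 168),
    "user_valence": rng.integers(1, 6, n),
    "user_arousal": rng.integers(1, 6, n),
})
df["max_pressure"] = (100 - 2 * df["user_valence"] + 3 * df["user_arousal"]
                      + rng.normal(0, 10, n))

# Random intercept per subject, no random slopes, REML estimation. Agent ratings
# would go in a separate model to avoid multicollinearity with user ratings.
model = smf.mixedlm("max_pressure ~ user_valence + user_arousal",
                    data=df, groups=df["subject"]).fit(reml=True)
print(model.summary())  # fixed-effect estimates with Wald z- and p-values
```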

RESULTS
Before determining how an agent's emotional expressions affect haptic expression, we investigated whether the agent's emotional expressions influenced users' self-reported affect (user affect) and the perceived affect of the agent (agent affect). These self-report measures were accompanied by cardiac (ECG), electrodermal (EDA), and facial muscle (EMG) responses to the emotional expressions. After this initial examination of affective responses (Section 4.1), we turned to the haptic response (Section 4.2), exploring the effects of the agent's emotional expression on three metrics of the haptic response: response onset, response duration, and maximum pressure. Then, to better understand how these measures related to user affect and agent affect, we fitted linear mixed models to the observed haptic expression data using the user affect and agent affect ratings and the physiological responses as predictors (Section 4.3). Finally, to evaluate the virtual environment, we conducted a descriptive analysis of participants' post-experimental responses to an abbreviated version of the social presence questionnaire [6] (Section 4.4).
Physiological responses evoked by the agent's emotional expressions were examined with a similar ANOVA approach. First, three repeated-measures ANOVAs with mean muscle activity as the dependent variable and emotional expression and time (0-1000, 1000-1700, 1700-2700, and 2700-3900 ms) as factors were conducted, one for each muscle area (CS, ZM, and OO). The results showed significant effects of time for CS, OO, and ZM activity, Fs > 6.49, ps < .03, and of emotional expression, Fs > 5.92, ps < .03. The interaction between time and expression was also significant for CS. For the EDA data, neither emotional expression nor the interaction between emotional expression and time was significant, Fs < 0.73, ps > .56. Accordingly, we did not include EDA as a predictor of haptic expression in the subsequent multilevel linear models.

Effect of Agent's Emotional Expression on Haptic Response
An analysis similar to the previous ANOVAs was conducted on the epoched pressure data averaged into second-long time bins. Significant effects of time, F(2.26, 70.18) = 10.82, p < .0001, and emotional expression, F(3.86, 119.77) = 6.77, p < .0001, and an interaction between the two, F(5.15, 159.65) = 3.08, p = .01, were found. As can be seen in Fig. 7, the clearest difference was observed between the neutral and the emotional expressions, the neutral expression resulting in a softer and delayed pressure response compared to the emotional expressions. However, as can also be seen in the figure, the emotional expressions differed from each other in their effects on pressure onset, duration, and peak amplitude.

Fig. 6. Effect of emotional expression (from 0-1700 ms) on facial EMG and evoked ECG. EMG activity was measured over the corrugator supercilii, orbicularis oculi, and zygomaticus major muscle groups. The ECG response was calculated as the change (in ms) in inter-beat interval (IBI). Red lines along the x-axes indicate significant differences in sliding 100 ms ANOVAs.

Fig. 7. Effect of emotional expressions on haptic responses. Haptic expression was measured using a pressure response pad that was squeezed in response to a tone occurring at 1,000 ms. Red lines along the x-axes indicate significant differences in sliding 100 ms ANOVAs.

To capture these emotional modulations, we extracted three parameters from the haptic expression data (response onset, response duration, and maximum pressure) and compared the effect of emotional expression on each. ANOVAs with emotional expression (anger, disgust, fear, happiness, neutral, sadness, surprise) as factor, conducted separately for each metric, revealed that emotional expression significantly affected response onset, F(4.03, 124.84) = 9.93, p < .0001, ηp² = .24, response duration, F(3.42, 106.13) = 8.19, p < .0001, ηp² = .21, and maximum pressure, F(4.09, 126.72) = 2.70, p = .03, ηp² = .08. However, as can be observed in Fig. 7, these effects could partially be ascribed to the single effect of the neutral expression eliciting a weaker response than the other expressions. To investigate differences among the emotional expressions, we repeated the same analyses on the three measures but left the neutral expression condition out of the analysis. As a result, emotional expression no longer had a significant effect on response onset, F(3.12, 96.56) = 1.07, p = .37. Response duration remained sensitive to emotional expression, F(3.68, 114.01) = 2.87, p = .03, ηp² = .08, as did maximum pressure, F(3.58, 110.92) = 4.41, p = .003, ηp² = .13. Fearful and sad expressions provoked longer responses (847 and 836 ms, respectively) than expressions of disgust and happiness (754 and 756 ms, respectively). Angry, disgusted, and fearful expressions provoked the highest pressure maxima on average (101.58, 99.57, and 97.83 μV, respectively), while sadness prompted the lowest maximum pressure (88.90 μV), followed by surprise (91.76 μV). Note that Fig. 7 presents grand averages across participants that do not easily reveal conditional peaks, as these can smear due to inter-individual variation in timing.

Predicting Haptic Responses With Physiology, User Affect, and Agent Affect
Finally, two sets of multilevel models with user affect ratings, agent affect ratings, and users' physiology as predictors were fitted to the obtained response duration and maximum pressure data, respectively. The results are summarised in Table 1, which shows the estimates (i.e., regression coefficients) and significance test results of six models: three predicting maximum pressure and three predicting response duration.
As can be seen in the table, none of the physiological responses significantly predicted maximum pressure (Model 1.1). Model 1.2 showed significant effects of user valence, χ²(1) = 17.72, p < .001, and user arousal, χ²(1) = 24.23, p < .001. Inspection of the model estimates revealed that users touched the agent more forcefully when in a negative or high-arousal emotional state (see Table 1, Model 1.2). When the agent valence and arousal ratings were examined as predictors (Model 1.3), both valence, χ²(1) = 10.10, p = .001, and arousal, χ²(1) = 15.19, p < .001, were found to be significant. Again, participants applied more force when the agent was perceived to be in a negative or aroused emotional state (see Table 1, Model 1.3). Inspection of the marginal R²s (Table 1, last row) indicated, however, that ratings of the users' own emotional state predicted the overall variation in maximum pressure (Model 1.2 marginal R² = .018) better than ratings of the agent's emotional state (Model 1.3 marginal R² = .008). Another set of models (Models 2.1-2.3) was conducted to predict response duration. Model 2.1 revealed non-significant effects of CS and ECG (ps > .532) and significant effects of ZM, χ²(1) = 10.22, p = .001, and OO, χ²(1) = 4.26, p = .04. Inspection of the model estimates revealed that lower ZM muscle activity predicted longer responses, whereas stronger OO muscle activity predicted shorter responses (see Table 1, Model 2.1). Model 2.2 revealed significant effects of user valence, χ²(1) = 4.06, p = .04, and user arousal, χ²(1) = 30.03, p < .001, on response duration. Based on the model estimates, users touched the agents longer when in a negative or high-arousal emotional state (see Table 1, Model 2.2). Model 2.3 revealed similar effects of agent valence, χ²(1) = 4.21, p = .04, and agent arousal, χ²(1) = 17.66, p < .001, on response duration. Again, users touched the agents longer when perceiving the agent to be in a negative or high-arousal emotional state (see Table 1, Model 2.3). Comparing the coefficients of determination between models (Model 2.2 marginal R² = .074 versus Model 2.3 marginal R² = .037) indicated that user affect explained the overall variance in response duration better than agent affect.

Table 1 note: marginal R² refers to the variance explained by the fixed effects, whereas conditional R² refers to the variance assigned to the fixed and random parts of the models. The intraclass correlation (ICC) indicates the ratio of variance at the two levels of analysis (within-subject σ² level and between-subjects τ00 level). *p < .05, **p < .01, ***p < .001.
The lower section of Table 1 presents the random effects (σ² and τ00) of each multilevel model together with their intraclass correlations (ICC). These values are informative as they indicate that most of the variation in haptic expressions (96-97 percent in the models predicting maximum pressure and 84-86 percent in the models predicting response duration) was due to differences between users. The proportions of explained total variance were therefore generally low (the maximum marginal R² being .016 in Model 1.2), as they were limited to the within-subject level of variation. However, the amount of explained variance at both levels increased significantly when the physiological responses and self-report measures were included in the models. Moreover, when focusing solely on the within-subject variation, the predictive power of the situational emotional variables became even more apparent. By centering the outcomes around participant means and using these centered variables as outcomes instead, we were able to limit our focus to the within-subject variation that was our main research interest. As a result, the proportion of variance explained by the models' fixed effects increased to R² = .072 in Model 1.2, to R² = .099 in Model 1.3, and to R² = .069 in Models 2.2 and 2.3, meaning that approximately 8 percent of the within-subject (i.e., trial-to-trial) variation in haptic expressions was explained by the included emotion-related predictors. This also revealed another interesting aspect of the data: while the overall variation in the haptic expression parameters was better explained by the user experience models, the within-subject variation, that is, the variation between trials, was better explained by the agent's perceived affective state. This implies that individual differences in haptic expressions partially reflect a person's overall reactivity to emotional encounters (e.g., how high or low they report their valence across trials), whereas situational variation in haptic expressions is better explained by momentarily perceived emotional cues, as reflected in the ratings of agent valence and arousal on each trial.
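The person-mean centering step can be sketched as follows, continuing in the style of the earlier mixed-model sketch; the data frame and column names are again hypothetical. Subtracting each subject's mean from their trial-level outcome removes stable between-subjects differences, so the model is fitted on trial-to-trial variation only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical trial-level data with agent (perceived) affect ratings.
rng = np.random.default_rng(2)
n = 36 * 168
df = pd.DataFrame({
    "subject": np.repeat(np.arange(36), 168),
    "agent_valence": rng.integers(1, 6, n),
    "agent_arousal": rng.integers(1, 6, n),
})
df["duration"] = (800 + 20 * df["agent_arousal"] - 10 * df["agent_valence"]
                  + rng.normal(0, 80, n))

# Person-mean centering: keep only within-subject (trial-to-trial) variation.
df["duration_c"] = df["duration"] - df.groupby("subject")["duration"].transform("mean")
within = smf.mixedlm("duration_c ~ agent_valence + agent_arousal",
                     data=df, groups=df["subject"]).fit(reml=True)
print(within.summary())
```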
As a final, exploratory analysis of the relation between emotions and touch, we conducted a mediation analysis. This used the same step-wise multilevel modelling approach as thus far, but additionally included the agent's emotional expression as a predictor in the initial step. Although a full report of the analysis is outside the scope of the present study, a description may be found in supplementary material S1, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TAFFC.2020.3038137. To summarize the results, we found evidence that self-reported user and agent affect mediated the link between emotional expressions and haptic responses, particularly in predicting maximum pressure. Thus, the agents' emotional expressions indeed influenced experienced and perceived affective states, which in turn influenced haptic responses.

Post Experiment Feedback
Descriptive analysis of responses to the social presence questionnaire [7] suggested medium levels of social presence (M = 3.29, SD = 1.27 on a scale of 1 to 5). We also asked participants to rate how much they felt they were passively participating as opposed to interacting ("I felt like I was just perceiving pictures"), how realistic the virtual environment felt ("How real did the world seem to you?"), and how realistic their actions felt ("I had a sense of acting in the virtual space rather than operating something from outside"). The responses suggested that despite relatively low realism (M = 2.37, SD = 0.94) and somewhat passive participation (M = 3.40, SD = 1.14), participants did have a sense of action (M = 3.28, SD = 1.27).

DISCUSSION
We set out to investigate whether a virtual agent's facial emotional expression affects a user's haptic response towards the agent, and whether these emotional effects were explained by the user's affective state and/or the perceived affect of the agent. In the following subsections, we summarise the key findings, discuss the contributions to the existing literature, and elaborate on possible limitations of the work and future directions.

Summary of Findings
The virtual agent's emotional expressions strongly affected users in terms of self-reported valence and arousal as well as physiological responses in facial muscle activity (electromyography) and electrodermal activity. In answer to RQ1, we found that perceiving the emotional expressions of agents affected users' haptic responses, altering both the duration and the intensity of their touch. Critically, the effect of emotional facial expressions on haptic responses occurred even though the participants were asked to keep the touch constant, always squeezing the agent's arm with a gentle pressure of one second. In answer to RQ2, we demonstrated that both the users' own emotional states and those they perceived the agent as expressing predicted haptic responses in terms of duration and intensity. While physiological responses did not predict haptic responses to the same extent, facial muscle activity was found to correlate with touch duration. In the following, we examine the key findings in detail.

Contributions to the Existing Literature
Perceiving emotions in others is known to shape emotions and social behaviours in the observers. This is believed to happen via two mechanisms: 1) appraisals of the other's feelings and intentions and 2) non-reflective emotional contagion [58]. In our study, the agent's emotional expression was found to be reflected in users' self-reported affect and their emotion-related physiological responses. Specifically, we found that if the agent expressed negative emotions (i.e., sadness, anger, disgust, or fear), users reported lower valence and showed attenuated ZM ('smiling' muscle group) and increased CS ('frowning') activity. Since there was no contextual framing for the interaction (such as collaboration or decision-making), the findings likely reflect emotional contagion between the agent and the user, confirming findings from previous studies [44].
Whether by directly reflecting another's emotions (emotional contagion) or by cognitively appraising them, the present study shows that perceived emotions do not merely change perception or affect, but also our haptic expression. Theories of emotion have long held that emotions are defined not just in terms of feelings, but also in terms of action tendencies [18]: perceiving emotions implies potential actions (cf. affordances). According to ideomotor theory, perceiving behavioural intentions results in pre-activation of the associated actions, priming responses in observers even without their awareness [1], [8], [34]. Current proponents of ideomotor theory argue that emotional priming may similarly elicit associated responses, causing a person to move towards or away from the emotional stimulus [1], [9], [42].
Yet, while this theory accounts for emotional perception affecting haptic responses in principle, it does not clearly explain the specific pattern of differences in haptic expression between the emotional expression conditions. For instance, we found that the emotional effect on the haptic response did not map onto the approach-withdrawal continuum. Indeed, response duration was longest when the agent expressed fear or sadness and shortest when disgust or happiness were shown. Nor do the results map onto the dimensions of arousal or valence, as the responses were most forceful when angry, fearful, and disgusted expressions were shown and weakest when sadness or surprise were expressed. The findings thus differ from previous observations on touch perception [45], in which a virtual agent's expressions of high-arousal emotions (anger, fear, and happiness) made the agent's touch feel more intense.
The observed link between physiological and haptic responses was likewise complicated. While none of the physiological indices predicted touch intensity, the activity of facial muscles involved in smiling was found to predict response duration. Specifically, attenuated ZM activity predicted longer responses. Amplified ZM activity has been systematically related to positive emotional states and smiling [13]. Since ZM activity is known to attenuate in response to negative events [35], it is possible that the attenuation of ZM reflected negative valence, which in turn resulted in prolonged touch duration.
The findings became clearer when examining how experienced and perceived affect predicted haptic responses. First, when reporting a negative or high-arousal affective state, users were found to use more force while touching the agent. More force was also used when touching an agent that was perceived to be in a negative or high-arousal state. Touch duration was likewise longer when the person was in a negative or high-arousal state, or when perceiving the agent to be in such a state. Perceived and experienced negative, high arousal thus predicted more intense and longer touches. Therefore, while the differences between emotional expressions did not reveal the haptic response to be linked to valence or arousal, a clear link was found when examining the affective dimensions as felt and perceived by participants. Finally, a mediation analysis suggested that the reported affective dimensions indeed mediated the link from emotional facial expressions to haptic expressions. In previous studies examining finger strokes on mobile devices, affective valence and arousal have been distinguishable from users' haptic responses using computational clustering methods [21], [40], [49]. The current findings suggest that valence and arousal are likewise distinguishable from haptic responses expressed spontaneously in virtual social interactions.
When utilising the multilevel linear model's capability to differentiate between within- and between-subjects variation in the haptic response measures, we found that most of the variation in both response duration and intensity was due to individual differences. However, since these individual differences (presumably arising from anatomical and force-related differences between individuals) were not of interest to us, we carried out an exploratory analysis focusing solely on the within-subject (trial-to-trial) variation in haptic response. Interestingly, the analysis revealed that while the overall variation (within- and between-subjects levels combined) in haptic responses was better explained by user valence and arousal ratings, the within-subject variation was better explained by the perceived valence and arousal of the agent. In other words, individual differences in haptic responses were related to a user's overall reactivity to emotional encounters, whereas the trial-to-trial variation in haptic responses was sensitive to momentarily perceived emotional cues. This underscores the important role of individual differences in the effects of social touch [27], suggesting that individual differences in affective responsiveness determine interpersonal touch behaviour.
Taken together, our study shows that measuring a user's haptic response in social VR provides a rich source of information for affective computing. It reveals both how the dynamic qualities of virtual agents provoke emotional experiences in the user and how the user perceives and interprets the agent's state. This implicitly measured information can expose the ongoing experience of users in social VR, enabling the development of adaptive systems that respond to the user's emotions as sensed through haptic interaction.

Limitation and Future Work
Our haptic input device was designed using low-cost components obtained from a commercial blood pressure measurement device, with a limited capability to detect haptic features. More complex features, such as spatial or frequency-related haptics, might enhance sensitivity to specific emotion-related aspects, such as the motivation to withdraw or approach. The selected sensors may thus have limited the scope of emotionally relevant aspects observable in interpersonal touch behaviour. However, given that this basic device already shows significant capability in detecting the fluctuations caused by perceiving emotions, advancements in sensing can only consolidate the reliability of the findings. Furthermore, the ease of engineering and the affordability of the input device allow other research groups to easily replicate or follow up our research.
Beyond purely technical limitations, a dominant feature of the haptic responses (Fig. 7) was the large difference in pressure data between the neutral and all emotional expressions. This observation raises the question of whether the emotional effects on haptic responses mainly reflect the distinction between emotional and neutral states rather than qualitative differences between emotions. While possible, the observed difference between the neutral and other expressions may also reflect a slower squeezing response in the neutral condition, perhaps due to the absence of the dynamic visual changes that cued the touch onset in the emotional expression conditions. Alternatively, showing an emotional expression of any kind could have facilitated response times in comparison to the emotionally neutral condition, as suggested by previous research [26], [36]. However, even after removing the neutral expression condition from the analysis, we found emotion-induced differences in response duration and intensity. It is thus clear that at least these two response features were sensitive to qualitative differences between emotional cues.
Finally, while we believe that our social VR setting was effective in bringing forth the link between emotions and haptic response, there are constraints on how far the findings should be generalised. For instance, it is not clear whether even a finger stroke on a touch screen would be similarly affected by the appearance of emotional expressions. Also, because participants were requested to touch the agent upon hearing a tone, it is unclear whether the findings would replicate in a free-choice scenario. More research should therefore establish the boundary conditions before inferring that the findings generalise to more naturalistic haptic communication scenarios.

CONCLUSION
As being touched modifies our feelings, so do our feelings modify the way we touch others. Our study shows that the way we touch a virtual agent's arm is affected by how we feel and by how we perceive the agent to feel. Even a basic, affordable haptic interface is thus shown to provide remarkably rich information on a user's ongoing emotional state. As yet, we remain unsure whether the findings generalise to non-social scenarios or to applications outside VR. If they do, similar pressure sensors could be used to infer a person's emotional reactions and stress levels in various demanding tasks. But even if not, affective agents will likely become ubiquitous in our social and virtual lives, which will only enhance the importance of detecting implicit emotions and the user-agent interaction experience.

Imtiaj Ahmed received the master's degree in computer science from the University of Helsinki, Helsinki, Finland, in 2013, and is currently working toward the doctoral degree in computer science. His main field of research is HCI/HTI. He is an expert on multimodal interaction, affective interaction, VR/AR/MR/XR, and haptics.
Ville J. Harjunen received the PhD degree in psychology from the University of Helsinki, Helsinki, Finland, in 2019. He is currently a postdoctoral researcher with the Research Group of Emotional Interaction and eHealth (EIEH) and investigates emotional information processing, affective communication in VR and neuroscience of affect and social interaction.
Giulio Jacucci received the PhD degree in information processing science from the University of Oulu, Oulu, Finland. He is currently a professor of computer science with the University of Helsinki, where he chairs the UIX Group and leads research in ubiquitous computing, intelligent user interfaces, and affective interaction.
Niklas Ravaja received the PhD degree in psychology from the University of Helsinki, Helsinki, Finland, in 1996. He is currently a professor of eHealth and well-being, and is an expert on emotional and physiological processes during mediated social interaction.
Tuukka Ruotsalo received the PhD degree in 2010. He is currently an academy research fellow and associate professor, and an expert on interactive machine learning and cognitive computing.
Michiel M. Spapé received the PhD degree in psychology from Leiden University, Leiden, The Netherlands, in 2009. He is currently a senior researcher with the EIEH Group and a docent in cognitive neuroscience, and specializes in perception-action integration, emotion, and psychophysiology.