Behavioral, Peripheral, and Central Neural Correlates of Augmented Reality Guidance of Manual Tasks

Objective: The use of commercially available optical-see-through (OST) head-mounted displays (HMDs) within the user's peripersonal space exposes the user to two perceptual conflicts that deteriorate performance in precision manual tasks: the vergence-accommodation conflict (VAC) and the focus rivalry (FR). In this work, we aim to characterize, for the first time, the psychophysiological response associated with the user's incorrect focus cues during the execution of an augmented reality (AR)-guided manual task with the Microsoft HoloLens OST-HMD. Methods: Twenty-one subjects underwent a “connecting-the-dots” experiment with and without the use of AR, and in both binocular and monocular conditions. For each condition, we quantified the changes in the subjects' autonomic nervous system (ANS) activity by analyzing the electrodermal activity (EDA) and heart rate variability. Moreover, we analyzed the central neural correlates by means of power measures of brain activity and multivariate autoregressive measures of brain connectivity extracted from the electroencephalogram (EEG). Results: No statistically significant differences in ANS correlates were observed among tasks, although all EDA-related features varied between rest and task conditions. Conversely, significant differences among conditions were present in terms of EEG-power variations in the $\mu$ (8–13) Hz and $\beta$ (13–30) Hz bands. In addition, significant changes were observed in the causal interactions of a brain network involved in motor movement and eye-hand coordination, comprising the precentral gyrus, the precuneus, and the fusiform gyrus. Conclusion: The physiological plausibility of our results suggests promising future applicability to the investigation of more complex scenarios, such as AR-guided surgery.


I. INTRODUCTION
Augmented reality (AR) devices implementing optical-see-through (OST) technology allow the overlaying of computer-generated imagery on the real-world egocentric view of the user. The real view and the virtual content are merged by rendering the latter on a two-dimensional (2-D) microdisplay. Specifically, collimating lenses, placed between the microdisplay and the optical combiner, focus the 2-D virtual image so that it appears at a predefined and comfortable viewing distance on a virtual image plane (i.e., the focal plane of the display) [1]. The implementation of such technology on wearable devices can provide the user with a hands-free setup, which is particularly useful for performing manual tasks of different kinds [2].
However, this fusion of 2-D and three-dimensional (3-D) content can lead to a perceptual conflict between the 2-D virtual content on the surface of projection and the 3-D real world [2]. More specifically, both vergence-accommodation conflict (VAC) and focus rivalry (FR) phenomena can occur. VAC is a visual phenomenon that arises when the brain receives conflicting cues during binocular vision. In particular, in the OST paradigm, the virtual image is focused at a fixed depth away from the eyes whereas real-world objects are not, resulting in conflicting information within the vergence-accommodation feedback loop [3]. This difference in the focal distances of virtual and real objects also leads to FR, whose effect is to constrain the subject to selectively focus on a single cue (i.e., the virtual or the real content) [4]. Indeed, the human visual system cannot focus on more than one focal plane at a time. As a result, when a target lies at a distance from the fixation point of the eyes that falls outside the human depth of field, the viewer perceives it as blurred. All these issues are further emphasized when using commercial OST head-mounted displays (HMDs) to guide manual tasks [2]. Indeed, the focal distance of such devices is typically around 2–3 m; therefore, applications in which the real content is located within the user's peripersonal space can lead to VAC- and FR-related discomfort due to the distance gap between the 2-D virtual image and the real object.
A recent study has quantified the effects of VAC and FR in a simple "connecting-the-dots" task performed in an AR environment [2]. Interestingly, the authors observed that both VAC and FR during binocular vision, as well as FR during monocular vision, deteriorated task performance. VAC and FR effects are usually assessed in the scientific literature by using questionnaires [2], [5], [6]. However, collecting information through self-reporting has several limitations and can be either consciously or unconsciously biased. In this context, more objective measures of visual discomfort and fatigue can be obtained from physiological signals, such as electro-oculogram (e.g., by detecting blink rates [7]), eye-tracking (e.g., by analyzing eye movements [8], [9]), and electroencephalography (EEG) (e.g., by exploiting changes in EEG power [10]). Indeed, physiological signals are less prone to subjective bias compared to self-assessed reports. Our preliminary study investigated mental workload during an analogous "connecting-the-dots" task by analyzing the EEG frontal-alpha-asymmetry index [11]. Preliminary results suggested that performing the task using AR can be more demanding than performing it with natural vision. However, a fully detailed characterization of the physiological changes induced by AR is still missing. Indeed, to properly characterize a user's psychophysiological state, distinct measures of different bodily responses need to be considered, ranging from autonomic nervous system (ANS) correlates to brain activity and connectivity measures.
Fig. 1. Experimental setup and example of the experimental timeline. Subjects were equipped with EEG, ECG, and EDA sensors, and with a 1st Generation Microsoft HoloLens OST-HMD AR device. During each experimental condition, subjects performed three trials of a "connecting-the-dots" task followed by 30 s of rest. Markers of pencil press (green dashed lines) and release (red dashed lines) were collected with an ad hoc capacitive touch sensor and time-locked to the recorded signals. Conditions were pseudorandomized across subjects. At the end of each EYE-modality block, subjects filled in a LIKERT-type questionnaire. At the beginning and at the end of the experiment, subjects filled in the STAI questionnaire. Subjects performed the experiment by drawing lines on a paper placed on a vertical support at a distance of 0.5 m. The AR device was secured on the head of the subjects through a metal frame support.
To this aim, we provide here, for the first time, a multimodal characterization of the psychophysiological response during a "connecting-the-dots" task performed with and without AR, and in both binocular and monocular vision. Specifically, we monitor changes during the AR task in subjects' autonomic response, by focusing on electrodermal activity (EDA) and heart rate variability (HRV) signals, and we analyze the central neural correlates of AR use by considering several measures derived from the EEG. EDA and HRV are two of the most commonly used ANS correlates to infer a subject's psychophysiological state under several controlled conditions, such as stress, fear, and many others [12], [13], [14]. Moreover, in this study, we analyze EEG to quantify changes in power at the scalp level and, for the first time, to characterize the brain sources participating in task completion and the changes in the connectivity among such sources across the different experimental conditions. Connectivity is evaluated by means of multivariate autoregressive (MVAR) models applied to independent component (IC) timecourses [15], [16], [17]. We also introduce another novelty by analyzing physiological responses in an event-related fashion. Indeed, the "connecting-the-dots" task is by its nature an event-related experiment made of several press and release events. Hence, all methods are applied to study the physiological response immediately after press events and during the line drawing. The efficacy of the stimulation protocol is assessed by analyzing behavioral measures and subjective ratings of users.

A. Subjects
The study was conducted in accordance with the guidelines of the Declaration of Helsinki and approved by the Bioethics Committee of the University of Pisa (Review No. 14/2019, May 3rd, 2019). All subjects gave their informed consent to take part in the study.
Twenty-one healthy volunteers (age 27.5 ± 4.5, four females, all right-handed) were enrolled to participate in the study. All subjects had normal or corrected-to-normal visual acuity and limited previous experience with AR devices.

B. Experimental Protocol
The experimental setup and an example of the experimental timeline are reported in Fig. 1. After a resting period of 180 s, subjects performed a "connecting-the-dots" precision task under four different experimental conditions as follows.
1) Binocular AR task (hereinafter AR-bin).
2) Monocular AR task (hereinafter AR-mono).
3) Binocular naked-eye task (hereinafter NK-bin).
4) Monocular naked-eye task (hereinafter NK-mono).
These were purposely designed to study the influence of both AR and eye (hereinafter called EYE) modalities on the subject's performance. Each condition consisted of three different trials of equal complexity. The latter was controlled by designing paths of equivalent total length and with an equal number of segments for each trial. Conditions were interleaved with 30 s of rest.
During tasks performed with AR, random dots were projected on paper using a 1st Generation Microsoft HoloLens HMD (see Section II-C for details), whereas, for naked-eye tasks, dots were simply printed on paper. The paper was placed on a vertical support at a common distance of ∼0.5 m from the eyes of the subjects, inducing visual discomfort and fatigue due to the mismatch with the focal distance of the glasses (i.e., ∼2 m). This was purposely done to simulate those situations in which the AR glasses are used under suboptimal conditions, for instance in AR-aided surgery tasks. During the experiment, subjects were asked to sit on a comfortable chair and to connect the dots by using a pencil. Moreover, subjects were asked to detach the pencil from the paper at the end of each drawn segment. The AR device was mounted on a metal frame support (and not on the subject's head) to limit the potential movements of the glasses during the experiment (see Fig. 2).
To identify the time points at which the subject started drawing each segment (i.e., pressed with the pencil on the paper), we designed and built an in-house capacitive touch sensor. The sensor was interfaced with the EEG triggering system through an Arduino MKR WIFI 1010 microcontroller and an ad hoc analog front-end circuit. Accordingly, we were able to determine the time points at which subjects pressed and released the pencil. These two events were tagged as press and release.
For each subject, conditions were pseudorandomized first according to the EYE modality, and then according to the AR modality. Hence, combinations such as (AR-bin, NK-mono, NK-bin, AR-mono) or (NK-bin, AR-mono, AR-bin, NK-mono) were avoided. Intertrial intervals were given by the time required by the experimenter to change the paper for each trial. Inter-block intervals, instead, were fixed at a duration of 30 s. At the end of each EYE-modality block, subjects were asked to fill in a LIKERT-type questionnaire aimed at measuring visual discomfort, fatigue, and perception associated with the AR "connecting-the-dots" task [18]. Finally, before and after the experiment, subjects filled in the state-trait anxiety inventory (STAI) questionnaire [19].
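The ordering constraint described above can be illustrated with a minimal sketch, under the assumption that "first according to the EYE modality, and then according to the AR modality" means that the two conditions sharing an EYE modality are always run back to back. The function names (`allowed_orders`, `draw_order`) are hypothetical and are not taken from the study.

```python
import itertools
import random

EYE_MODALITIES = ("bin", "mono")
AR_MODALITIES = ("AR", "NK")


def allowed_orders():
    """List every condition order that keeps each EYE-modality block contiguous."""
    orders = []
    for eye_order in itertools.permutations(EYE_MODALITIES):
        # The AR/NK order is chosen independently inside each EYE block.
        for first_ar, second_ar in itertools.product(
            itertools.permutations(AR_MODALITIES), repeat=2
        ):
            first_block = [f"{ar}-{eye_order[0]}" for ar in first_ar]
            second_block = [f"{ar}-{eye_order[1]}" for ar in second_ar]
            orders.append(tuple(first_block + second_block))
    return orders


def draw_order(rng):
    """Pick one allowed order at random for a subject."""
    return rng.choice(allowed_orders())


orders = allowed_orders()
subject_order = draw_order(random.Random(0))  # seed fixed only for reproducibility
```

Under this reading, eight orders are admissible (two EYE-block orders times two AR orders within each block), and the two combinations quoted in the text are excluded because they split an EYE block.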

C. AR Device
We used the 1st Generation Microsoft HoloLens OST-HMD as the AR device to display virtual dots during the experiment. HoloLens features self-contained computing power, based on an undisclosed Intel 32-bit processor, 2 GB of RAM and 64 GB of flash memory, and a custom-built Microsoft Holographic Processing Unit (HPU 1.0), which supports Universal Windows Platform apps. Furthermore, HoloLens allows for wireless communication, featuring both Wi-Fi 802.11ac and Bluetooth 4.1 LE wireless technology. The sensory system of the device includes: one depth camera, four grayscale tracking cameras, one world-facing photo/video camera (2 MP), one ambient light sensor, one inertial measurement unit to track head movements, and four microphones.
Fig. 2. Example of the experimental setup for one subject and of the gap estimate. The AR device was mounted on a metal frame support limiting movements of the glasses during the experiment. A "connecting-the-dots" task was performed on a paper placed on a vertical support in front of the subject. During the NK modality, the dots were printed on a paper, whereas during the AR modality, the dots were shown to the subjects through the AR device. An example of the gap (G_ij) for (i = 2, j = 3) and (i = 8, j = 9) is reported in the bottom-right panel.

D. Connecting-the-Dots Performance Analysis
For each experimental condition, we analyzed subjects' performance in completing the task by evaluating the gap (G_ij) between the ending and starting points of each pair (i, j) of consecutive lines (see Fig. 2). We purposely focused on this measure because of its independence from AR device virtual-to-real calibration errors [2]. Accordingly, we reduced the impact on task performance evaluation of possible distortions of the perceived visual content resulting in the misperception of line lengths. To this aim, we first detected line endpoints automatically by using the Harris corner detector [20] in MATLAB R2017b [21]. Then, for each trial, we estimated the maximum and mean error gaps (i.e., max error and mean error, respectively).
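As an illustration of the two error measures, the sketch below computes the gaps from a list of already detected segment endpoints. It assumes that G_ij is the Euclidean distance between the end of line i and the start of line j = i + 1; the endpoint detection step (Harris corner detector) is not reproduced, and the coordinates and units are hypothetical.

```python
import math


def gap_errors(segments):
    """Return (max error, mean error) over the gaps between consecutive segments.

    segments: list of ((x_start, y_start), (x_end, y_end)) in drawing order.
    """
    gaps = [
        math.dist(previous[1], following[0])  # end of line i to start of line j
        for previous, following in zip(segments, segments[1:])
    ]
    return max(gaps), sum(gaps) / len(gaps)


# Toy trial with three drawn segments (illustrative coordinates only).
segments = [
    ((0.0, 0.0), (10.0, 0.0)),
    ((10.0, 3.0), (10.0, 10.0)),   # starts 3 units away from the previous end
    ((14.0, 10.0), (20.0, 10.0)),  # starts 4 units away from the previous end
]
max_error, mean_error = gap_errors(segments)  # 4.0 and 3.5
```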
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

E. Physiological Data Acquisition
1) Peripheral Signals: We acquired EDA and ECG signals during the experiment. The EDA was recorded using a Shimmer 3 GSR+ unit at a sampling rate of 100 Hz. The recording was performed by placing two electrodes on the distal phalanx of the index and middle fingers of the subjects' nondominant hand. The ECG signal, instead, was acquired with the peripheral physiological measurement module of a 128-channel Geodesic EEG System 300 from Electrical Geodesics, Inc. (EGI). The recording was performed at a sampling rate of 500 Hz, with two electrodes positioned below the right and left clavicle, and a reference electrode that corresponded to the Cz channel of the EEG cap.
2) Electroencephalogram: We acquired EEG data using a 128-channel Geodesic EEG System 300 from EGI. We kept electrode impedances below 20 kΩ for the entire duration of the experiments. Channels were referenced to the Cz electrode during the acquisition. We used a sampling rate of 500 Hz.

F. Physiological Data Analysis
1) ANS Correlates: We extracted several features from the EDA and electrocardiographic (ECG) signals to monitor subjects' changes in ANS activity.
The EDA signal reflects changes in the skin conductance induced by sweat glands' activity. Since sweat glands are under the sole control of the sympathetic branch of the ANS (i.e., the sympathetic nervous system, SNS), the analysis of EDA is a reliable way to infer SNS dynamics [14]. The EDA signal can be seen as the sum of two components that carry nonoverlapping information: a tonic component, which is a slow-varying component whose spectrum is below 0.05 Hz, and a phasic component, which reflects short-term stimulus-evoked responses. Here, we took advantage of the cvxEDA model to extract the tonic and phasic components of the EDA signal, as well as the sudomotor nerve activity (SMNA) generating phasic responses [22]. Specifically, we downsampled EDA signals to a sampling rate of 50 Hz and normalized them to have zero mean and unit variance before applying the cvxEDA model. Then, starting from the estimated tonic, phasic, and SMNA components, we derived several features related to SNS activity, namely: the mean and the standard deviation of the tonic (i.e., Mean Tonic and Std Tonic), the mean of the phasic (i.e., Mean Phasic), the number of peaks (i.e., N peaks), the maximum peak (i.e., Max Peak), and the sum of peaks (i.e., AmpSum) of the SMNA, and the spectral power in the (0.045–0.25) Hz band (i.e., EDASYMP). More specifically, tonic-related features were estimated in the 35 s after the start of each trial, whereas phasic-related features were time-locked to the press events and estimated in the 5 s following the start of each drawn segment. Such a choice of different time ranges for feature extraction was purposely made to capture the different dynamics of tonic and phasic components.
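The normalization and windowing steps can be sketched as follows. This is only an illustration on a synthetic trace: the cvxEDA decomposition is not reproduced (in the study the windows are applied to its tonic, phasic, and SMNA outputs), the use of the population standard deviation is an assumption, and the helper names are hypothetical.

```python
import statistics

FS_HZ = 50            # sampling rate after downsampling (from the text)
TONIC_WINDOW_S = 35   # window after the start of each trial (from the text)
PHASIC_WINDOW_S = 5   # window after each press event (from the text)


def zscore(signal):
    """Rescale a signal to zero mean and unit variance."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)  # population SD: an assumption
    return [(value - mean) / std for value in signal]


def window(signal, start_idx, seconds, fs=FS_HZ):
    """Return the samples in [start_idx, start_idx + seconds * fs)."""
    return signal[start_idx:start_idx + int(seconds * fs)]


# Synthetic 60-s trace: a slowly rising ramp (illustrative values only).
eda = [0.01 * k for k in range(60 * FS_HZ)]
eda_z = zscore(eda)
tonic_segment = window(eda_z, start_idx=0, seconds=TONIC_WINDOW_S)
phasic_segment = window(eda_z, start_idx=10 * FS_HZ, seconds=PHASIC_WINDOW_S)
```

At 50 Hz, the 35-s and 5-s windows correspond to 1750 and 250 samples, respectively.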
The ECG signal was analyzed with Kubios HRV [23]. Briefly, the RR series were extracted from the ECG using the Pan-Tompkins algorithm [24]. Peak-detection artefacts were removed by applying a cubic spline interpolation method, and the obtained RR series were resampled to 4 Hz to derive the HRV signal [25]. Starting from the HRV, we extracted several features in the time, frequency, and nonlinear domains, which were representative of ANS dynamics. Unlike EDA, HRV features are influenced by both sympathetic (SNS) and parasympathetic (PNS) nervous system activity. As for the tonic-related features, we estimated all HRV features in the 35 s after the start of each trial. These were: the mean and the standard deviation of the HRV (i.e., Mean HRV and Std HRV), the square root of the mean-squared differences of successive normal-to-normal (NN) intervals (i.e., RMSSD), the percentage of pairs of adjacent NN intervals differing by more than 50 ms (i.e., pNN50), the power expressed as a percentage of total power in the low-frequency (i.e., LF, 0.04–0.15 Hz) and high-frequency (i.e., HF, 0.15–0.40 Hz) ranges, the ratio of LF to HF power (i.e., LF/HF ratio), and the minor (i.e., SD1) and major (i.e., SD2) axes of the ellipse that best fits the Poincaré plot of RR intervals.
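For reference, the time-domain and Poincaré features listed above can be written compactly. The sketch below uses the standard textbook definitions and is not the Kubios HRV implementation; the sample-variance convention and the RR values are assumptions made for illustration.

```python
import math
import statistics


def hrv_features(rr_ms):
    """Compute RMSSD, pNN50, SD1 and SD2 from a series of RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(statistics.fmean([d * d for d in diffs]))
    pnn50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)
    var_rr = statistics.variance(rr_ms)     # sample variance: an assumption
    var_diff = statistics.variance(diffs)
    sd1 = math.sqrt(0.5 * var_diff)                  # minor axis of the ellipse
    sd2 = math.sqrt(2.0 * var_rr - 0.5 * var_diff)   # major axis of the ellipse
    return {"RMSSD": rmssd, "pNN50": pnn50, "SD1": sd1, "SD2": sd2}


# Illustrative RR series in milliseconds (not data from the study).
rr = [800, 810, 790, 850, 845, 900]
features = hrv_features(rr)
```

With only 35 s of data per trial, each window holds a few tens of beats, so short-window estimates of this kind are expected to be noisy.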
In addition, all features were also estimated during the initial resting condition. In this case, we simply segmented the signals into 35 s and 5 s windows to match the estimates obtained in the task-related conditions.
2) EEG Data Analysis: EEG data were analyzed in terms of power and connectivity measures with EEGLAB and MATLAB custom scripts [26]. To this aim, we first preprocessed EEG signals with a pipeline comprising: 1) downsampling to 125 Hz (after applying a low-pass antialiasing filter); 2) high-pass filtering (filter cutoff 1 Hz); 3) bad channel removal based on a correlation criterion [27]; 4) ICA decomposition and selection of components related to brain activity [28]; and 5) estimation of the equivalent current dipole associated with each brain-related component [29], [30]. Cleaned signals were segmented into epochs centred around press events and ranging from -0.5 to 2 s. A subtractive baseline ranging from -0.5 to 0 s was removed from all epochs. Finally, for each experimental condition, we computed average EEG epochs.
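The epoching, baseline subtraction, and averaging steps can be sketched for a single channel as follows. This is not the EEGLAB pipeline used in the study: the function names are hypothetical, and because 0.5 s at 125 Hz is 62.5 samples, the truncation to 62 samples is an assumption.

```python
def epoch_and_baseline(signal, press_samples, fs=125, t_pre=0.5, t_post=2.0):
    """Cut epochs around press events and subtract the mean of the pre-press interval."""
    n_pre, n_post = int(t_pre * fs), int(t_post * fs)
    epochs = []
    for press in press_samples:
        if press - n_pre < 0 or press + n_post > len(signal):
            continue  # skip events too close to the recording edges
        segment = signal[press - n_pre:press + n_post]
        baseline = sum(segment[:n_pre]) / n_pre
        epochs.append([value - baseline for value in segment])
    return epochs


def average_epochs(epochs):
    """Average epochs sample by sample."""
    return [sum(column) / len(column) for column in zip(*epochs)]


# Synthetic channel: constant offset plus a 2-unit step after each press.
signal = [5.0] * 2000
presses = [10, 500, 1200]  # the first event lacks a full baseline and is skipped
for press in presses[1:]:
    for k in range(press, press + 100):
        signal[k] += 2.0
epochs = epoch_and_baseline(signal, presses)
evoked = average_epochs(epochs)
```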
EEG responses were also analyzed in terms of effective brain connectivity by describing the interactions among subject-specific ICA-related EEG sources with MVAR models [31]. Operationally, we estimated MVAR models with a sliding-window approach as implemented in the Source Information Flow Toolbox [32], [33]. Sliding windows were 500 ms long and the window step size was equal to 20 ms. Then, starting from model coefficients, we calculated the renormalized partial directed coherence (RPDC, [34]) in 30 log-scaled frequency bins from 1 to 45 Hz. Accordingly, we obtained multivariate causal estimates among sources at specific frequencies of interest. As a result, for each subject, we obtained a spectrotemporal connectivity matrix RPDC(i, j, t, f), with (i, j) representing network targets and sources, and (t, f) representing the time window and the frequency at which the interaction occurs.
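To give a sense of the resulting time-frequency grid, the sketch below enumerates the sliding windows and the log-spaced frequencies. It assumes that the windows tile the same -0.5 to 2 s epoch used for the power analysis and that "log-scaled" means geometrically spaced bins; the exact grid produced by the toolbox may differ.

```python
EPOCH_START_MS, EPOCH_END_MS = -500, 2000  # epoch limits (assumed equal to the EEG epochs)
WINDOW_MS, STEP_MS = 500, 20               # window length and step (from the text)
N_FREQS, F_MIN, F_MAX = 30, 1.0, 45.0      # frequency grid (from the text)


def window_starts():
    """Start times (ms) of all windows that fit entirely inside the epoch."""
    n_windows = (EPOCH_END_MS - EPOCH_START_MS - WINDOW_MS) // STEP_MS + 1
    return [EPOCH_START_MS + k * STEP_MS for k in range(n_windows)]


def log_spaced_frequencies():
    """Geometrically spaced frequencies between F_MIN and F_MAX (inclusive)."""
    ratio = F_MAX / F_MIN
    return [F_MIN * ratio ** (k / (N_FREQS - 1)) for k in range(N_FREQS)]


starts = window_starts()
freqs = log_spaced_frequencies()
```

Under these assumptions, each source pair (i, j) is described by a grid of 101 time windows by 30 frequencies.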
Group-level connectivity was obtained by projecting subject-specific RPDC matrices onto a common space across all subjects. To this aim, the dipole locations of ICs were transformed into probabilistic dipole densities smoothed with a 3-D Gaussian kernel with a full-width at half maximum of 20 mm [16]. Then, the dipole density was segmented into 76 anatomical cortical regions of interest (ROIs) defined by the automated anatomical labeling (AAL) atlas [35]. Afterward, we weighted subject-specific RPDC with dipole densities to obtain a 76 × 76 × t × f connectivity matrix. Finally, we thresholded subject-specific weighted connectivity matrices by truncating the Gaussian distribution at a value of three standard deviations, and considered a group-level connection only if it was present in at least 80% of participants [16].
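Two of the numerical choices above lend themselves to a short worked example: the conversion of the kernel width from full-width at half maximum to standard deviation, and the 80% group-consistency rule. The sketch below is illustrative only; the edge-presence data are invented, and only the ROI abbreviations are taken from the paper.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full-width at half maximum to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def group_edges(presence, min_fraction=0.8):
    """Keep the edges present in at least min_fraction of the subjects.

    presence: dict mapping (source, target) to a list of booleans, one per subject.
    """
    return [
        edge for edge, flags in presence.items()
        if sum(flags) >= min_fraction * len(flags)
    ]


sigma_mm = fwhm_to_sigma(20.0)  # about 8.49 mm for the 20-mm kernel

# Invented example with 21 subjects: the first edge is present in 17, the second in 16.
presence = {
    ("PCG", "PCu"): [True] * 17 + [False] * 4,
    ("PCu", "UB"): [True] * 16 + [False] * 5,
}
kept = group_edges(presence)
```

With 21 participants, the 80% rule corresponds to a connection being present in at least 17 subjects.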

G. Statistical Analysis
We performed different statistical analyses to investigate how AR influenced subjective, behavioral, peripheral, and central physiological measures. For all the statistical comparisons performed, we considered a significance level of α = 0.05.
1) STAI Scores: We analyzed the differences in STAI metrics measured before and after the experiment with the Wilcoxon signed-rank test to check whether performing the experiment negatively affected the anxiety level of subjects.
2) LIKERT-Type Questionnaire: We analyzed differences in subjects' ratings of visual discomfort, fatigue, and perception as measured by the LIKERT-type questionnaire to evaluate whether the EYE modality changed any of these measures while using AR during the "connecting-the-dots" task. Operationally, we tested for group differences with a Wilcoxon signed-rank test on the score of each question, followed by false discovery rate (FDR) correction for multiple hypothesis testing [36].
3) Task Performance: We studied how task performance varied among different experimental conditions by analyzing the maximum and mean errors on gaps (see Section II-D) with multiple paired t-tests. More specifically, we compared the max error and mean error between experimental conditions (i.e., AR-bin, AR-mono, NK-bin, NK-mono) but without considering cross-modality comparisons: i.e., AR-bin versus NK-mono and AR-mono versus NK-bin were not included in the analysis. Multiple hypothesis testing was controlled with FDR correction [36].
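The FDR step can be illustrated with a minimal sketch, assuming that the correction follows the Benjamini-Hochberg step-up procedure; the text cites [36] without naming the procedure, so this choice is an assumption. The function name and the p-values are hypothetical.

```python
def fdr_bh(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: return one reject flag per p-value."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis whose rank does not exceed k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


# Illustrative p-values (not results from the study).
decisions = fdr_bh([0.001, 0.045, 0.035, 0.20, 0.012])
# Step-up behaviour: a larger p-value that passes its threshold also rescues smaller ranks.
decisions_step_up = fdr_bh([0.02, 0.021, 0.022])
```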

4) ANS Correlates: We performed an exploratory statistical analysis to assess at the group level how tasks differed in terms of ANS correlates. To this aim, we performed a 1 × 4 ANOVA to test the null hypothesis (H0) of no differences among the four conditions (NK-bin, NK-mono, AR-bin, AR-mono) for each EDA and HRV feature. In addition, we analyzed differences in ANS features with respect to the initial resting state by performing multiple paired t-tests (one for each feature) between each experimental condition and the resting condition. Multiple hypothesis testing was controlled using the FDR correction [36].

5) EEG Neural Correlates: Group differences were also analyzed in terms of EEG correlates of power and connectivity. We performed a 1 × 4 ANOVA to test the null hypothesis (H0) of no differences among the PSD values during the four conditions (NK-bin, NK-mono, AR-bin, AR-mono). The statistical significance of the ANOVA was assessed with a Monte Carlo permutation method (number of permutations = 10 000) followed by cluster correction [29], [37]. Post hoc analyses were carried out by means of paired t-tests followed by Bonferroni correction.
In addition, for each condition, we evaluated significant differences in connectivity by comparing the observed RPDC values after press events with their averages during the baseline, as well as by comparing their values between experimental conditions through multiple paired t-tests. The statistical significance of such comparisons was assessed with a Monte Carlo permutation method (number of permutations = 10 000) followed by cluster correction [29], [37].
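The logic of a Monte Carlo permutation test for a paired comparison can be sketched for a single time-frequency point as follows. This is a simplified illustration, not the procedure of [29], [37]: it assumes a sign-flip permutation scheme on the paired differences, omits the cluster correction entirely, and uses invented data and hypothetical function names.

```python
import math
import random
import statistics


def paired_t(diffs):
    """Paired t-statistic computed on within-subject differences."""
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        return math.inf if mean != 0.0 else 0.0
    return mean / (sd / math.sqrt(len(diffs)))


def sign_flip_pvalue(cond_a, cond_b, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo p-value obtained by randomly flipping difference signs."""
    rng = random.Random(seed)  # seed fixed only for reproducibility
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    t_obs = abs(paired_t(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(paired_t(flipped)) >= t_obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Invented values for ten subjects: a consistent difference between conditions.
cond_b = [10.0] * 10
cond_a = [11.0, 11.2, 10.8, 11.1, 10.9, 11.3, 10.7, 11.0, 11.1, 10.9]
p_effect = sign_flip_pvalue(cond_a, cond_b)

# Invented values with perfectly balanced differences: no effect.
p_null = sign_flip_pvalue([1.0, -1.0, 0.5, -0.5, 0.2, -0.2, 0.8, -0.8], [0.0] * 8)
```

In the full analysis, the same permutations would be applied jointly to all time-frequency points so that cluster-level statistics can be compared against their permutation distribution.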

III. RESULTS
The analysis conducted on STAI scores measured before and after the experiment was not significant (p > 0.1), indicating that the experiment did not induce any change in subjects' anxiety levels. In addition, as reported in Table I, none of the analyses performed on LIKERT scores was significant, indicating that the EYE modality did not influence visual discomfort, fatigue, and perception.

A. Connecting-the-Dots Task Performance
The results of the gap error analysis are reported in Fig. 3. We observed significant differences in the max error between AR-bin and NK-bin conditions, with higher maximum errors for AR-bin. Analogously, significant differences were observed between AR-mono and NK-mono for this feature. Mean error analyses also showed a significant difference in the AR-bin versus NK-bin and AR-mono versus NK-mono comparisons, with a higher mean error during the AR modality. Overall, these results indicated that, during AR, perceptual issues arising from FR and VAC deteriorated subjects' performance, confirming previous studies [2].

B. ANS Correlates
In Table II, we report the differences in ANS features between each experimental condition and the initial resting state, along with their statistical significance. In particular, for each condition, we observed that all EDA-related features significantly differed from rest. Among these, Mean Tonic, Std Tonic, Mean Phasic, N peaks, and EDASYMP were higher during the task compared to rest. Conversely, Max Peak and AmpSum were lower during the task. Considering HRV features, none of them showed relevant patterns. However, the 1 × 4 ANOVA did not reveal any significant change in any of the considered features among experimental conditions, indicating that the changes in ANS correlates induced by the "connecting-the-dots" task were affected neither by the AR nor by the EYE modality. These results indicate that, although some ANS features are capable of distinguishing between the task and the resting state, they did not allow us to distinguish among conditions during the "connecting-the-dots" task.

C. EEG Power
In Fig. 4, we report the results of the 1 × 4 ANOVA on PSD estimates. Significant differences among conditions were observed at specific scalp sites and for specific frequency bands. The most widespread differences were at right frontal, mid-central, and left occipital regions in the μ band. In addition, there were significant differences in the frontocentral region in the β2 band. Less widespread differences were also present in the β1 band. Finally, we did not observe any significant change in PSD in the δ, θ, and γ bands.
Post hoc analyses are reported in Fig. 5. During binocular tasks, PSD was lower compared to monocular ones. This happened both for the AR-bin versus AR-mono and for the NK-bin versus NK-mono comparisons. However, differences were more widespread and lateralized when subjects were using AR. Specifically, this was mainly observed for frontal μ and posterior β1 activity. Differences between conditions were less widespread when considering the AR modality contrast (i.e., AR-bin versus NK-bin and AR-mono versus NK-mono). In particular, we observed only a few localized and lateralized differences in the AR-bin versus NK-bin comparison, with lower PSD values during AR-bin. Notably, we did not observe any significant difference between AR-mono and NK-mono conditions. Finally, we observed some differences in the AR-mono versus NK-bin and AR-bin versus NK-mono contrasts, which were, however, of less interest.

D. EEG Connectivity
The results of the connectivity analysis are reported in Fig. 6. Among all the brain regions on which RPDC was projected, only a few had significant network edges. These nodes were the precentral gyrus (PCG), the precuneus (PCu), the fusiform gyrus (FG), and regions tagged as upper basal (UB) according to the AAL [16], [35].
We observed significant differences for the following comparisons: AR-bin versus AR-mono, NK-bin versus NK-mono, AR-bin versus NK-bin, and AR-mono versus NK-mono. In the AR-bin versus AR-mono comparison, there was only a significant decrease in connectivity from PCG → PCu during AR-bin. This same connection was observed to significantly differ in the NK-bin versus NK-mono, AR-bin versus NK-bin, and AR-mono versus NK-mono comparisons, showing higher values in the first condition of each comparison. However, other connections also differed significantly for these comparisons. Among these, a decrease in the reciprocal connections between UB and FG during the AR condition was observed in the AR-bin versus NK-bin and AR-mono versus NK-mono comparisons. Finally, some connections differed only in specific comparisons. Namely, the PCu → UB connection decreased during AR-mono compared to NK-mono. Conversely, this connection increased during NK-bin compared to NK-mono.
Fig. 6. For each contrast, we report the difference in connectivity between the experimental condition on the left and the one on the right. On the left of the image, a 3-D rendering of the differences in connectivity ∼300 ms after press events is reported. On the right, for each significant difference within the epoch, we report target-source (i, j) spectrotemporal matrices of RPDC differences between conditions, along with the time-frequency points in which they were statistically significant (p < 0.05 after cluster correction).

IV. DISCUSSION
In this work, we analyzed the behavioral and physiological correlates of AR during a "connecting-the-dots" task. We provide a first multimodal analysis of peripheral and central neural correlates of AR use during a precision motor coordination task. Our results complement previous reports, which were based exclusively on behavioral and error measures of task performance, and suggest that using AR induces modifications in brain activity and connectivity compared to performing the same task without AR. Our results also highlight that EEG power and connectivity measures seem to be better suited than ANS correlates for characterizing the observed physiological responses. Overall, our work provides useful insights for the study and design of applications of AR in more sophisticated and challenging tasks.
The analysis of subjective reports of visual fatigue, discomfort, and perception, measured with the LIKERT-type questionnaire, did not show any significant difference between AR binocular and monocular modalities. This result corroborates the previous study reporting no differences in perceived discomfort between EYE modalities [2] and suggests that the observed differences in physiological measures in the AR-bin versus AR-mono comparisons are not due to such a factor.
The experimental study was designed to investigate the effects of FR and VAC together (binocular tests) and separately (monocular tests) during an AR-guided manual task. More specifically, a simple task, which does not require a superimposition of the virtual scenario with a physical counterpart, was designed for the study to limit the influence of (virtual-to-real) registration errors, which can negatively affect user performance. Our results show that subjects are more likely to be inaccurate during the AR-guided task. Accordingly, we can hypothesize that such inaccuracies may depend on the perceptual issues (FR and VAC in AR-bin, and FR in AR-mono) that impair the simultaneous view of the real world and the virtual target. Indeed, the current generation of consumer-level OST HMDs is only capable of presenting the virtual content at a single fixed focal distance that is generally more than 2 m. Therefore, during manual tasks, the user's eye cannot keep both the virtual and the real content in focus at the same time. We believe that the use of OST HMDs with a focal distance greater than the length of the arm can be practical when the user can switch fixation points and alternately focus on real-world and virtual information (e.g., textual information or simple icons), such as a driver alternately looking at the road and the tachometer in the cockpit. However, we believe that these perceptual issues may adversely affect user performance whenever the AR-guided manual task requires the user to simultaneously focus on virtual and real-world information (in our specific case, the virtual dots on the one hand, and the pen tip and the drawn trajectory on the other). The results obtained suggest that, although there is a growing interest in using commercial OST-HMDs to guide high-precision manual tasks (such as surgical ones), attention must be paid to the current limitations of the available technology, which is not designed for the peripersonal space.
The analysis performed on ANS correlates allowed distinguishing between each experimental condition and the initial resting state. Interestingly, all EDA features showed significant changes, suggesting that, regardless of the experimental condition, the motor task promoted an increase in sympathetic arousal. Conversely, variations of HRV features were less evident and not homogeneous across experimental conditions. However, none of the considered ANS features (neither EDA nor HRV) allowed for distinguishing among the experimental conditions. As in other studies, the complexity (e.g., hand-eye coordination) and voluntariness of the tasks likely induced acute sympathetic reactions that masked other possible differences related to the task modality [38], causing a sort of saturation effect.
EEG measures showed higher variation and specificity than ANS correlates in the comparison among conditions. Power and connectivity metrics showed changes based on both AR and EYE modalities that were physiologically plausible in terms of the brain areas and frequency bands involved. In particular, the majority of changes in both types of metrics occurred in the μ and β bands: the two frequency bands primarily involved in motor execution [39]. In addition, the scalp maps of PSD analyses, as well as the dipole locations of the network nodes involved in significant causalities, were in regions known to be involved in motor coordination and execution [39], [40], [41].
For both the AR-bin versus AR-mono and the NK-bin versus NK-mono analyses, PSD in the μ band was lower over the right centrofrontal regions for the AR-bin and NK-bin conditions, respectively. This behavior is in line with the idea that the more demanding a task is, the more evident the suppression of the μ band becomes. In this view, we can assume that the binocular conditions were more demanding than the monocular ones. Analogously, suppression in the β1 and β2 bands, which is typically associated with voluntary movements, occurred over centrofrontal regions. In this case, however, we did not observe any particular lateralization. Finally, the most widespread differences in PSD scalp maps arose in the AR-bin versus NK-mono contrast, possibly indicating that these two conditions were the ones that differed the most in terms of task demand.
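The band-power contrasts discussed above can be illustrated with a minimal sketch. It is not the pipeline used in the study: it computes a plain periodogram with a direct DFT in standard-library Python (in practice a Welch-type estimator from a signal-processing library would normally be used), the band limits are those given in the abstract, and the two signals are synthetic stand-ins for a less demanding and a more demanding condition. The names `band_power` and `synthetic_eeg`, the 128 Hz sampling rate, and the amplitudes are all assumptions made for illustration.

```python
import cmath
import math

# Band limits in Hz, as stated in the abstract (mu: 8-13, beta: 13-30).
BANDS = {"mu": (8.0, 13.0), "beta": (13.0, 30.0)}


def band_power(signal, fs, band):
    """Mean periodogram value of `signal` over the DFT bins inside `band` (Hz)."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the DC offset
    lo, hi = band
    k_lo = math.ceil(lo * n / fs)   # first DFT bin inside the band
    k_hi = math.floor(hi * n / fs)  # last DFT bin inside the band
    powers = []
    for k in range(k_lo, k_hi + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(x))
        powers.append(abs(coeff) ** 2 / (fs * n))
    return sum(powers) / len(powers)


def synthetic_eeg(mu_amp, fs=128, seconds=2):
    """Toy signal (hypothetical data): a 10 Hz 'mu' and a 20 Hz 'beta' sinusoid."""
    n = fs * seconds
    return [mu_amp * math.sin(2 * math.pi * 10 * t / fs)
            + 0.5 * math.sin(2 * math.pi * 20 * t / fs)
            for t in range(n)]


fs = 128
less_demanding = synthetic_eeg(mu_amp=2.0, fs=fs)  # stronger mu rhythm
more_demanding = synthetic_eeg(mu_amp=1.0, fs=fs)  # suppressed mu rhythm

# Change in band power (dB) of the demanding condition relative to the other.
mu_change_db = 10 * math.log10(
    band_power(more_demanding, fs, BANDS["mu"])
    / band_power(less_demanding, fs, BANDS["mu"]))
beta_change_db = 10 * math.log10(
    band_power(more_demanding, fs, BANDS["beta"])
    / band_power(less_demanding, fs, BANDS["beta"]))

print(f"mu change: {mu_change_db:.2f} dB, beta change: {beta_change_db:.2f} dB")
```

A negative value in a band indicates lower power (suppression) in the more demanding condition. Here only the μ component was attenuated, so the β change stays at zero.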
Although the AR-bin versus NK-bin comparison showed less widespread differences than the above-mentioned ones, lower PSD values in the μ and β1 bands were observed during AR-bin. Interestingly, the AR-mono versus NK-mono comparison did not show any significant change in PSD, possibly suggesting a limited influence of FR on PSD differences. Conversely, we speculate that the differences in AR-bin versus NK-bin could be due to the VAC, which arises only during binocular vision.
Similarly to PSD analyses, connectivity profiles highlighted a major involvement of the μ and β bands, although some changes were also observed in the γ and θ bands. Among the 76 ROIs considered in the analysis, only a few were involved in significant causal interactions. Notably, three out of four of these regions were the PCG, the PCu, and the FG: a set of brain regions involved in different aspects of motor task execution. The PCG includes the primary motor cortex, which participates in the control of voluntary movement [40]. In addition, among the many tasks carried out by the PCu, its corticocortical projections to the lateral parietal areas and premotor cortex seem to play a pivotal role in the visual guidance of hand movements, such as hand-eye coordination [42]. In this view, the observed changes in PCu → PCG and PCG → PCu causalities can be interpreted as an expression of the different eye-hand coordination demands of each condition. Finally, we observed the involvement of the FG, a region responsible for high-level visual processing [43]. We therefore speculate that the changes in the causalities of this region reflected the different visual processing needs of each condition.
Of note, our results rely on a multimodal approach (i.e., the use of multiple measures and signals) aimed at identifying the effect of FR and VAC in AR on different behavioral and physiological measures, and on robust statistical testing to support the validity of our hypothesis. Yet, a limitation of our work is the relatively small group of subjects participating in the experiments (n = 21). Future works may consider a higher number of participants to draw more robust conclusions. In this light, solutions based on machine learning could also help in finding specific features of interest characterizing FR and VAC during the use of AR.

V. CONCLUSION
This work investigated the psychophysiological response to the VAC and FR perceptual conflicts arising from the use of commercial AR HMDs in manual tasks. Our study corroborates previous studies focusing on performance metrics during a "connecting-the-dots" task in both binocular and monocular vision, and extends them with new insights in terms of the ANS and central neural responses elicited by the task. Our results suggest physiological measures that could be used to test innovative AR HMDs specifically designed for the peripersonal space, with the goal of achieving correct perceptual augmentation while minimizing visual discomfort. In fact, our physiological results complement those of the questionnaires by highlighting differences in operating conditions of which the user may be unaware (indeed, these are not captured by questionnaires), and could thus assume a key role in the validation and testing of new devices.
Such encouraging and physiologically plausible results push toward future studies focusing on the risks and benefits of using AR technology in demanding, high-workload applications, such as AR-guided surgery.

Fig. 4. 1 × 4 ANOVA results on PSD. For each frequency band, the scalp map of the significant p-values of the 1 × 4 ANOVA test is reported (p < 0.05 after cluster correction).

Fig. 5. PSD post hoc analysis. For each frequency band, the scalp map of the significant p-values of the post hoc t-tests is reported (p < 0.05). p-values in red indicate higher power during the condition on the right with respect to the condition on the left, whereas p-values in blue indicate lower power during the condition on the right with respect to the condition on the left.

Fig. 6. Connectivity analysis. From top to bottom: AR-bin versus AR-mono, NK-bin versus NK-mono, AR-bin versus NK-bin, and AR-mono versus NK-mono. For each contrast, we report the difference in connectivity between the experimental condition on the left and the one on the right. On the left of the image, a 3-D rendering of the differences in connectivity ∼300 ms after press events is reported. On the right, for each significant difference within the epoch, we report target-source (i, j) spectrotemporal matrices of RPDC differences between conditions, along with the time-frequency points in which they were statistically significant (p < 0.05 after cluster correction).

TABLE I. LIKERT-TYPE QUESTIONNAIRE RESULTS. AFTER FDR-BKY CORRECTION, NONE OF THE P-VALUES WERE SIGNIFICANT
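For readers unfamiliar with the FDR-BKY correction named in the caption of Table I, the following is a minimal sketch of the two-stage adaptive procedure of Benjamini, Krieger, and Yekutieli, written in standard-library Python. It is an illustration under stated assumptions, not the code used in the study: the function names `bh_reject_count` and `fdr_bky`, the level q = 0.05, and the example p-values are all hypothetical and do not reproduce the questionnaire results.

```python
def bh_reject_count(sorted_p, q):
    """Benjamini-Hochberg step-up rule: number of rejections at level q.

    `sorted_p` must be in ascending order.
    """
    m = len(sorted_p)
    count = 0
    for i, p in enumerate(sorted_p, start=1):
        if p <= q * i / m:
            count = i  # keep the largest rank that satisfies the bound
    return count


def fdr_bky(p_values, q=0.05):
    """Two-stage adaptive FDR control (Benjamini, Krieger, Yekutieli).

    Returns a list of booleans in input order (True = null rejected).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    sorted_p = [p_values[i] for i in order]

    # Stage 1: ordinary BH at the reduced level q' = q / (1 + q).
    q1 = q / (1 + q)
    r1 = bh_reject_count(sorted_p, q1)

    if r1 == 0 or r1 == m:
        n_reject = r1  # nothing, or everything, is rejected: stop here
    else:
        # Stage 2: estimate the number of true nulls and rerun BH
        # at the correspondingly relaxed level.
        m0_hat = m - r1
        n_reject = bh_reject_count(sorted_p, q1 * m / m0_hat)

    reject = [False] * m
    for rank, idx in enumerate(order):
        reject[idx] = rank < n_reject
    return reject


# Hypothetical p-values (NOT the values of Table I).
no_effect = [0.20, 0.35, 0.50, 0.08, 0.64]
some_effect = [0.001, 0.002, 0.003, 0.9]

print(fdr_bky(no_effect))    # [False, False, False, False, False]
print(fdr_bky(some_effect))  # [True, True, True, False]
```

The first example mirrors the situation reported in the caption: when no p-value passes the first stage, the procedure stops and nothing is declared significant.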