Robot-Assisted Joint Attention: A Comparative Study Between Children With Autism Spectrum Disorder and Typically Developing Children in Interaction With NAO

Robots have been used in joint attention (JA) tasks with children with autism spectrum disorder (ASD). However, very few studies have compared the JA performance of children with ASD with that of typically developing (TD) children when interacting with a robotic partner and a traditional human partner. This study aims (a) to explore whether there are differences in response to and initiation of JA between ASD and TD children with two interactive partners: an adult and a social robot (NAO); and (b) to explore which characteristics of ASD children predict their performance in robot-assisted JA tasks. Twenty-seven ASD and forty TD children took part in this study, in which they were exposed to different JA tasks. Mixed results were found per type of JA behavior over groups and conditions. Our results show that both ASD and TD children performed better with the human partner than with the robot in response to JA tasks. Among the characteristics of ASD children, the ADOS total score is associated with response to JA performance. No significant result related to initiation of JA was found.


I. INTRODUCTION
One of the main characteristics of autism spectrum disorder (ASD) is a social deficit, in which the impairment of joint attention (JA) plays a crucial role [1]. JA refers to a set of behaviors that enable two partners to communicate about, or jointly attend to, a third entity, an object, or an event, either vocally or non-vocally [2]. Various labels have been used to refer to the behavioral dimensions of JA [3]. JA has often been measured and conceptualized in terms of two types of behaviors: (a) response to JA (RJA), i.e. the ability to respond to another person's vocal cues, or to follow their gaze, head-turns or pointing gestures, and (b) initiation of JA (IJA), i.e. the tendency to initiate social attention by pointing, showing, or alternating gaze between an interesting [...] communication [9], [10]. Therefore, it has been suggested that JA should be included in early interventions for young children with ASD [11], [12]. However, a lack of intrinsic motivation for IJA in children with ASD is an obstacle to lasting effects [13]. The social deficit in ASD has often been explained through the social motivation hypothesis [14], [15]. Children with ASD lack motivation to engage in social tasks, such as JA tasks, because they find those social tasks more challenging and less rewarding than others.
The associate editor coordinating the review of this manuscript and approving it for publication was Pedro Neto.
An increasing number of studies from the innovative research field of robot-assisted therapy (RAT) found that social robots can engage and motivate children with ASD in social tasks [16]-[22]. Robots are able to provide a safe, simplified, and predictable environment that can be repeated in the same format until the learning process is realized, where the complexity of interaction can be controlled and gradually increased to respect different developmental levels of children [23]. Consequently, it is expected that by using the strengths of social robots, the motivation and learning effect of ASD children in JA tasks will increase. Many studies consistently showed that a social robot is suited to elicit JA in children with ASD [24]-[30]. Robins et al. found that a robot could serve as a mediator for JA behavior between the child and an adult [26]. De Silva et al. also confirmed that the robot can be used as a mediator, but also as an object of JA skills [27].
Other studies focused on using the robot as a therapist to conduct the JA exercises without comparing RAT with traditional human therapy. Therefore, JA behaviors were measured in interaction with the robot and not with the therapist. Warren et al. highlighted that children with ASD had no difficulties in paying attention to the robot for relatively long sessions and that they reached a high level of accuracy across the JA tasks [24]. Lowe showed that at the end of the robot-assisted intervention, the responses to JA improved for two out of three participants and eye-contact behaviors improved for all participants [31]. An intervention program based on a modified attention cueing paradigm showed improvement in responding to JA post-training [32].
Very few studies compared the performance of ASD children in JA tasks with that of TD children when interacting with a robotic partner and an adult. Anzalone et al. found that ASD children had lower JA performance than TD children when interacting with the NAO robot [29]. Bekele et al. obtained a similar finding, though with fewer participants [30]. A longitudinal study by David et al. found that the interaction of five ASD children with a robot followed a similar pattern as with a human therapist.
This study aims to explore the potential of using a social robot as a trigger for JA skills in young children with ASD. First, we investigate whether ASD and TD children differ in their JA performances (RJA and IJA) in two conditions: a condition with an adult interaction partner (AC) and a condition with a robotic interaction partner (RC). Second, we examine which characteristics of ASD children predict JA performances with the robotic partner. The answer can be relevant in deciding for which children with ASD this RAT approach would work best.

II. RESEARCH QUESTIONS AND EXPECTATIONS
In this study, we formulated five research questions related to JA performance of ASD and TD children when interacting with two partners: an adult and a social robot (NAO). In these questions, JA performance refers to both IJA and RJA performances.
(Q1) Do ASD children differ in their JA performance while interacting with a human partner compared to the TD children?
It was expected that children with ASD would show significantly fewer JA behaviors than TD children in interaction with a human partner, since JA is generally reported in the literature as being impaired in ASD [7], [33].
(Q2) Do ASD children differ in their JA performance while interacting with a robotic partner compared to the TD children?
(Q3) Do TD children differ in their JA performance while interacting with a robotic partner compared with an adult?
Previous research showed that social robots can trigger interaction in both TD children [34]-[36] and ASD children [37], [38]. Very few studies were conducted comparing the performance of the two groups (ASD and TD) in interaction with a social robot. Two studies found that TD children had better JA performance than ASD children when interacting with robots [29], [30]. They also found that TD children have better performance with a human partner than a robotic partner.
(Q4) Do ASD children differ in their JA performance while interacting with a robotic partner compared with an adult?
No clear expectation could be formulated regarding the JA performances of ASD children with the two interaction partners. Although most of the existing studies found a robotic partner superior to a human partner in triggering social skills in children with ASD [26], [39]-[41], those studies were often characterized by methodological limitations [16], [17], [19], [42]. Thus, it is not clear whether the superiority of the social robot over the human will withstand a larger sample size and a controlled design as used in the current study. Additionally, several studies with more rigorous methodologies found mixed results [22], [43], [44], and some others found that the social performance of children with ASD was better in the adult condition than in the robot condition [29], [30], [45].
(Q5) Which are the characteristics of ASD children that can predict their JA performance with the robotic partner?
To the best of our knowledge, this is one of the first studies addressing the characteristics of ASD children that can predict their JA performance with a robotic partner. The level of intellectual functioning (assessed by the BSID) and the severity of their communication and social interaction difficulties (assessed by the ADOS total score) were included in the analyses as potential predictors. Several studies that conducted JA tasks with human partners showed that ASD children with a higher level of intellectual functioning often have better JA scores than those with a lower level of intellectual functioning [46].

A. PARTICIPANTS
Two groups of children were included in the study: children diagnosed with ASD (ASD group) and typically developing children (TD group). The inclusion criteria for the ASD group were as follows: (1) an ASD diagnosis confirmed by the Autism Diagnostic Observation Schedule (ADOS) [47], and (2) a maximum mental age of 42 months, assessed by the Dutch version of the Bayley Scales of Infant Development, Second Edition (BSID-II-NL) [48]. The children from the TD group had no intellectual disability, i.e. they had a developmental index within normal limits as assessed by the BSID-II-NL. Details of the participants from both groups are provided in Table 3.
Initially, 40 ASD children and 40 TD children were recruited. Only 27 children from the ASD group were included in the study, for several reasons. Four children did not complete the robot condition due to anxious reactions in the presence of the robot. Two children refused to enter the experimental room. Four children were sick or absent. For three children, the video of the adult condition or the robot condition did not allow for reliable coding. Informed consent was obtained from all parents of the individual participants included in the study. Additional informed consent was obtained from all individual participants for whom identifying information is included in this article. Approval for the implementation of the study was granted by the Commission of Medical Ethics of the Vrije Universiteit Brussel (Belgium).

B. EXPERIMENTAL SETUP
The setup was adapted from the Early Social Communication Scales (ESCS) by Mundy et al. (2003) [49] (see Figure 1). The study was conducted in a 4 × 4 m test room separated into two areas: the experimental area and the operational area. Distracting stimuli were removed. The experimental area was used to conduct JA tasks. The operational area was for the operator to control the robot and for the experimenter to trigger unexpected IJA events. The setup for the RJA task is illustrated in Figure 1-left. The child sits in front of an interaction partner (adult or robot) at a small table situated in the middle of the room at equal distances from four posters (60 × 90 cm). These posters show images of cartoon figures well known to children (e.g. Mickey Mouse, Minnie Mouse) and are located on the walls to the left, right, left-behind, and right-behind of the child.
The setup for the IJA task is illustrated in Figure 1-right. The child sits opposite the interaction partner (adult or robot) on a carpet on the floor. This allows the child to observe four unexpected IJA events, i.e. a remote-controlled car driving from the right side of the child, bubbles coming from the left side of the child, and two video clips projected on the wall. These events are controlled by an operator behind the wall. Materials for these events are out of the visual field of the child.
During the experimental tasks, two digital cameras were used to record children's behavior. A remotely controlled toy car, a bubble blower, and a projector were used as stimuli for the IJA task. Children were always accompanied by an experimenter during JA tasks, except for the moments when IJA events occur in the robot condition. At those moments, the experimenter hid behind the wall so that the children could only share the IJA events with the robot and not the experimenter.

C. PROCEDURE
Each participant was first exposed to a familiarization phase. The experimenter went to play with the child in his/her class so that they could become comfortable interacting with each other. The length of this phase varied across participants from 10 to 15 minutes. Afterward, the experimenter brought the child to the experimental room. Each participant was exposed to two sessions, each of which included an assessment part and one of the two conditions: adult condition (AC) or robot condition (RC). The two conditions differed only in the type of interaction partner, i.e. adult or robot. To avoid potential order effects, half of the participants from each group, TD and ASD, were first exposed to the AC and then to the RC, and the other half in the reverse order. Both conditions included two types of JA tasks and followed a pre-established protocol. In both conditions, participants were first exposed to the RJA task at the table and then the IJA task on a blanket. Between the two tasks there was a two-minute break during which the child played a tablet game.
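The counterbalancing of condition order described above can be sketched as follows. This is a minimal illustration, not the authors' allocation procedure: the text does not say whether the split was randomized, so the shuffle, the `seed` parameter, and the participant identifiers are assumptions made for the example.

```python
import random


def assign_condition_order(participant_ids, seed=0):
    """Give half of one group the order AC -> RC and the other half RC -> AC.

    Assumption: allocation is random within the group (the paper does not say).
    With an odd group size the two halves differ by one participant.
    """
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle, hypothetical choice
    half = len(ids) // 2
    orders = {}
    for index, pid in enumerate(ids):
        orders[pid] = ("AC", "RC") if index < half else ("RC", "AC")
    return orders


# Hypothetical identifiers for the 40 TD children.
td_orders = assign_condition_order([f"TD{n:02d}" for n in range(1, 41)])
adult_first = sum(1 for order in td_orders.values() if order == ("AC", "RC"))
print(adult_first, len(td_orders) - adult_first)
```

Running the assignment separately for the TD and ASD groups keeps the order balanced within each group, which is what the procedure requires.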

1) ADULT CONDITION
This condition started with the RJA task with an adult human partner. The aim of this task was to examine how the child follows the gaze and/or the pointing gesture of an unfamiliar adult that was directed to four posters in the following order: left, left-behind, right, and right-behind. The experiment continued with the IJA task, which aimed to explore whether the child shares his/her attention with the adult regarding unexpected positive events: bubbles, a remote-controlled car, and projected video clips. If the child noticed the event and initiated JA with the adult (e.g. making eye contact, pointing with or without verbal remarks, alternating gaze between an event and the experimenter), the adult looked in the direction of the event and responded to the sharing by saying: ''Yes, I have seen it. Nice!''. Immediately after each IJA event, the experimenter and the child played together with the materials. Between RJA trials and between IJA events, the adult sang songs well known to children of this age and played games (i.e. imitation and body parts identification).

2) ROBOT CONDITION
This condition started with a demonstration part in which the child watched the experimenter playing with the robot (NAO) for five minutes. The child stood on the side on a chair close to the experimenter and followed the demonstration. In this manner, the child had a clearer view of the robot's behavior before he/she interacted directly with it. After this demonstration, the experiment followed the same structure as the AC but with the NAO robot as the interaction partner. NAO was programmed to mimic the adult's behaviors, both verbal and non-verbal e.g. gazing, pointing, imitation game, body parts identification game (see Figure 2).

D. INSTRUMENTS
1) AUTISM DIAGNOSTIC OBSERVATION SCHEDULE (ADOS)
The ADOS is a valid and reliable instrument to diagnose ASD [50], which was administered to confirm the ASD diagnosis of participants in the ASD group [51]. It is a semi-structured observation scale with a 30- to 45-minute observation period that gives the individual being assessed numerous opportunities to exhibit behaviors of interest in the diagnosis of ASD through standard 'presses' for communication and social interaction. The 'presses' consist of planned social occasions in which it has been determined in advance that a behavior of a particular type is likely to appear.
The ADOS consists of four modules, each of which is appropriate for children and adults of differing developmental and language levels, ranging from nonverbal to verbally fluent. In this study, module one was used for children with little or no speech and module two for children with limited speech (e.g. short sentences but not able to speak fluently). Both modules included playful activities such as blowing soap bubbles, peek-a-boo, and imagination play. Five domains are assessed during the observation, i.e. Language and Communication, Reciprocal Social Interaction, Social Affect, Play, and Stereotyped Behaviors and Restricted Interests.
The participant's response to each activity was recorded. Overall ratings were made at the end of the schedule to formulate a diagnosis. Scores varied from 0 (typical behavior) to 2 or 3 (atypical behavior). The scores of the children were compared to a cut-off score for ASD in order to assign the diagnosis. For both modules, a total score is the sum of the Language and communication score and the Social interaction score. A total score higher than 7 indicates an ASD diagnosis, and a total score higher than 12 indicates an autism diagnosis.
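The cut-off rule stated above can be written as a short function. This is only a sketch that mirrors the thresholds as given in this section (total higher than 7 for ASD, higher than 12 for autism); the function names and the label for scores at or below the cut-off are hypothetical, and the published ADOS algorithms may apply further criteria that are not modeled here.

```python
ASD_CUTOFF = 7      # per the text: a total score higher than 7 indicates ASD
AUTISM_CUTOFF = 12  # per the text: a total score higher than 12 indicates autism


def ados_total(communication_score, social_interaction_score):
    """Total score = Language and communication score + Social interaction score."""
    return communication_score + social_interaction_score


def classify_ados_total(total):
    """Map a total score to the classification described in the text."""
    if total > AUTISM_CUTOFF:
        return "autism"
    if total > ASD_CUTOFF:
        return "ASD"
    return "below cut-off"  # hypothetical label, not taken from the paper


# Hypothetical domain scores for one child.
print(classify_ados_total(ados_total(4, 6)))
```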

2) BAYLEY SCALES OF INFANT DEVELOPMENT (BSID-II-NL)
The BSID-II-NL was used to assess the mental age of the participants in both the TD and ASD groups. This instrument consists of three scales: the Mental Scale, the Motor Scale, and the Behavioral Scale. In this study, only the Mental Scale was used, which consists of measures of problem-solving, early number concepts, memory, generalization, and social skills. The developmental level of the child was estimated by calculating a Developmental Index (DI). Four categories of standard scores are possible: accelerated performance (score of 115 and above), within normal limits (scores between 85 and 114), below average (scores between 70 and 84), and significantly delayed performance (scores of 69 and below). Based on the DI, mental ages (MA) can be calculated. The duration of the assessment varied between 25 and 60 minutes. The test construction, the quality of the test material, and the manual were evaluated as good. The norms, test-retest reliability, and conceptual validity were rated as sufficient [52].
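The four Developmental Index categories listed above translate directly into a lookup. The sketch below uses only the score bands given in the text; the function name is hypothetical, and the conversion from DI to mental age is not shown because the paper does not describe it.

```python
def classify_developmental_index(di):
    """Return the BSID-II-NL category for a Developmental Index (DI) score."""
    if di >= 115:
        return "accelerated performance"
    if di >= 85:
        return "within normal limits"
    if di >= 70:
        return "below average"
    return "significantly delayed performance"


# Boundary values of each band as stated in the text.
for score in (115, 114, 85, 84, 70, 69):
    print(score, classify_developmental_index(score))
```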

3) THE HUMANOID ROBOT NAO
This study used the humanoid robot NAO from Softbank Robotics as a robotic interaction partner [53]. The robot has been used in various studies with children with ASD and typically developing children (e.g. [20], [21], [24], [29], [39], [54]). NAO has the size of an infant (58 cm), weighs 4.3 kg, and is equipped with 25 degrees of freedom (DOFs) and a variety of sensors. Its human-like embodiment is beneficial for the generalization of skills learned during human-robot interaction to human-human interaction. During the experiment, NAO was controlled by an operator in Wizard of Oz mode in the robot condition. Robot behaviors were programmed and triggered in Choregraphe, a graphical environment for programming NAO [55].

E. RESPONSE MEASUREMENTS
Video coding analysis was used to measure the target variables. This method is more practical and accessible than more advanced approaches such as eye-tracking and EEG [56], [57]. The behaviors were manually coded from the videotapes using ELAN transcription software [58]. Two independent coders trained by the experimenter coded a random subset of 30% of the data. An inter-rater agreement analysis revealed a Cohen's kappa of .89 for the RJA variable and .87 for the IJA variable.
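For readers unfamiliar with the agreement statistic reported above, Cohen's kappa compares the observed proportion of agreement between two coders with the agreement expected by chance from each coder's category frequencies. The following is a minimal sketch of the standard unweighted formula; the example ratings are invented for illustration and are not data from this study, and the paper does not state which software computed its kappa values.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two coders rating the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the coders' marginal proportions per category.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two coders for eight trials.
coder_1 = [0, 1, 2, 2, 1, 0, 2, 1]
coder_2 = [0, 1, 2, 1, 1, 0, 2, 1]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```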

1) RESPONDING TO JA (RJA)
RJA was defined as the ability to follow the head-turn, pointing gesture, and/or visual regard of the interaction partner (adult or robot). The opportunities for RJA were realized by using four posters positioned left, left-behind, right and right-behind on the walls as described above. Each opportunity always started with only gaze and verbal instruction. When the child failed at this level, another attempt for the same poster was immediately performed but accompanied by pointing gestures. The scoring ranged between 0 and 4 and was given according to the assumed degree of difficulty mentioned in the literature [59] (see Table 1). A participant could obtain an RJA total level score ranging from 0 to 12 by adding the scores from all RJA opportunities.

2) INITIATION OF JA (IJA)
Two IJA response levels were assessed, i.e. basic and high. The basic level was defined as alternating looks between the eyes of the experimenter and the location where the event took place. The child got credit for this behavior if he/she shifted his/her gaze from the object to the tester's face. It was not necessary for the child to gaze back at the object to receive credit. The high IJA level was defined as pointing clearly at the place of the event with the index finger, possibly accompanied by verbal utterances (e.g., ''Look!''). The pointing was only valid when the index finger was extended and the adjacent fingers were noticeably inclined downward, or away from the index finger and down toward the palm [49]. The child received a score of 1 when basic behaviors occurred and a score of 2 for high-level behaviors (see Table 2). There were four opportunities for IJA behaviors, corresponding to the four unexpected events (i.e. a remote-controlled car, soap bubbles, and two video clips). In total, a participant could obtain a total IJA score ranging from 0 to 8.
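The IJA scoring rule can be illustrated with a small tally. This sketch assumes that each of the four events is credited once at the highest level observed (0 for no IJA behavior, 1 for basic, 2 for high), which is consistent with the stated 0 to 8 range; the level labels and function name are hypothetical and Table 2 may define further details.

```python
IJA_POINTS = {"none": 0, "basic": 1, "high": 2}  # labels are illustrative


def ija_total(event_levels):
    """Sum the per-event IJA credits over the four unexpected events."""
    if len(event_levels) != 4:
        raise ValueError("expected one coded level for each of the four events")
    return sum(IJA_POINTS[level] for level in event_levels)


# Hypothetical coding for one child: car, bubbles, video clip 1, video clip 2.
print(ija_total(["none", "basic", "high", "basic"]))
```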

F. EXPERIMENTAL DESIGN
The study used a single-exposure 2 × 2 mixed design, with interaction partner/condition (adult or robot) as the within-subjects variable and group (children with ASD or TD children) as the between-subjects variable. The dependent variables were the JA performances (RJA and IJA) with each interaction partner.

G. DATA ANALYSIS
Data were analyzed using SPSS 24. To answer the first four research questions, which examine whether the JA performance (RJA and IJA) of children differed between groups and interaction partners, we performed a mixed ANOVA with the type of interaction partner as the within-subjects factor and group as the between-subjects factor. Bonferroni correction was used for multiple comparisons.
Finally, to explore the last research question (i.e. which characteristics of the children with ASD can predict JA performance with the robotic partner), a multiple regression was performed for each of the JA behaviors with the robot (RJA_R and IJA_R) as the dependent variable, with the BSID mental age and the ADOS total score as predictors.
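As a reminder of what the Bonferroni correction does, each p-value can be multiplied by the number of comparisons (capped at 1) and then compared with the usual significance level. The sketch below shows that adjustment in plain Python; the p-values are invented for illustration, and the number of comparisons actually corrected for in this study is not stated in the text.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: p * m, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values for four pairwise comparisons.
raw = [0.004, 0.03, 0.2, 0.6]
adjusted = bonferroni_adjust(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

With four comparisons, an unadjusted p of .03 no longer reaches the .05 level after adjustment, which is the conservative behavior the correction is meant to provide.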

IV. RESULTS
First, a 2 × 2 mixed ANOVA with interaction partner (robot and adult) as the within-subjects factor and group (ASD and TD) as the between-subjects factor revealed no interaction effects (see [...] = 0.009. Several t-tests (paired and unpaired) were performed with Bonferroni correction to explore potential differences between the two groups with the two interaction partners (Q1-Q4). A multiple regression was performed for JA behaviors to explore which characteristics can predict the JA performance of ASD children with a robotic partner (Q5).

(Q5) A multiple regression was performed for both JA behaviors with the robot (RJA_R and IJA_R) as dependent variables and mental age (MA) and ADOS total score (ADOS_TOT) as predictors. First, for RJA, a statistically significant model was found, F(2, 25) = 4.54, p < .05, R² = .27 (see Table 4). Only the ADOS total score was significantly associated with RJA (ADOS_TOT: t(25) = −2.22, p = .03), and 21% of the variance in RJA_R could be explained by the model. Regarding IJA, the model was not significant, F(2, 24) = 1.11, p = .34, R² = .08 (see Table 5). None of the predictors was significantly associated with IJA (ADOS_TOT: t(25) = .16, p = .87; MA score: t(25) = 1.10, p = .28).
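One plausible reading of the two figures reported for the RJA model (an R² of .27 and 21% of variance explained) is that the second is the adjusted R², which penalizes R² for the number of predictors. The sketch below checks this under stated assumptions: two predictors (MA and ADOS_TOT) and n = 27 ASD children. The paper does not state that the 21% figure is an adjusted value, so this is an interpretation rather than the authors' computation.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    residual_df = n_observations - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_observations - 1) / residual_df


# Assumed inputs: R^2 = .27 from the text, n = 27 ASD children, 2 predictors.
print(round(adjusted_r_squared(0.27, 27, 2), 2))  # prints 0.21
```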

V. DISCUSSION
The results are discussed below for each of the research questions.

Q1: DO ASD CHILDREN DIFFER IN THEIR JA PERFORMANCE WHILE INTERACTING WITH A HUMAN PARTNER COMPARED TO THE TD CHILDREN?
In our study, ASD children had a marginally significantly lower RJA performance with the human partner in comparison with the TD children. This difference is in line with previous studies showing that RJA impairments distinguish preschoolers with ASD from other groups of children (e.g. [10], [60], [61]). However, we could not replicate the difference in IJA performances between ASD and TD children documented in the literature (e.g. [46], [62]). This result could be explained by the predominantly very low IJA performance with the adult, potentially caused by the type of unexpected event and/or the scoring system used to assess the IJA skills of the children.

Q2: DO ASD CHILDREN DIFFER IN THEIR JA PERFORMANCE WHILE INTERACTING WITH A ROBOTIC PARTNER COMPARED TO THE TD CHILDREN?
ASD children had a significantly lower RJA performance with the robot in comparison with the TD children. This is in line with previous studies in robot-assisted JA (e.g. [29], [30]). Our result also confirmed that children do turn their attention to robotic partners, as in other studies with the NAO and Tito robots [18]. There was no significant difference in IJA performances with the robot between the two groups, which is similar to our result in Q1.

Q3: DO TD CHILDREN DIFFER IN THEIR JA PERFORMANCE WHILE INTERACTING WITH A ROBOTIC PARTNER COMPARED WITH A HUMAN PARTNER?
TD children differed in their RJA performance with the two interaction partners, but not in their IJA performance. Their RJA performance was significantly better with the human partner than with the robot. This indicates that although TD children were engaged with the robot at certain moments, JA episodes elicited by the robot were not as clear as those of the human partner. NAO cannot move its eyes, and this affected children's ability to guess where the robot was looking, especially for gaze cues. Consequently, most of the participants needed pointing gestures as prompts to identify correctly where the robot was looking. This can explain the higher performance in the adult condition, in which they could guess where the adult was looking from his/her gaze alone. It is difficult to connect these results with previous studies. To our knowledge, most of the existing research exploring how TD children interact with a social robot did not use comparative designs.

Q4: DO ASD CHILDREN DIFFER IN THEIR JA PERFORMANCE WHILE INTERACTING WITH A ROBOTIC PARTNER COMPARED WITH AN ADULT?
Similar to TD children, ASD children had a significantly higher RJA score in the adult condition compared to the robot condition, but there was no such difference in the IJA task. This is in line with previous studies in robot-assisted JA (e.g. [29], [30]), as in Q2. Although several studies suggested that ASD children have better or similar engagement with a robot compared to a human partner in different social tasks (e.g. [16]-[18]), this does not lead to a better performance in JA tasks.
This result does not mean that a robotic partner is not useful in JA tasks. Several longitudinal studies found that ASD children improved their JA performance when interacting with robots (e.g. [24], [39], [40], [63]), and that their interaction with a robot followed a similar pattern as their interaction with a human therapist (e.g. [28]). Another study suggests that robot-enhanced therapy performs similarly to traditional human therapy and can be an alternative option to reduce the therapist's workload [21]. Developing better robot behavior for eliciting JA episodes will potentially improve the JA performance of ASD children.

Q5: WHICH ARE THE CHARACTERISTICS OF ASD CHILDREN THAT CAN PREDICT THEIR JA PERFORMANCE WITH THE ROBOTIC PARTNER?
In this study, on the one hand, the ADOS total score was associated with the RJA performance but not the IJA one. This partially confirmed the atypical development of JA in children with ASD [7], but only for RJA and not for IJA skills. On the other hand, the mental age (assessed by the BSID-II-NL) does not predict JA performances. This is a surprising result, since RJA ability improves gradually over time [49]. This might be caused by the IJA events and scoring system of this study.

VI. CONCLUSION
In the last decade, advancements in technology led to an increasing amount of research focusing on developing robot-assisted tasks to encourage social skills in children with ASD. This study focused on JA in two categories of children (ASD and TD) in interaction with two different partners (a traditional human partner and a social robot). Two evidence-based tasks were adopted to measure two types of JA behaviors, namely responding to and initiating JA. The aim of the study was to explore the potential of social robots to be used in JA tasks for children with ASD and to investigate the characteristics of the children with ASD who perform best in robot-assisted tasks.
There are some important limitations to be listed. Together with the technical limitations of the robot mentioned above, some methodological shortcomings may have impacted the obtained results. This study used a cross-sectional design with a control group. A long-exposure study could reveal clearer results regarding which particular characteristics of the children with ASD predict the best performance in social tasks assisted by robots. A longitudinal study with more than one session with a robot could also explore how the interest of ASD children in robots evolves across sessions. The lack of findings in IJA could be explained by the scoring system that was applied. In this study, only one standardized task was used to assess IJA. Since the context in which JA skills are observed can influence children's performance [64], adding a structured interaction to this standardized task can lead to a more extensive picture of the IJA skills [33]. Finally, the ADOS calibrated severity score [65], which shows some benefits in measuring autism diagnostic symptom severity in pre-school children, could also be used for comparison with the ADOS total score used in our study.
This study has several important strengths to note. First, the study used evidence-based tasks [49] to measure children's JA behaviors in interaction with a social robot, which is an important added value to the RAT research field. Second, this study addresses many methodological limitations of the existing research, especially compared to engineer-driven studies [16], [17]. It has a larger sample size and includes a control group: 27 ASD children and 40 TD children. Third, to our knowledge, this study is one of the first to provide some answers regarding the characteristics of the children with ASD from the whole spectrum who perform best in robot-assisted JA tasks. Finally, this study suggests that future studies should include more comparative designs to better evaluate the use of social robots in robot-assisted JA tasks.

ACKNOWLEDGMENT
The authors would like to thank all the children and their families for agreeing to participate in this study. We also acknowledge the employees from all institutions involved in the study for their assistance with the recruitment and organization of the experiments.

JOHAN VANDERFAEILLIE is currently an Educational Studies Professor with Vrije Universiteit Brussel and also with PAika as an Educational Specialist. His current research interests include medically unexplained symptoms, autism spectrum disorder, and foster care.