An Adaptive Multi-Robot Therapy for Improving Joint Attention and Imitation of ASD Children

Robot-mediated therapies for autism spectrum disorder (ASD) have shown promising results in the past. We propose a novel mathematical model for an adaptive multi-robot therapy of ASD children that focuses on two main impairments in autism: 1) joint attention and 2) imitation. The joint attention intervention is based on three different least-to-most (LTM) cues, whereas the adaptive imitation module uses joint attention to activate the robot. The proposed model uses a multi-robot system as a therapist, without any external stimuli from the environment, to improve the skills of the ASD child. Another novel aspect of this paper is the deployment of a multi-robot system to introduce the ASD child to the concept of multi-person communication. This is particularly useful because, unlike humans, robots can be more consistent and are relatively immune to fatigue. Two different therapies of human–robot interaction (i.e., with and without inter-robot communication) have been conducted. The model has been tested on 12 ASD children, with eight sessions for each intervention over a period of six months. The effectiveness of the model is validated by analyzing the cognitive state of the brain before and after the intervention with electroencephalogram (EEG) neuroheadsets. Moreover, results obtained using the childhood autism rating scale (CARS) to measure the effectiveness of the therapy also support the conclusions. The statistical results, with p-value = 3.79E-07 < 0.05 and F value = 23.93 > 3.28 (the critical F value), indicate that the findings are statistically significant. The results indicate significant improvements in both modules, along with a notable improvement in the multi-communication skills of the participating children.

to identify the level of autism. Based on the spectrum level of the child, different cognitive therapies are implemented to improve the condition of the child [7]. Recently, robots have been introduced into these cognitive behavioral therapies to enhance the focus and interest of the autistic child.
Research on remedial measures for ASD using human-robot interaction (HRI) has shown that the majority of individuals with autism are more inclined towards robots than towards human therapists [8].
In robotic therapy, joint attention [9], the ability to imitate [10], verbal communication [11] and social activities [12] are the skills most often targeted to measure improvement in ASD. Humanoid robots are gaining more attention for autism therapy as they are controllable, accurate, low cost and adaptive to the environment [13], [14]. Current research trends are inclined towards the development of novel robot-based therapies because of the inquisitiveness of ASD children towards robots [10], [12]. The child's engagement is a key prerequisite for improving the adaptive ability of robots in intervention [13]. Most of the research on robot-based therapeutic measures focuses on the physical features of the robot [15], control architecture [15], different evaluation criteria [16], and several HRI-based algorithms [17]. A child's behavior may vary according to the size, shape and looks of the robot, because the robot's appearance matters to the subject during interaction [18]. Research has shown that the clinical use of robots helps elicit positive social behavior in an ASD child [19].
There are two types of robots that are used in ASD intervention: anthropomorphic [11], [20] and zoomorphic [21], [22]. Zoomorphic robots are animal-like robots used for studying the behavior of an ASD child. One example of a zoomorphic robot is the "Keepon" robot, which is known for positive social interaction with ASD children because of its cute and simple appearance [23].
Robots that look like humans are called anthropomorphic robots. These robots have recently been used for research involving social interaction and facilitating collaborative play. Examples are "KASPAR" [10] and NAO [24].
Anthropomorphic robots are especially preferred for developing social skills in autistic children [25]. Despite their functional limitations, robots whose physical appearance is close to that of a human can play a vital role in significantly improving the child's behavior [26], [15]. Interventions using these robots have proven more successful when they address the core deficits of ASD rather than using free play as the mode of interaction [24]. The distinguishing features of robots are their high repeatability and their willingness to interact without complaint or fatigue [27].
Perception in autism has also been discussed in recent Japanese research using robots. This work is based on theory of mind and uses different animate and inanimate entities [28]. From a cognitive standpoint, no superiority in the visual processing of autistic children over typically developed children has been reported in [29]. In perception, the humanness of the robot is also important and has a significant effect [30].
Features that help increase child-robot interaction are presented in [31]. The engagement of autistic children has been measured during occupational therapy, and the relationship between task-driven valence and arousal conditions has also been studied [32].
An autism diagnostic protocol based on the Autism Diagnostic Observation Schedule (ADOS) using a NAO humanoid robot has been presented in [34], [35]. However, our proposed model represents an adaptive therapy using a multi-robot system for improvement in the joint attention and imitation of an ASD child. Table 1 shows the differences between our current research and the previous work done in [33], [34].
Moreover, another concern in the application of such systems is the end users' preference for evidence-based practice (EBP), which is not generally catered for in robotic therapy solutions [20].

A. CONTRIBUTION
Three important contributions of this research are: 1) Design and development of a single mathematical model for adaptive multi-robot based therapy of ASD children for both LTM-based joint attention as well as imitation. 2) Validation and effectiveness of MRIS system based on user study using CARS scale. 3) Notable improvement in multi interaction of an ASD child.
The multi-robot based adaptive model presented in this article, called MRIS (Multirobot-mediated Intervention System), addresses the EBP concern. The robots act as non-human partners in order to improve social communication skills among multiple persons at the same time. Moreover, the robots themselves act as both the therapist and the stimulator for an intervention, without the use of any body-worn sensor during the intervention. The results of the intervention for the improvement of joint attention and imitation support robot-mediated interventions (RMIs) as an evidence-based practice (EBP) in autism. This is achieved using a variety of sensors. First, instead of recording data manually, sensors have been integrated into the system to ensure the correctness of results and to avoid human error. Moreover, the results from the EEG headset before and after the intervention also verify the success. The intervention was planned so that all participants took part in the therapies. Furthermore, it was ensured that sensors did not touch the body during the intervention, as this may make the ASD child feel uncomfortable [11]. The ultimate aim of this research is to find, in collaboration with clinical experts, the parameters that can improve the multi-communication skills of an ASD child using adaptive robotic interventions.

II. MRIS ARCHITECTURE
Our MRIS architecture, shown in Fig. 1, is based on the model proposed by Zheng et al. [35]. In that previous research [35], the model is adaptive for the JA module only. Moreover, the previous model does not focus on the interaction of an ASD child with multiple agents simultaneously. In contrast, our proposed model introduces two adaptive modules, i.e. for joint attention as well as imitation, to improve the multi-communication of an ASD child. These two modules are discussed in detail in the following sections.

A. LTM-BASED JOINT ATTENTION PROTOCOL
The interaction protocol for joint attention of MRIS uses least-to-most (LTM) cues, as shown in Fig. 2. LTM has been used extensively as a tool for screening and diagnostics of ASD [36]. The child is first introduced to the least intruding stimulus. If required, the child is moved to the next level, which is a more prominent stimulus than the previous one [37], [38]. Robot-mediated interventions have used this protocol for teaching imitation skills to ASD children [39]. The LTM protocol assists only when required.
Our designed protocol is based on three steps:
1) Visual cues: The first level of the joint attention protocol uses visual cues. Two types of visual cues are developed, "Rasta" (changing the eye color of the robot in a cyclic manner) and "Blinking"; these are considered the least intruding stimulus.
2) Speech cues: In the second level, speech cues are added to the visual cues. The speech cues are "Hi" and "Hello". These speech cues are a more prominent hint for the ASD child than visual cues [38].
3) Motion cues: Level three combines visual, speech and motion cues. The motion cues added are "Move forward", "Move backward", "Stand-up", and "Sit-down".
As can be seen, the cues are ordered as per the LTM approach [35].
Previous research has used only a single robot with a non-adaptive model for the improvement of either joint attention or imitation in an ASD child. We have introduced the first mathematical model based on multiple robots for improvement in both joint attention and imitation based on joint attention. This model can be used to improve joint attention and imitation skills in ASD children, with prompts given only when required.

1) NETWORKING PROTOCOL FOR JOINT ATTENTION
The networking protocol for LTM-based joint attention is shown in Fig. 3. Two transmission control protocol (TCP) servers are implemented in the control computer. The control computer communicates with the NAO robots for activation of stimuli and feedback of data through the TCP servers during the experiment. The control modules are 1) the eye contact module, which records the eye contact duration of the child, and 2) the reinforcement stimuli module, which gives cues for the joint attention module. The two cues given in this module are rasta and blinking, as already discussed.
Both modules run in parallel. In Fig. 3 the modules are represented by numbers. The reinforcement stimuli modules are represented by C11 and C12, whereas the eye contact modules are represented by C21 and C22. The server sends commands to both clients, i.e. both NAO robots, at the same time via the router and receives feedback during the experiment, as shown in Fig. 3. This holds true for both modules. Moreover, file writing is done in two separate files.

The MRIS-LTM algorithm for joint attention is:
Step 1: RA(n) = Robot_Action(Robot_Action_List(Index))
        RB(n) = Robot_Behavior(RA(n))
        RESP(n) = Participant_Joint_Attention()
        IF (RESP(n) == ExpResp)
            Reward
            INSERT(PQ, RESP(n), Index)
            GO TO Step 3
Step 2: Index = Next_Robot_Action(RESP(n), PQ)
        n++
        GO TO Step 1
Step 3: Write(SORT(PQ), Subject.xlsx) and terminate

2) MATHEMATICAL MODEL FOR JOINT ATTENTION
Various prompts are used in intervention modeling for therapies. The prompts are usually verbal and motion cues from the robot along with some environmental factors. The environmental factors are usually the medium towards which the robot points, e.g. an LCD screen [35]. In this research no external factor has been introduced. The prompt cues were given by the robot in LTM order, i.e. visual, speech and motion cues. These cues differ in level of complexity for obtaining the child's response. The model starts with the least prominent cue for measuring the joint attention of the child. {V}, {S} and {M} are the libraries used for representing different reinforcement stimuli, where {V} represents visual, {S} speech and {M} motion stimuli. By combining these libraries we get different stimuli ranging from least to most effective, e.g. a visual stimulus used alone has less effect than the same stimulus combined with speech and motion stimuli. The MRIS-LTM model follows these steps:
Step 1: Index is passed to Robot_Action_List, which gives the robot action to be performed, and Robot_Behavior then defines the behavior of the robot. The Participant_Joint_Attention() function then starts; it records the joint attention and returns the associated current response. If the current response matches the expected response, i.e. RESP(n) == ExpResp, a reward is given and control is transferred to step 3. If the current response is not the expected response, i.e. RESP(n) != ExpResp, step 2 is activated and step 1 is repeated until Max_Limit is reached, at which point the data is recorded and the code terminates.
Step 2: Max_Limit is checked first. If it has been reached the code terminates; otherwise the current response is given to the Next_Robot_Action() function and step 1 is repeated until PQ is filled or any condition in step 1 is met.
Step 3: The code terminates after saving the PQ list, in sorted order, in an Excel file. Max_Limit represents the number of attempts allowed without getting the expected response. The Harel state chart of the joint attention model is presented in Fig. 4.
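The three steps above can be sketched in a few lines of Python. This is only an illustrative reading of the algorithm, not the authors' implementation: the function name `run_joint_attention`, the `get_response` callback, the default `max_limit` and the rule used for choosing the next action (simply moving to the next entry in the list) are all assumptions, and the Excel export of Step 3 is replaced by returning the sorted list.

```python
def run_joint_attention(actions, get_response, expected, start_index=0, max_limit=3):
    """Sketch of Steps 1-3: repeat robot actions until the expected response
    is seen or max_limit attempts have been used (names are hypothetical)."""
    pq = []                 # (response, index) pairs, standing in for the PQ list
    index = start_index
    attempts = 0
    rewarded = False
    while attempts < max_limit:
        action = actions[index]            # Step 1: select the robot action
        response = get_response(action)    # record the child's response
        attempts += 1
        if response == expected:
            rewarded = True                # reward, store, and go to Step 3
            pq.append((response, index))
            break
        # Step 2 (assumed rule): move to the next action in the list
        index = min(index + 1, len(actions) - 1)
    # Step 3: return the sorted PQ (writing to a file is omitted here)
    return sorted(pq), rewarded, attempts
```

For example, with `actions = ["V", "V+S", "V+S+M"]` and a child who only responds to the second cue level, the loop stops after two attempts and stores that level in the list.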

$$S_1 = \mathrm{XOR}\{\text{Initialization},\ \text{Execution},\ \text{Termination},\ \text{Reward}\} \tag{1}$$

Here "Si" denotes the output from the different hierarchy levels and "i" denotes the hierarchy number. Each possible combination of operands refers to a state in the state machine diagram shown in Fig. 4. The states are arranged hierarchically, i.e. S1 is the top-level or parental state, S2 is the intermediate state and S3 is the leaf node state. Equations (1), (2) and (3) indicate the control operators used during the experiment. Fig. 4 depicts where these control operators are applied and shows which state will be active. For example, in the case of S1, the XOR gate represents that the output of the parental state is high only when a single input is high. In (3), S3 is represented by an OR gate such that if any stimulus in the input is high, the leaf node state S3 will execute. The stimuli are executed in LTM order. However, in the suggested model only one state can be active at a time.

It can be seen that S1, S2 and S3 are only used to trigger the respective level in the module once its conditions are met. The outputs themselves are analog and no data is discarded. Therefore the results in Tables 3 and 4 in the Experimental design section are the results of therapies performed within the state itself and are not Boolean values. S2 runs in parallel under the execution stage.

Two signals, i.e. timeout (TO) and target hit (TH), along with threshold values determine whether the system needs to shift to the next cue. The model places a check not only on the time within which the gaze of the child should be directed towards the robot, but also on the duration for which eye contact should be maintained in order for it to be counted as a target hit. For activation of the joint attention module, the minimum time for eye contact is 5 s. TO is triggered if the ASD child performs no action within the activation time of the module.
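The TO and TH signals described above can be illustrated with a small Python function. It is a sketch under stated assumptions: the name `classify_trial`, its parameters and the length of the response window are hypothetical (the text only fixes the 5 s minimum eye contact), and a gaze that arrives in time but is held for less than 5 s is treated here as a timeout, which the text does not spell out.

```python
def classify_trial(gaze_onset_s, gaze_duration_s, response_window_s, min_hold_s=5.0):
    """Return 'TH' (target hit) or 'TO' (timeout) for one cue presentation.

    gaze_onset_s is None when the child never looked at the robot.
    """
    # No gaze at all, or gaze arriving after the allowed window: timeout.
    if gaze_onset_s is None or gaze_onset_s > response_window_s:
        return "TO"
    # Gaze arrived in time and was held long enough: target hit.
    if gaze_duration_s >= min_hold_s:
        return "TH"
    # Assumption: a gaze held for too short a time does not count as a hit.
    return "TO"
```

With a hypothetical 10 s window, `classify_trial(2.0, 6.0, 10.0)` gives a target hit, while a 3 s glance or a gaze that starts after 12 s does not.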
To represent modules 1 and 2 in the execution stage, the first step of the MRIS LTM protocol, i.e. the least prompt cue level, is when each robot starts with visual cues, i.e. rasta and blinking, to measure the child's joint attention. This is represented by {V}. If the child does not achieve the threshold value for joint attention, the therapy moves to the next level, represented as {V+S}. If the child does not meet the threshold value at this level either, the therapy moves to the third stage, i.e. the highest level {V+S+M}. In the execution stage, all the modules, i.e. robot 1, robot 2 and gaze, work in parallel. Depending on the child's performance at any stage, he/she is rewarded at that stage after the completion of the therapy. The model also records the stage at which the child should start the next time he/she is introduced to the therapy. The threshold value is the hyperparameter of this model; for this research it is 50%. Fig. 5 shows the interaction of the child with the robots for the joint attention module.
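The escalation through {V}, {V+S} and {V+S+M} can be sketched as follows. This is a minimal illustration, assuming that each level yields a joint attention score between 0 and 1 that is compared against the 50% threshold; how that score is computed is not specified here, and the names `CUE_LEVELS`, `run_ltm_session` and `score_at_level` are hypothetical.

```python
CUE_LEVELS = ["V", "V+S", "V+S+M"]   # least to most prominent cue sets

def run_ltm_session(score_at_level, start_level=0, threshold=0.5):
    """Escalate through the cue levels until the score reaches the threshold.

    score_at_level is a callback (assumed) that runs one level and returns a
    joint attention score in [0, 1]. Returns (level_index, score); the level
    index can be stored as the starting stage for the next session.
    """
    level = start_level
    while True:
        score = score_at_level(CUE_LEVELS[level])
        # Stop when the threshold is met or the highest level has been tried.
        if score >= threshold or level == len(CUE_LEVELS) - 1:
            return level, score
        level += 1
```

Returning the final level index mirrors the statement that the model records the stage at which the child should start next time.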
The joint attention module is further explained using the mathematical equations in (4) and (5). The joint attention module is linked with the reinforcement stimulus, which is given by the robots to measure the joint attention of each subject. To execute this task, two modules run in parallel under O_JA: a module to measure joint attention and a stimulus module. These are the two operands of an "AND" operation, which represents parallel execution for the true state, as shown in (4). The first operand deals with joint attention recording for both robots, while the second operand deals with the reinforcement stimulus given by the robots. The mathematical model of (4) is further explained in (5), where "i" denotes the robot number, "k" denotes the number of eye contacts and "j" denotes the type of reinforcement stimulus. "n" and "m" belong to the real numbers. In our case we have three stimuli, denoted by STj as shown in (6), where Ri represents the robots presented in (7).
where n, m ∈ R.

Equation (5) can be further explained by illustrating it in an iterative manner:
1st iteration: j = 1 and i = 1, 2; k = 1 and i = 1, 2, where x1 is the duration of eye contact noted by the robot and ST1 denotes the visual stimuli given by the robot.
2nd iteration: j = 2 and i = 1, 2; k = 2 and i = 1, 2, where x2 is the duration of eye contact noted by the robot and ST2 denotes the visual + speech stimuli launched on robot one and robot two.
Similarly, for the third iteration ST3 denotes the visual + speech + motion stimuli launched on robot one and robot two. This process continues until completion of the therapy.

B. IMITATION PROTOCOL
The interaction protocol of MRIS imitation module uses the child's joint attention to activate the robot. This is done by allotting a certain time limit (5s) for which the child should focus towards the robot in order to activate it. The threshold time not only ensures activation of only one robot at a time but also makes the module adaptive. After eye contact is established with a particular robot, the robot starts its imitation tasks i.e. Move Forward, Move Backward, Raise Hands, Hands Down. These motion gestures are imitated by the child and are measured using Kinect to calculate the success rate.
Introducing the second robot in the experiment helps impart multi-agent communication skills along with improvement in imitation. The imitation functionality can also be used with only one robot; however, the additional robot is what allows communication skills to be practiced in a multi-agent scenario.
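The gaze-based activation described in this section can be sketched as a dwell-time check over a stream of gaze samples. The sketch below is an assumption-laden illustration: the sample format `(timestamp_s, target)`, the target labels and the function name `select_active_robot` are hypothetical; only the 5 s dwell requirement comes from the text.

```python
def select_active_robot(gaze_samples, dwell_s=5.0):
    """Return the first robot looked at continuously for dwell_s seconds.

    gaze_samples: iterable of (timestamp_s, target), where target is
    'robot1', 'robot2' or None (child looking elsewhere). Returns None
    if no robot reaches the dwell time.
    """
    current = None   # target of the ongoing gaze, if any
    start = None     # timestamp at which the ongoing gaze began
    for t, target in gaze_samples:
        if target != current:
            # Gaze moved: restart the dwell timer for the new target.
            current, start = target, t
        if current is not None and t - start >= dwell_s:
            return current
    return None
```

Because the timer restarts whenever the gaze moves, at most one robot can satisfy the dwell condition at a time, which matches the stated purpose of the threshold.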

1) NETWORKING PROTOCOL FOR IMITATION
The networking protocol of the MRIS imitation module is almost the same as that of the joint attention module, shown in Fig. 3. The modules are represented by Cij, where i is the server number and j is the client's serial number. The action module depends on the eye contact module, as discussed above, and is represented by C11 and C12.
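The server-to-two-clients exchange used by both modules can be illustrated with a small, self-contained Python sketch. It does not reproduce the authors' software: connected socket pairs stand in for the TCP connections through the router, the threads stand in for the two NAO clients, and the message format (`name:ack:command`) and the function names are invented for the example.

```python
import socket
import threading

def fake_robot(conn, name):
    """Stand-in for one NAO client: acknowledge the single command it receives."""
    with conn:
        command = conn.recv(1024)
        conn.sendall(name.encode() + b":ack:" + command)

def broadcast(connections, command):
    """Send the same command to every client, then read one reply from each."""
    for conn in connections:
        conn.sendall(command)
    return [conn.recv(1024).decode() for conn in connections]

def demo():
    server_ends, workers = [], []
    for name in ("robot1", "robot2"):
        # socketpair() returns two connected sockets, used here in place of
        # a real TCP connection between the control computer and a robot.
        server_end, client_end = socket.socketpair()
        worker = threading.Thread(target=fake_robot, args=(client_end, name))
        worker.start()
        server_ends.append(server_end)
        workers.append(worker)
    replies = broadcast(server_ends, b"rasta")
    for worker in workers:
        worker.join()
    for conn in server_ends:
        conn.close()
    return replies
```

Calling `demo()` sends the same cue command to both stand-in robots and collects one feedback message from each, mirroring the command and feedback paths in Fig. 3.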

2) IMITATION-BASED MODEL
During initialization a priority list (PQ) is loaded. The PQ list records the last successful robot action that was imitated by the ASD child. In step 1 the joint attention of the ASD child is captured with the gaze tracking module. If eye contact is established for 5 s, the imitation module starts. The time period of 5 s was selected based on observations made during experimentation: it should not be so short that it could overlap with the second robot's activation, and not so long that the child may lose focus before the response is conveyed. If the child's response matches the robot's action, the robot gives a reward by saying "Good Job" and performs the next action to be imitated by the child as per the protocol of the therapy. Following discussion with the therapist, a verbal response for encouragement was added only in the case of correct imitation. If the child performs an incorrect action, the robot does not produce a discouraging response. The description of the functions for both algorithms is presented in Table 2. The Harel state chart of the imitation model is presented in Fig. 6. Fig. 7 shows the child engaged in the imitation module with both robots.
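The reward rule described above (praise on a correct imitation, no response on an incorrect one) can be sketched as a short loop. This is an illustration only: `run_imitation`, `observe_child` and `say` are hypothetical names, the posture comparison is reduced to matching labels, and moving on to the next action after a failed imitation is an assumption that the text does not confirm.

```python
def run_imitation(actions, observe_child, say=print):
    """Run the robot's imitation actions and score the child's responses.

    observe_child(action) is an assumed callback returning the label of the
    posture the child performed (e.g. from a skeleton tracker).
    """
    results = []
    for action in actions:
        imitated = observe_child(action) == action
        if imitated:
            say("Good Job")          # verbal reward only for correct imitation
        # No discouraging response is produced on an incorrect imitation.
        results.append((action, imitated))
    success_rate = sum(ok for _, ok in results) / len(results) if results else 0.0
    return results, success_rate
```

With the four motion gestures listed earlier and a child who misses one of them, the function returns a success rate of 0.75 and the robot speaks three times.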
The MRIS-LTM mathematical model for imitation based on joint attention is shown in (8), (9), (10) and (11). Here "Si" denotes the output from the different hierarchy levels and "i" denotes the hierarchy number. The states are arranged hierarchically, i.e. S1 is the top-level or parental state, S2 is the intermediate state and S3 is the leaf node state, as discussed for (1), (2) and (3). The only difference is in the leaf nodes of the execution state, i.e. S3, which reflects that only one robot will perform either of its two imitation tasks. This can be seen in the execution stage in Fig. 6. Each possible combination of operands refers to a state in the state machine diagram shown in Fig. 6, which also depicts where the control operators are applied and which state will be active.
OR{Raise hands, Hands down}} (10)

The imitation module is further explained using the mathematical equations in (11) and (12). Equation (11) represents the imitation module based on the joint attention of the child, which triggers this module when eye contact is established. To execute this task, the joint attention module and the imitation module run in parallel. For this purpose an "AND" operation is considered best to represent (11). The mathematical model of (11) is further explained in (12),
where n, m ∈ R.

Here O_JA→IM denotes the output from joint attention integrated with the imitation module (state machine). "i" denotes the robot number, "k" denotes the number of eye contacts and "j" denotes the type of imitation. "n" and "m" belong to the real numbers, and IMj is the imitation sequence executed by the robots.

For further explanation of the mathematical model, (12) is illustrated below in an iterative manner:
1st iteration: i = 1 and j = 1, 2; k = 1 and i = 1, 2, where xi is the duration of eye contact noted by the robot and IMj denotes the imitation tasks performed by robot 1.
2nd iteration: i = 2 and j = 1, 2; k = 2 and i = 1, 2, where xi is the duration of eye contact noted by the robot and IMj denotes the imitation tasks performed by robot 2.
The iterations continue until completion of the therapy session.

C. HARDWARE
MRIS uses two NAO humanoid robots for ASD therapy. Due to its anthropomorphic appearance and high programmability, NAO is widely used for the purpose of therapy [40]. The robot controller uses the built-in functions along with the LTM protocol for joint attention. Two kinds of interventions are embedded in the controller itself, i.e. the joint attention intervention module and the imitation module. The joint attention intervention, i.e. an LTM-based protocol therapy, is carried out first, followed by the imitation module, which uses the child's eye contact to activate the robot. Both modules are autonomous, i.e. adaptive. Gaze tracking and posture recognition modules are used for the joint attention and imitation interventions, respectively. Gaze tracking is done using the NAO robots' cameras. During the joint attention module two things are noted for gaze attention: 1) the delay in making eye contact with the robot and 2) the time duration for which eye contact is made. Imitation by the child is recorded using Kinect, which measures the body posture of the child and matches it with the robot's posture during imitation therapy.

D. ADAPTIVE CLOSED LOOP SUPERVISORY CONTROL
This is the central module of MRIS, as it controls the cues based on the child's response and then sends the corresponding command to the robot so that it can change its behavior accordingly. In the adaptive module, the algorithm decides on its own whether the child is following the command. If the imitation is performed correctly, the algorithm switches to the next level or command according to the protocol of the therapy. In the joint attention module, the adaptive closed-loop supervisory control gets its feedback from the NAO robots' cameras, whereas for the imitation module the posture recognition information is recorded by Kinect.

A. SUBJECTS
The MRIS system was tested on 12 ASD children, 11 males and 1 female. They were recruited from the Autism Resource Center (ARC). The study was approved by the autism specialist and the director board of ARC. The recruited participants had already been clinically evaluated by experts based on the Childhood Autism Rating Scale (CARS) criteria. The statistical characteristics of the participants are presented in Table 5 and Table 7. The parents of these children also signed a consent form for the therapy.

Fig. 7 shows the environment setup for therapy. The two robots are placed in front of the child, who sits on a comfortable plastic chair during the joint attention module so as to be at a height at which he/she can make eye contact with the robots. For the imitation module the child stands in front of the robots. The robots are 1 m away from the child and from each other. They were placed in an arc-like arrangement facing the child.

C. EXPERIMENT DESIGN
MRIS follows the experiment architecture explained below for 1) human-robot interaction without any inter-robot interaction and 2) human-robot interaction along with inter-robot communication. These two strategies are presented as individual experiment architectures, and it is not the focus of this paper to compare them with each other.

1) HUMAN-ROBOT INTERACTION WITHOUT INTER ROBOT COMMUNICATION
The ASD child is taken to the EEG area before the start of the intervention to measure brain activity and attentiveness. There he/she sits on a comfortable chair and counts from 1 to 10. After a delay of 30 s, he/she is asked to read the alphabet. After this reinforcement activity the child's brain activity is measured using the EEG neuroheadset. The child is then seated on a chair facing both robots in the intervention area. The robots start with their first intervention therapy, i.e. the LTM-based adaptive joint attention module. The child's response is recorded by the NAO robots' cameras. After the intervention is complete, the child is again taken to the EEG room for measurement of the brain state after therapy. The child is then introduced to the second intervention, i.e. imitation. This intervention follows the same protocol for measuring brain activity before and after the therapy. In this therapy, as the child enters the intervention area, both robots give stimuli by flashing their eyes (the same color is used for both robots), after which the robots wait for the child to maintain eye contact with either robot for at least 5 s. Once eye contact is established, that robot is activated for the imitation activity. This intervention thus uses the joint attention module in such a way that the imitation behavior of the robot is activated by the eye contact of the ASD child.

2) HUMAN-ROBOT INTERACTION WITH INTER ROBOT COMMUNICATION
In this therapy we have introduced the concept of inter-robot communication along with the human-robot interaction of the ASD child. The main rationale for introducing inter-robot communication is that in a daily multi-communication setting, one may also need to listen to or watch others' communication. This inter-robot communication is carried out at the start of the experiment. Initially both robots are sitting and facing each other. One of them stands up and says "hello" to the partner robot while waving. The partner robot responds by standing up and saying "hello/hi", coupled with a waving action.
During this time period, the ASD child's response is recorded as a listening task. Following this, one robot turns towards the child and starts communicating in a similar manner. The robot that communicates with the child is randomly selected. The response of the child is then noted in terms of joint attention (the child's eye contact), imitation (waving of the hand) and speech. It was also observed during experimentation that the children paid attention to the robots and shifted their gaze properly between the robots during their inter-communication. The whole arrangement is shown in Fig. 1. The dotted line shows the inter-robot interaction carried out in these experiments.
Eight sessions for each intervention were conducted over a period of two months for each type of experiment. The entire experimentation was carried out over a period of four months. The weekly progress of the child was recorded for both interventions. Each session involved several trials, and the average over these trials was taken to calculate the overall success. The therapy was scheduled so that all participants were involved in both interventions.
However, for EEG recording before and after interventions, some participants felt uncomfortable in wearing EEG headset. Their data has not been included in the research. To record the initial engagement of the child with robots as a baseline measure, the participants' attention towards the robot was recorded for several trials until a stable baseline measure was achieved. Therefore, the first experimental value for each intervention was taken as the baseline parameter to measure the success over the intervention. Moreover the results for both types of experimentation i.e. joint attention and imitation module were compared using CARS score before and after the intervention as shown in Table 5. Table 5 explicitly shows the improvement in joint attention and imitation skills of each child before and after the experimentation along with the overall CARS improvement.

3) EEG SIGNAL PROCESSING
EEG data was acquired at a sampling rate of 128 Hz. The EEG bands, i.e. alpha, beta and theta, were extracted by applying a band-pass filter to the data. The power in each band was computed and the difference was estimated to distinguish the attentive and non-attentive states of the child. In this way the power of each band was used to estimate the cognitive brain state of the child [41]. The power of the alpha, beta and theta bands was measured during the time interval for which the person interacted with the child. We used video recording and a stopwatch to measure the time interval at which the stimuli were given. The average SNR calculated before was 1.06 and after was −11.28. A second-order filter was used, for which the frequency before filtering was F1 = 8 Hz and after filtering was F2 = 12 Hz.
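As an illustration of per-band power, the sketch below estimates the power in the theta, alpha and beta bands of a signal sampled at 128 Hz. It deliberately swaps the band-pass filtering described above for a plain discrete Fourier transform, so it is not the processing chain used in the study. The alpha edges reuse the 8 Hz and 12 Hz values quoted in the text, which is an interpretation; the theta and beta edges are conventional values assumed for the example, and all names are hypothetical.

```python
import cmath
import math

FS = 128  # sampling rate in Hz, as stated in the text

# Band edges in Hz. Alpha uses the 8 Hz and 12 Hz values from the text
# (an interpretation); theta and beta edges are assumed conventional values.
BANDS = {"theta": (4, 8), "alpha": (8, 12), "beta": (12, 30)}

def band_powers(samples, fs=FS, bands=BANDS):
    """Sum the DFT power of `samples` inside each frequency band."""
    n = len(samples)
    powers = dict.fromkeys(bands, 0.0)
    for k in range(1, n // 2):
        freq = k * fs / n
        # Direct DFT coefficient for bin k (slow but dependency-free).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2 / n ** 2
        for name, (low, high) in bands.items():
            if low <= freq < high:
                powers[name] += power
    return powers
```

Feeding in two seconds of a pure 10 Hz sine wave concentrates the power in the alpha band, which gives a quick sanity check of the band assignment. Comparing the band powers before and after a session would then correspond to the difference estimate mentioned above.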

4) DATA PROCESSING USING NAO
To measure eye contact, we used the upper camera of the NAO robot. The color space was BGR and the frame rate was 15 fps. NAO's API "ALGazeAnalysis" was used for two events associated with the closing and opening of the eyes.

5) DATA PROCESSING USING KINECT
Kinect was used to track the skeleton of the subject. Its frame rate varies depending on the speed of the processing device (a laptop) and ranges from 15 to 30 fps.

IV. RESULT

A. RESULTS OF HUMAN-ROBOT INTERACTION WITHOUT INTER-ROBOT COMMUNICATION

1) RESULTS OF JOINT ATTENTION MODULE
The results of the joint attention module were recorded as: 1) the eye contact of the ASD child, to measure the attentiveness of the child towards a stimulus given by a robot, as shown in Fig. 8; 2) the delay in shifting the gaze of the child from one robot to the other based on a given stimulus, which measures improvement in social and cognitive development (Fig. 9 shows the gaze shifting behavior of S1 for session 2); 3) the bias of the child towards either of the robots, to assess the improvement in multi-communication, as shown in Fig. 10; and 4) the interest level of the participants in the joint attention module, measured before and after the intervention using EEG, as shown in Fig. 11. Based on these parameters, the improvement in the behavior of the child is shown in Table 3. The table shows the improvement in joint attention of each child compared to the joint attention measured during the first week.

2) RESULTS OF IMITATION MODULE
The results of this module were assessed after measuring the interest level of the child using EEG, with a robot being triggered when the child established eye contact: 1) the imitation performed by the child when the robot gives a stimulus, used to measure motor skills, is shown in Fig. 12; 2) the social interaction based on the stimuli given by the two robots, with bias measured from the actuation of the robots, is shown in Fig. 13; and 3) the results in Fig. 14 correspond to joint attention along with the imitation module. The overall improvement in the imitative behavior of each child from week one is shown in Table 4.
Results for both types of experimentation, i.e., the joint attention module and the imitation module along with joint attention, were verified using the CARS score before and after the intervention, as shown in Table 5 and Table 8.
[...] ASD child with both robots, along with the average eye contact time and the number of eye contacts maintained. Table 7 shows the results for the different parameters considered in the updated therapy. In this therapy, we consider the waving and speech responses of an ASD child towards both robots, along with the attention paid by the child to the communication between the robots. The last three columns represent the percentage of success for the different subjects. Definitions of the acronyms used in Table 7 are given in Table 6.
The impact of this intervention can be seen in the pre- and post-intervention CARS scores reported in Table 8.

V. STATISTICAL ANALYSIS
Statistical analysis was performed for each intervention module using single-factor ANOVA. The result for the joint attention and EEG module was F value = 20.36, p-value = 1.74E-06 and F critical value = 3.28. Fig. 17 shows the ANOVA graph for the joint attention and EEG modules without inter-robot communication. The results for joint attention and imitation were F value = 23.93, p-value = 3.79E-07 and F critical value = 3.28. Fig. 18 shows the ANOVA graph for this module without inter-robot communication. The results for the last intervention module, i.e., measuring the joint attention and imitation skills of an ASD child with inter-robot communication, were F value = 4.52, p-value = 0.0185 and F critical value = 3.28. Fig. 19 shows the ANOVA graph for this module. Since we used single-factor ANOVA, we report the p-value along with the F statistic. We selected alpha = 0.05 as the threshold and obtained p-values below alpha in every case, e.g., p-value = 1.74E-06, showing that the results are statistically significant. Moreover, in both interventions the calculated F value is greater than the critical F value, so the null hypothesis is rejected.
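For readers unfamiliar with the test, the F statistic of a single-factor ANOVA can be computed as in the sketch below. The data are toy values chosen so the arithmetic is easy to follow. They are not the study's measurements, and the grouping of the real data is not described in the text. The reported critical value of 3.28 appears consistent with degrees of freedom (2, 33) at alpha = 0.05, i.e., three groups of 12 observations, but this is an inference rather than something the text states.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of the observations around their group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy data (hypothetical): three groups of three observations each.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_value, df_b, df_w = one_way_anova(toy_groups)

# Decision rule used in the text: reject the null hypothesis when F exceeds the
# critical value. The critical value depends on the degrees of freedom, so the
# paper's 3.28 applies to the paper's data, not to this toy example.
F_CRITICAL_REPORTED = 3.28
reported_f_values = {"JA + EEG": 20.36, "JA + imitation": 23.93, "inter-robot": 4.52}
rejected = {name: f > F_CRITICAL_REPORTED for name, f in reported_f_values.items()}
```

With the toy data, the between-group mean square is 3 and the within-group mean square is 1, giving F = 3 on (2, 6) degrees of freedom. All three F values reported in the text exceed 3.28.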

VI. DISCUSSION
Unlike previous research, our modules for both joint attention and imitation are adaptive. Various studies of the LTM-based prompt method have shown that it is not restricted to imitation or joint attention but can be used generally for any robot-mediated therapy. In the study presented in [42], the child was asked to imitate the robot's gesture. If the child fails, the robot points in order to improve the gesture.
In another study, the robot therapy begins by asking an open question; if the child is unable to answer it correctly, the robot adds a hint of the correct answer [43]. Similarly, the ARIA system uses an LTM-based protocol in a model for improving the joint attention of an ASD child [42]. Only one study presents a single-robot adaptive model, and it addresses joint attention only [40].
Moreover, the stimuli generated by both sources are the same, so no bias is introduced for the ASD child, unlike previous studies that used a screen or other sources as stimuli to measure the joint attention of a child. In addition, no external environmental factors are involved in our prompts, unlike those included in NORRIS [35].
The advantage of this model is that it does not require continuous engagement of a human therapist. Unlike a robot, a person finds it difficult to work as a therapist for extended continuous hours. Moreover, these robot-based therapies can be conducted at home. The absence of a human therapist also has disadvantages, particularly the question of how to manage the situation if the child becomes frustrated.
Another significant factor is the willingness of the child to undergo EEG recordings. Because children are sometimes reluctant to wear the device, it would be better to use some other device instead of EEG. Moreover, this research does not compare the two models, i.e., human-robot interaction without inter-robot communication and human-robot interaction with inter-robot communication. The focus of this study is not to show which therapy is better, as the protocols of the two therapies are different and depend on the intervention to be conducted.
Future work includes applying the proposed model to a larger set of ASD children. The model can also be extended to more than two robots. Moreover, the effectiveness of the therapy can be evaluated for human-human interaction as well.

VII. CONCLUSION
Based on the results, this research has three main contributions.
1) Design and development of a single mathematical model for adaptive multi-robot therapy of ASD children, covering both LTM-based joint attention and imitation.
2) Validation of the effectiveness of MRIS through a user study using the CARS scale, as shown in Tables 5 and 8. This gives an insight into how effective the designed therapy is.
3) Notable improvement in the multi-interaction of an ASD child.
In this article, we have proposed the first autonomous multi-robot mediated therapy for joint attention and imitation, called MRIS. Two humanoid robots (NAO) were used as interaction partners of an ASD child. For the first intervention, the interaction of the child was recorded in two different modules, i.e., the joint attention module and the imitation module. In the joint attention module, the child's gaze was tracked using the NAO camera to observe eye contact and the delay in making contact after the stimulus was given. The prompts implemented in this module are based on the LTM-RI hierarchy. In the imitation module, activation depended on the eye contact of the ASD child, which makes the module itself adaptive. The child's imitation was measured over a series of experiments to observe any improvement in the child's behavior.
The second intervention involved inter-robot communication during which the child's behavior was recorded when the robots were communicating with each other. This is a normal protocol in daily life communication when one may also need to watch or listen to others' communication.
The improvement in multi-communication skills of the child with robots was recorded during intervention.
Each child was introduced to eight sessions of each intervention. Each intervention was carried out for two months, and the therapy was spread over a period of six months. All 12 subjects participated in each intervention. Participation was ensured by scheduling each session for both interventions over a whole week. When a session could not be conducted because of an unexpected reason or the child's absence, the experiment was conducted on another feasible day of the same week. In this way, all 12 subjects participated in all the sessions. Full participation was also ensured through meetings with the parents and the therapist.
The results show that the eye contact duration of each participant improved over the experiments, and every participant showed some degree of improvement. Moreover, the delay in making eye contact with the robot after the stimulus was given decreased, i.e., the subjects became more responsive to the stimuli. For the imitation module, it was observed that the participants actuated both robots almost equally in recurring experiments. The therapy therefore proved successful for multi-interaction, as shown in Fig. 15. However, while testing the system and gathering data, it was noticed that the percentage of success varied from child to child, as each individual was responsive to different types of stimuli depending on their level of autism.
The mathematical model for MRIS was validated by the cognitive brain state measured before and after the experiments using an EEG headset (Fig. 12 and Fig. 14). Moreover, the CARS scores before and after the therapy show a significant improvement in the communication skills of the ASD children. The statistical analysis performed on the results also firmly supports this conclusion.
An advantage of this model is that it does not require any body-worn sensors during the intervention that could make the child uncomfortable. Additionally, the improvement in the child's behavior is recorded using sensor integration, which reduces the chance of error and helps ensure the correctness of the results.