Leveraging Human-AI Collaboration in Crowd-Powered Source Search: A Preliminary Study

Source search is an important problem in our society, relating to finding fire sources, gas sources, or signal sources. In unexplored and potentially dangerous environments in particular, autonomous source search algorithms employing robotic searchers are usually applied. Such environments can be completely unknown and highly complex. Novel search algorithms combining heuristic methods with intelligent optimization have therefore been designed to tackle search problems in large and complex search spaces. However, these intelligent search algorithms were not designed to guarantee completeness and optimality, and thus commonly suffer from problems such as local optima or endless loops. Recent studies have used crowd-powered systems to address complex problems that machines cannot solve on their own. While leveraging human intelligence in an AI system has been shown to make the system more reliable, whether the power of the crowd can improve autonomous source search algorithms remains unanswered. To this end, we propose a crowd-powered source search approach enabling human-AI collaboration, which uses human intelligence as external support to improve existing search algorithms, while reducing human effort through AI predictions. Furthermore, we designed a crowd-powered prototype system and carried out an experiment with both experts and non-experts, completing 200 source search scenarios (704 crowdsourcing tasks). Quantitative and qualitative analyses showed that the source search algorithm enhanced by the crowd achieved both high effectiveness and efficiency. Our work provides valuable insights into human-AI collaborative system design.


Introduction
Source search problems are common in nature and in our daily lives, such as animals finding an odor source to acquire food in the wild, or people searching for the emission source of air pollution. Traditional search algorithms, such as tree search and graph search algorithms, work well in limited search spaces, where completeness and optimality can be proved; in large and complex search spaces, however, completeness and optimality can no longer be guaranteed. Therefore, these search algorithms may return a local optimum, or even no solution, instead of the global optimal solution. Researchers and practitioners have noticed this issue [3].
Recent work has focused on crowd-powered systems and human-AI collaboration [4]. These approaches enable humans to take part in processes that are supposed to be completely controlled by AI or automatic machines, in order to solve complex problems. A crowd-powered system offers a new perspective in which humans get involved in the automatic problem-solving process to enhance the effectiveness and efficiency of algorithms [5][6][7][8]. Since human-AI collaboration has been proven feasible in a variety of domains [9], we see an opportunity to combine human rationales with search algorithms, to overcome the difficulties that current source search approaches usually encounter. However, whether the power of the crowd is really effective and efficient in improving existing source search algorithms remains unknown. To address this knowledge gap, in this work we are particularly interested in answering the research question: how can crowd-powered approaches improve the effectiveness and efficiency of source search algorithms?
To answer the research question, we designed a human-AI collaborative framework that improves an existing source search algorithm by crowdsourcing the problems that occur during the search. We implemented a prototype system in which a virtual robot autonomously searches for a source in complex simulated environments, to enable a user study. The crowd-powered system is responsible for detecting fatal problems, explaining the algorithm, giving suggestions, and generating tasks for humans to complete. When it is the human's turn, the human can either take full control of the robot or aid the robot in addressing problems. In particular, to better facilitate effective problem-solving, the system predicts the location of the source using Bayesian inference and sequential Monte Carlo methods, further assisting humans in making decisions and taking actions. We recruited 10 participants for this study, including 4 domain experts in the field of source search and 6 non-experts with no experience in source search, to evaluate the proposed human-AI collaborative approach in randomly generated complex search environments.
In total, 200 source search scenarios (704 crowdsourcing tasks) were completed through the collaboration of humans and autonomous source search algorithms. The experiment shows that the crowd-powered system can improve the performance of a state-of-the-art source search algorithm, in both effectiveness (a 100% success rate, 22% higher than non-human methods) and efficiency (significantly fewer iterations/steps to find the source). Furthermore, we analyzed the system usability scores and cognitive workload scores reported by the participants. The results show that specific modes of interaction achieve better usability and lower cognitive workload for certain groups of participants in the prototype system. Our work provides useful suggestions, valuable insights, and important implications for leveraging human-AI collaboration to improve source search algorithms.
An earlier version of this work was published at ChineseCSCW 2022 and received the Best Student Paper Award. In this version, we developed three main extensions: 1) we further categorized the problems that autonomous algorithms can encounter during source search; 2) we further presented the necessary elements for effective human-AI collaboration; 3) we extended and formatted the post-task survey and analyzed participants' feedback systematically.

Related Work
We discuss related literature from two perspectives: crowd-powered systems and source searching.

Human-AI Collaborative Systems
The goal of designing a human-AI collaborative system is to leverage human rationales, combined with computer systems, to collaboratively solve complex problems. Scientists at Microsoft proposed human-AI interaction guidelines to help researchers and practitioners design studies and applications that use AI technologies [4]. The guidelines give 18 specific items categorized into four main parts, namely initially, during interaction, when wrong, and over time, to help people in the AI and Human-Computer Interaction (HCI) communities better design and evaluate human-AI collaborative systems. Similar work has provided important advice for better user experience in explainable AI and human-AI decision-making [10,11]. A typical collaborative system is the ESP game [12], an image labeling system later licensed by Google; it successfully gamified the image labeling process and produced a large amount of data while users were enjoying the game. CrowdDB is another example, leveraging human input to process queries that database systems cannot answer [5]. Bozzon et al. proposed CrowdSearcher, a system that answers search queries using the intelligence of crowds [6]. Furthermore, previous work combined human intelligence with machine learning methods to address problems such as conversational agents learning intents and text classification [9,13]. Recent studies recruited online users from crowdsourcing platforms and applied smart task scheduling and output prediction methods to produce city maps [14,15]. While human-in-the-loop systems have been shown to be effective in many domains, algorithm-in-the-loop systems have also started to play a critical role in human decision-making. Previous work introduced this concept and provided principles for human-AI decision-making and risk analysis [16,17]. In the domain of robotics and engineering, human-AI collaboration has long been used to address practical problems that can hardly be covered by theoretical models [18,19]. For instance, human-AI collaboration has been effectively applied to radiation source search and localization [20], spill finding and perimeter formation [21], and urban search and rescue response [22].

Source Search
In general, source search aims to determine the location of a source (of gas or signal, for example) in the shortest possible time, and is of vital importance for both nature and mankind [23][24][25]; examples include the search for prey [26], submarines [27], survivors [28], and pollution sources [29]. As a classical kind of source search algorithm, bio-inspired algorithms typically leverage a gradient ascent strategy to approach the source, based on the reasonable assumption that the signal emitted by the source has greater intensity near the source [30,31]. However, in the presence of environmental disturbances (e.g., turbulence), the intensity gradient of the emitted signal may be disrupted, undermining the feasibility of bio-inspired search algorithms [2]. An alternative kind of source search algorithm has been developed based on Bayesian theory [23]. Previous works [1,2] proposed cognitive search algorithms that model the source search process as a Markov Decision Process. To further enhance the performance (i.e., success rate and efficiency) of a search algorithm, multi-robot collaboration mechanisms [32][33][34] have been designed and adopted. However, when source search happens in complex environments, the search process often encounters fatal problems, resulting in incorrect outcomes. In this work, we designed a prototype system for source search in complex environments, and carried out a user study with this system to answer our research question.

Methods
In this section, we propose a method that leverages human-AI collaboration to improve existing autonomous source search algorithms. We organized a discussion with three experts (including two authors of this work) in the domain of source search, with the aim of answering the following three questions: (1) What fatal problems could happen during the search process?
(2) What kinds of information are necessary for humans to understand the problem that the search process is having?
(3) If a search algorithm is troubled by a problem mentioned before, how could the problem be addressed by humans?
Based on the discussion, we designed a crowd-powered framework that combines human intelligence with AI to overcome the problems that current source search algorithms often encounter. We classified the common problems that can occur during the search process, and show how these problems can be understood and addressed by crowds from the experts' perspectives. We further explain each step and elaborate on how humans and AI play their own roles within the proposed framework.

An Overview
In this section, we show the design of a crowd-powered source search method, and explain how human rationales can be used during the search process to improve the effectiveness and efficiency of search algorithms. There are numerous ways to enable crowd-powered methods in search, and human-AI hybrid intelligence can naturally play a role in each part of a search algorithm. In this study, we designed a simple framework that minimizes human effort and avoids changing the mechanism of the source search algorithm itself. An overview of the method is shown in Fig. 1.
The entire workflow consists of three main steps. The first step, initialization, defines the search goal, space, rules, and parameters. In the second step, the framework runs the source search algorithm using the rules and parameters defined in the first step, to search for the source in the search space. When the search stopping criteria are satisfied (e.g., the source is found or the space has been fully searched), the workflow ends and outputs the corresponding result.

Fig. 1 The crowd-powered method that integrates human-AI collaboration into the search process.

Detecting Problems
In the proposed framework, to minimize human effort, the algorithm takes care of the search process most of the time, while human interaction is enabled only when the algorithm encounters a problem that it cannot effectively address on its own. Therefore, in order to detect such problems and enable human-AI collaboration, the AI system needs to automatically monitor and acquire the state of the search process during the search.
Based on the discussion with experts around the question "what fatal problems could happen during the search process", we classified the common problems found in search algorithms, especially in informed search algorithms and other intelligent search strategies that cannot guarantee completeness and optimality [3].
(1) Local optimum: this problem can happen when the search space is very large or infinite, where traditional search algorithms cannot work efficiently. In this case, a search algorithm might find a local optimum and believe it is the final goal.
(2) No information gained: no new information is gained while the search state keeps being updated. This can happen to informed search or informative path planning strategies, which always require new information to guide the search direction. A wrong search direction can be fatal when the search space is very large or infinite.
(3) Dead end: this problem can happen when search heuristics, cost/reward functions, or stopping criteria are not appropriately set up for the search environment. In this case, the search algorithm might end up at a dead end and return no solution.
(4) Endless loop: this problem can happen to local or random search strategies, which do not record all previous search states. If the search heuristics or cost/reward functions are not correctly defined, the algorithm may get trapped in endless loops.
Problem detection can be achieved in various ways. A simple solution is a rule-based expert system that automatically detects patterns in the search history. It is also possible to train a machine learning model to detect the pattern of problematic search states. In this preliminary study, we simplified the problem detection part and did not use machine learning to recognize problem patterns. According to the experts' domain knowledge, for current source search algorithms, the local optimum and dead end problems usually eventually manifest as the no information gained and endless loop problems. Therefore, we proposed a simple rule-based mechanism to detect the no information gained and endless loop problems automatically: if the searcher 1) passes by the same spot 5 times within a specific time window, and 2) acquires zero measurement, the system stops the search process.
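As a concrete illustration, below is a minimal Python sketch of such a rule-based detector. The revisit threshold (5) and the zero-measurement condition follow the rule above; the window length, class name, and method names are illustrative assumptions, since the paper does not fix them.

```python
from collections import deque

class ProblemDetector:
    """Minimal rule-based detector for the "no information gained"
    and "endless loop" problems (a sketch, not the prototype's code)."""

    def __init__(self, window_size=50, max_revisits=5):
        # Keep only the most recent (position, measurement) pairs.
        self.window = deque(maxlen=window_size)
        self.max_revisits = max_revisits

    def update(self, position, measurement):
        """Record one search step; return True if the search should stop."""
        self.window.append((position, measurement))
        revisits = sum(1 for pos, _ in self.window if pos == position)
        no_information = all(m == 0 for _, m in self.window)
        # Stop when the searcher keeps passing the same spot while all
        # measurements within the window are zero.
        return revisits >= self.max_revisits and no_information
```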

Crowdsourcing Tasks
When a problem is detected, the method automatically generates crowdsourcing tasks to collect and leverage human intelligence. The key steps for successful task completion are task explanation and problem-solving. The AI behind the autonomous source search algorithm is responsible for explaining the task and giving valuable suggestions, while crowd workers (humans) are responsible for solving the problems that the algorithm cannot handle.

Task Explanation Using Artificial Intelligence
Explanation is a very important part of human-AI collaboration. When the algorithm has a problem, humans first need to understand how the algorithm works and what the problem is; only then can they help the AI address it. In this study, explaining source search algorithms and their problems is slightly different from explaining machine learning models: machine learning models are usually black boxes, and it is not easy for humans, even experts, to understand why a model makes a problematic prediction. In the context of source search algorithms, although how the algorithm works is clear to designers and experts, the actual search process (the search states) usually remains hidden from humans. Therefore, the AI system can give useful explanations and suggestions so that humans can easily understand what has happened and get tips for solving the problem.
Guided by the question "what kinds of information do you need in order to know the problem that the search process is having", we discussed with the experts and explicitly asked for the requirements of good explanations and suggestions. As a result, we present the necessary elements in Fig. 2:
(1) The AI system should present the current search state.
(2) The AI system should present the partial/entire search space.
(3) The AI system should present the partial/full history of the search process.
(4) The AI system should describe the problem of the current search state, and preferably show why it happens.
(5) The AI system could present the initial state.
(6) The AI system could suggest a solution.
The first three requirements are relatively easy to achieve, while the last three require actions from the AI. The AI system behind the source search algorithm should explain the problem in an understandable way; furthermore, it can give an estimation of the source location in the unknown environment, which helps the human better understand the current state and solve the problem later.
A variety of AI techniques could be used here to facilitate explanations and suggestions. In this study, the AI suggestion features a source estimation method that uses Bayesian inference and sequential Monte Carlo methods to show the posterior probability distribution of the source location (see the green particles in Fig. 3) [35,36]. The AI also suggests the area where the source is most likely located, called the "belief source area", computed with DBSCAN [37], to provide more information to help humans understand and address the problem.
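A minimal sketch of this suggestion pipeline is shown below, assuming a toy distance-decay likelihood; the likelihood model, clustering parameters, and function names are illustrative assumptions, not the prototype's actual values.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def update_particles(particles, weights, robot_pos, measurement, rng):
    """One sequential Monte Carlo step: reweight particles (hypothetical
    source locations) by the measurement likelihood, then resample."""
    dists = np.linalg.norm(particles - robot_pos, axis=1)
    rate = 1.0 / (1.0 + dists ** 2)  # toy model: detection rate decays with distance
    likelihood = rate if measurement > 0 else 1.0 - rate
    weights = weights * likelihood
    weights = weights / weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

def belief_source_area(particles, eps=1.0, min_samples=10):
    """Cluster the particles with DBSCAN and return the centroid of the
    densest cluster as the suggested "belief source area"."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(particles)
    core = labels[labels >= 0]
    if core.size == 0:
        return None  # posterior still too diffuse to suggest an area
    densest = np.bincount(core).argmax()
    return particles[labels == densest].mean(axis=0)
```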

Problem Solving Using Human Intelligence
After the AI detects the problem and provides useful explanations and suggestions, it is the human's turn to address the problem that the search algorithm cannot handle on its own. Guided by the question "if the search algorithm is really troubled by the problems you mentioned before, how would you address them", we discussed with the experts and summarized four main ways to overcome the problems and thereby improve the search algorithm:
(1) Taking over the search process. This requires humans to take over the search process until the problem is addressed. The search process no longer runs automatically; instead, humans decide the next search states to make the search proceed. This requires a good understanding of the search algorithm and provides maximum flexibility for humans to address the problem.
(2) Manual pruning. This requires humans to prune the problematic search branches in order to jump out of the current problem: some search paths look obviously problematic to humans according to their experience and rationales, yet may seem correct to the AI.
(3) Setting a temporary search goal. This requires humans to set up a short-term goal for the search process that temporarily replaces the end goal, letting the search make a detour to bypass the problem. The temporary search goal should be relatively easy to achieve, and should be "closer" to the end goal according to human understanding and reasoning.
(4) Adjusting the search algorithm on the fly. This requires humans to adjust the parameters, functions, rules, or heuristics of the search algorithm on the fly to solve the problem. It also requires substantial experience with such search algorithms, i.e., humans need to understand the setting, mechanism, and workflow of the search algorithm very well to address the problem effectively.
The key to using human rationales to improve source search algorithms is to find an appropriate way for the specific context, and to design effective means of human-computer interaction.

Prototype System Design
We designed a prototype system following the framework shown in Fig. 1.

Crowd-powered Infotaxis Algorithm
The source search algorithm used in this system is Infotaxis, a popular novel search strategy that is particularly effective for source search problems [1,38].
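Conceptually, Infotaxis moves the searcher to the neighboring cell that maximizes the expected information gain about the source location, i.e., the expected reduction in the entropy of the source posterior [1,38]. The sketch below illustrates this decision rule under the particle posterior from the previous section; the measurement model behind `expected_entropy_after` is left abstract, as it is problem-specific.

```python
import numpy as np

def entropy(weights):
    """Shannon entropy of the (normalized) particle weights."""
    w = weights[weights > 0]
    return float(-np.sum(w * np.log(w)))

def infotaxis_step(neighbors, expected_entropy_after):
    """Pick the reachable neighboring cell minimizing the expected
    posterior entropy. `expected_entropy_after(cell)` should average the
    post-update entropy over the possible measurement outcomes at `cell`,
    weighted by their predicted probabilities under the current posterior."""
    return min(neighbors, key=expected_entropy_after)
```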
When a problem is detected, a GUI-based crowdsourcing task is automatically generated. We use graphical elements to explain the algorithm as well as the problem, following the elements suggested in Fig. 2. In the prototype system, the goal of the explanation is to let users clearly "see" the problem, rather than deeply "understand" it. We showed both the direction the robot (searcher) wanted to go and the direction the robot had to go (because of obstacles) to help people understand why a problem could happen. We did not explain the reasons further, since a problem can be the consequence of many different factors; future work could focus on a deeper understanding of the problems.
When the human (a crowd worker) starts to operate, the crowdsourcing task provides two control modes. A full control mode allows the user to take over the search process and control every single step of the robot; an aided control mode allows the user to define a temporary goal (a target location), so that the robot pauses its current search activities and moves to the target location set by the user. We did not implement the other problem-solving means in the prototype system, since they require more expertise and incorrect operations may lead to a high failure rate. Future work could consider implementing more control modes, such as setting forbidden areas and tuning search parameters, representing manual pruning and on-the-fly algorithm adjustment, respectively.

Task Interface
As shown in Fig. 3, the task interface uses graphical elements to explain the search algorithm and the problem. Since the environment and the source are unknown at the beginning, most areas of the interface except the cells near the robot are black, and green particles (representing the posterior distribution of the source location) are randomly distributed, as shown in Fig. 3 (a). After clicking the [START] button, the search begins. When a problem is found, the system generates a task and gives the user a voice-based alert; the crowd worker can then click the [EXECUTE] button to execute the task, controlling the robot or planning a path for it, as shown in Fig. 3 (b). The crowd worker can click the [CONTINUE] button to complete the task and let the search process continue. When the source is found, the system shows the message "the source has been found successfully!" in red, and the user can click the [NEXT] button to switch to the next source search scenario. The prototype system was developed using Python 3.7 and the tkinter package. The code repository of the prototype system is shared publicly online together with the data, for the benefit of the community*.
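As an illustration of the interface's control flow, the following is a minimal tkinter sketch of the four-button workflow ([START], [EXECUTE], [CONTINUE], [NEXT]); the callbacks are placeholders, and the actual prototype additionally renders the grid, obstacles, robot, and particles on the canvas.

```python
import tkinter as tk

root = tk.Tk()
root.title("Crowd-Powered Source Search")

# Black canvas: the environment is unexplored at the beginning.
canvas = tk.Canvas(root, width=400, height=400, bg="black")
canvas.pack()

def start_search():    print("search started")             # placeholder
def execute_task():    print("human controls/aids robot")  # placeholder
def continue_search(): print("search resumes")             # placeholder
def next_scenario():   print("next scenario loaded")       # placeholder

for label, callback in [("START", start_search), ("EXECUTE", execute_task),
                        ("CONTINUE", continue_search), ("NEXT", next_scenario)]:
    tk.Button(root, text=label, command=callback).pack(side=tk.LEFT)

root.mainloop()
```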

User Study
We designed a user study to answer our research question. In this section, we introduce the experimental conditions, environments, measures, and procedure of the user study.

Experimental Conditions
As introduced in the previous section, we provided two interaction/control modes: Full Control (FC) and Aided Control (AC). The full control mode represents the problem-solving method in which humans take over the search process, while the aided control mode represents the problem-solving method of setting a temporary goal (letting the robot exit the current search state and navigating it to a manually defined location).
Furthermore, we used two baseline conditions in our experiment. The baseline 1 condition directly uses the state-of-the-art source search algorithm (Infotaxis), while the baseline 2 condition also uses our proposed automatic problem detection method and then navigates the robot to a random location in order to jump out of the problem. Note that the baseline 2 condition is thus itself an improvement over the state-of-the-art source search algorithm.

Experimental Environments
The source search activities are performed by a virtual robot in a simulated 2D square area of 20 m × 20 m. The search area is divided into a grid of 20 × 20 cells. Each cell contains an obstacle with probability P_o. P_o is set to 0.75 to give tasks a relatively high difficulty (more obstacles), since simple environments (with few obstacles) need little human assistance. In this study, we did not consider the specific types or shapes of obstacles: if a cell contains an obstacle, the cell is considered completely obstructed and cannot be entered or passed through by the robot.
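The environment generation described above can be sketched as follows; the random seed and helper names are illustrative assumptions.

```python
import numpy as np

GRID = 20   # the 20 m x 20 m area divided into 20 x 20 cells
P_O = 0.75  # per-cell obstacle probability used in the study

rng = np.random.default_rng(seed=42)        # illustrative seed
obstacles = rng.random((GRID, GRID)) < P_O  # True = fully obstructed cell

def is_free(cell):
    """A cell can be entered or passed only if it holds no obstacle."""
    x, y = cell
    return 0 <= x < GRID and 0 <= y < GRID and not obstacles[x, y]
```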
The prototype system was deployed on a PC, and all the participants executed tasks on the same PC to ensure a fair comparison. Participants were invited to a quiet lab, to make sure the experiment would not be interrupted by others.

Measures
In this study, we measure the effectiveness and efficiency of the source search process and outcomes. Effectiveness is measured by the success rate. Because the source search process can go on forever if the source is not found, we define a successful source search as one in which the robot finds the actual source within 400 steps (a step being one iteration of updating the search state). If the robot cannot find the source (with or without human involvement) within 400 steps, the task is considered failed. Efficiency is measured by the number of steps the robot takes to successfully find a source; failed searches are not taken into account when calculating efficiency. Furthermore, we measure the average execution time per task to gauge how engaged the participants are during task execution.
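In code, the two outcome measures reduce to the following sketch (names assumed for illustration):

```python
MAX_STEPS = 400  # a search succeeds only if the source is found within 400 steps

def evaluate(runs):
    """`runs` is a list of (found, steps) pairs, one per scenario.
    Returns (success rate, mean steps over successful runs only)."""
    successes = [steps for found, steps in runs if found and steps <= MAX_STEPS]
    success_rate = len(successes) / len(runs)
    efficiency = sum(successes) / len(successes) if successes else float("nan")
    return success_rate, efficiency
```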
Furthermore, we use two standard questionnaires to understand the perceived usability and cognitive workload of the crowd-powered source search system. Perceived usability is measured by the System Usability Scale (SUS) [39]. Using the ratings of the SUS items, we can derive scores in two aspects: usability and learnability [40,41]. We measure cognitive workload using the NASA Task Load Index (NASA-TLX).
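For reference, standard SUS scoring [39] maps ten 1-5 ratings to a 0-100 score: odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5, as sketched below.

```python
def sus_score(ratings):
    """Compute the standard SUS score from ten 1-5 ratings in
    questionnaire order (item 1 first)."""
    assert len(ratings) == 10
    total = sum((r - 1) if i % 2 == 0 else (5 - r)  # index 0 = item 1 (odd item)
                for i, r in enumerate(ratings))
    return total * 2.5
```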

Post-Task Interview
We used the SUS and NASA-TLX to measure perceived usability and cognitive workload. To gain a deeper understanding of human needs, we further carried out a post-task interview/conversation with each participant. The questions used in the interview are shown in Table 1. The interview consists of 10 questions. The first and last questions ask for general feedback and general comments. The other 8 questions acquire participants' thoughts on explainability and interaction: questions 2-5 address the effect of explanation and how it could be improved, while questions 6-9 address the effect of interaction and how it could be improved. The questions are answered either on 7-point Likert scales (from 1, Not at all, to 7, To a very large extent) or as open-ended text, except question 6, for which participants report which control mode they prefer (full control, aided control, both, or none).

Fig. 4 The experimental procedure. A participant must complete tasks using two interaction modes (i.e., Full Control and Aided Control). The order of the control modes was pre-scheduled for all the participants to avoid learning bias.

Procedure
The experiment was organized as shown in Fig. 4. We first asked participants to complete a demographic survey, providing basic background information about their age, gender, education level, and domain knowledge of search algorithms. After the demographic survey, we briefly explained our experimental scenario (i.e., finding a gas source) and how to use the prototype system.
After the demographic survey, participants were asked to complete the source search crowdsourcing tasks. Each participant completed 20 scenarios using the 2 control modes, i.e., full control and aided control. To avoid learning biases, the order of the control modes was pre-scheduled: half of the participants (2 experts + 3 non-experts) first executed 10 full-control scenarios and then 10 aided-control scenarios, and the other half executed them in the opposite order. After finishing each control mode (10 scenarios), participants rated system usability and cognitive workload using the standard questionnaires. Finally, we conducted the interview orally, supported by printed questionnaires. The interviews were recorded and later transcribed into text.

Results
We evaluated the effect of using the crowd-powered method in source search algorithms by measuring effectiveness (success rate), efficiency (the number of steps taken to find the source), human execution time, self-reported SUS scores, and self-reported TLX scores.

Participants
Table 2 shows the demographic information, including gender, age, education level, expertise, and domain knowledge, of the 10 participants recruited for our study. We asked 4 experts (academic researchers or engineers) who had been working on topics related to source search for at least 1 year to participate. Furthermore, we recruited 6 non-expert volunteers from our institute who had no experience in source search. People involved in developing the prototype system were not invited, to avoid potential biases. The experiment was approved by the ethics committee of our institute.

Source Search Result
We evaluated source search from three perspectives, namely effectiveness (the success rate), efficiency (the number of steps used to find the source), and human execution time per task. Results are shown in Table 3. The crowd-powered method proved effective: the success rate reached 100% in all but one case, approximately 22% higher than baseline 1 and 12% higher than baseline 2, showing that leveraging human input makes the algorithm's performance nearly perfect. Furthermore, we observed improved efficiency in the full control mode, in comparison with both the aided control mode and the baselines. In general, both experts and non-experts performed well while collaborating with the machine to solve the search algorithm's problems.
To better understand the differences among the experimental conditions, we performed statistical analysis of the efficiency (the number of steps) and the human execution time (per task). Since the numbers of search steps follow normal distributions according to normality tests, we applied a two-way ANOVA with the two factors considered in this study, expertise (expert vs non-expert) and control mode (full control vs aided control), as well as their interaction. The results of the statistical tests are shown in Table 4. The efficiency of source search shows a significant difference in terms of control mode (p = 0.026), meaning the full control mode achieves better efficiency regardless of expertise.
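The two-way ANOVA can be reproduced with statsmodels roughly as follows; the column names and the tiny dataset are illustrative placeholders, not the study's data.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Illustrative toy data only; replace with the per-scenario step counts.
df = pd.DataFrame({
    "steps":     [120, 95, 140, 110, 88, 132],
    "expertise": ["expert", "expert", "non-expert", "expert", "expert", "non-expert"],
    "mode":      ["FC", "AC", "FC", "AC", "FC", "AC"],
})

# Two-way ANOVA: main effects of expertise and control mode plus interaction.
model = ols("steps ~ C(expertise) * C(mode)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```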
Since the distributions of human execution time are not normal according to normality tests (p < 0.003 for all data groups), we applied pairwise Mann-Whitney U tests (with α values adjusted by Bonferroni correction). We found no significant difference in human execution time (p > 0.07 for all pairs), meaning neither expertise nor control mode significantly affects execution time.
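The pairwise non-parametric tests can be sketched as follows; `groups`, mapping condition names to execution-time samples, is an assumed structure for illustration.

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

def pairwise_tests(groups, alpha=0.05):
    """Pairwise two-sided Mann-Whitney U tests with a Bonferroni-adjusted
    significance level; returns the adjusted alpha and per-pair results."""
    pairs = list(combinations(groups, 2))
    adjusted_alpha = alpha / len(pairs)  # Bonferroni correction
    results = {}
    for a, b in pairs:
        _, p = mannwhitneyu(groups[a], groups[b], alternative="two-sided")
        results[(a, b)] = (p, p < adjusted_alpha)
    return adjusted_alpha, results
```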
The source search results convey three main messages: (1) the crowd-powered method is effective and efficient for improving source search; (2) through our design, non-experts can achieve performance similar to that of experts; (3) taking over the machine during problem-solving can further improve the efficiency of source search.

Usability
We asked all the participants to fill out the System Usability Scale (SUS) after completing each control mode (i.e., full control and aided control); each participant therefore provided 2 SUS responses. Since we recruited only 10 participants (20 SUS responses in total), we did not perform statistical tests. SUS scores are reported in Table 5. According to previous studies [39][40][41], the SUS can measure the usability and learnability of a system. While the full control mode (in which humans take over the search process) showed better search efficiency, the participants in general reported better perceived usability and learnability for the aided control mode. Interestingly, for all the non-experts, the SUS scores for aided control were not lower than those for full control, meaning they all preferred the aided control mode. Among the experts, however, we observed more diverse opinions, and the difference in experts' overall average SUS scores between aided control and full control was less pronounced (Aided Control 86.25 vs Full Control 80.63) than for non-experts (Aided Control 80.42 vs Full Control 65.00).
The SUS results convey one main message: the participants, especially the non-experts, generally perceived better usability when they were aiding the machine than when taking over the machine.

Cognitive Workload
To understand cognitive workload during human-machine collaboration, we administered the NASA-TLX scale after the participants completed each control mode (i.e., full control and aided control). Each participant again provided 2 TLX responses, resulting in 20 TLX responses in total, so we did not perform statistical analysis. NASA-TLX scores on the six dimensions, namely physical demand, mental demand, temporal demand, performance, effort, and frustration, as well as the overall TLX score, are reported in Table 6. We observed that the non-experts in general perceived less cognitive workload in the aided control mode than in the full control mode, across all TLX dimensions. In contrast, all of the experts rated their performance in the full control mode as not worse (3 out of 4 rated it higher) than in the aided control mode, the opposite of the non-experts. On the other dimensions, we again observed more diverse opinions from the experts, resulting in tiny differences between the aided control mode and the full control mode.
The TLX results convey two main messages: (1) the participants, especially the non-experts, generally perceived less cognitive workload when they were aiding the machine than when taking over the machine; (2) the experts reported better performance when they took over the machine during problem-solving, while the non-experts reported the opposite.

Discussion
The results have shown that a crowd-powered framework can improve the effectiveness and efficiency of a source search algorithm. We also made interesting observations regarding usability and cognitive workload. Clearly, these findings show that experts and non-experts have different needs and preferences.

General Feedback
In general, experts liked the system very much, and non-experts also liked the system to some extent, as shown in Fig. 5 (a). Experts thought that the system was professional, since it could "present various kinds of important factors and information for source searching in a professional way" (Participant 4, Male, Age 23, Expert), and that it could effectively address practical problems. In particular, the experts believed that humans play a very important role here, since the machine was not always reliable and the AI's suggestions of estimated source locations were not always precise.
According to the experts' understanding, during source search there is a trade-off between immediately going to the AI-suggested source location and exploring other areas to gain more information (we also observed that non-experts tended to immediately guide the robot to the suggested source location, without extra exploration). Humans were there to manage this subtle trade-off (Participant 1 and Participant 2).
Another important factor is trust. When the AI was explaining the search algorithm to the participants, some participants chose not to trust the AI (Participant 8). Others trusted it very much at the beginning but felt deceived when the AI was wrong (Participant 9). Based on the participants' feedback, we argue that a further explanation is needed to show the confidence level of the AI, so that users understand to what extent they should trust it, as studied in previous work as well [10,16,17].
In summary, the participants generally liked the system. Both the experimental results and the participants' feedback show that humans can play a critical role in the search process. While people consider human-AI collaboration a professional and effective means, an important issue was raised concerning human trust in AI systems. Trust is a key driver of inter-human collaboration; as AI techniques advance, building trust would be a big leap towards future human-AI collaboration.

How Can AI Explanation and Suggestion Be Improved?
In the post-task interview, we explicitly asked participants about their feelings on the current explanation and suggestion method and how it could be improved. As can be seen in Fig. 5 (b), (c), and (d), experts generally appreciated the explanation and understood the problem well, whereas non-experts did not report good scores for the explanation. We consider this finding reasonable, since the prototype system was designed according to the experts' discussions and only from the experts' perspectives.
After carefully analyzing all the comments from the 10 participants, we suggest that search algorithm explanation should particularly look into two aspects: personalized explanation and explanation in natural language.

Personalized Explanation
In the prototype system, the user interface used graphical elements with some text to explain the search algorithm as well as the problem the algorithm encountered. The explanation was identical for experts and non-experts. Participants' feedback suggested that personalization would be helpful: in general, experts looked forward to more information (2 out of 4), while non-experts looked forward to less information (3 out of 6). Interestingly, no expert wanted less information and no non-expert wanted more.
We learn from this feedback that people's focus differed substantially when they tried to understand the algorithm and the problem, meaning the explanation should be customizable and able to satisfy various preferences. More information for experts and less for non-experts is just one direction in which personalized explanation could advance. As previous studies have shown that a variety of factors, such as human emotions and moods, can affect performance [42][43][44][45][46], we argue that studying personalization in many different aspects is a crucial topic in developing effective and friendly human-AI collaborative systems.

Explanation in Natural Language
Using natural language and conversation as the medium for human-AI interaction has become a trend. Previous works have shown the advantages of conversational interfaces over traditional graphical interfaces for improving satisfaction, engagement, and output quality [47][48][49]. In terms of human-AI collaboration for search algorithms, we found that the participants would like a more natural means of communication: instead of professional explanations using graphical elements, both experts and non-experts looked forward to explanations in natural language, either text-based or voice-based.
Future work could focus on developing novel methods for automatically generating understandable explanations in human language. As the current prototype system and explanation methods did not help non-experts much in understanding the algorithm and the problem, it is worth exploring whether explanation in natural language could further improve the user experience for non-experts.

How Can Crowd-Powered Problem Solving Be Improved?
To understand whether the current crowd-powered problem-solving approach meets participants' needs, we explicitly asked questions in the post-task interview.
In response to the question "which control mode do you prefer?", only one participant (an expert) reported preferring the full control mode, in which humans take over the machine during problem-solving. All other 9 participants preferred the aided control mode, in which humans aid the machine. Furthermore, as shown in Fig. 5 (e) and (f), it is interesting that non-experts tended to think that their ideal solution was correctly achieved by the system, while experts gave moderate scores. This is possibly because experts tended to have higher standards and expected optimal solutions; nevertheless, the experts believed that the problems were eventually solved successfully by them. In the following subsections, we analyze each participant's comments on crowd-powered problem-solving, and provide insights for improving search algorithms using human-AI collaboration from two perspectives: personalized interaction and intuitive control.

Personalized Interaction
We have learned that participants would like personalized AI explanations; participants' feedback on interaction and problem-solving points in a similar direction. First of all, the control mode (full control vs aided control) could be personalized. It is interesting that most participants preferred the aided control mode, although full control achieved better efficiency. The main reason is that aided control required less effort while achieving the same goal as full control. Both experts and non-experts could see the advantages of the full control mode, although most of them preferred aided control. We still argue that the full control mode could work better in specific scenarios that require more precise control and more exploration, as Participant 2 (Male, Age 40, Expert) explained: "I think the environments in the current system are relatively simple, so the aided control mode is better; when the environment becomes more complex, a full control mode might play a more important role".

Intuitive Control
In our context, we coin the term "intuitive control" to describe a problem-solving approach that lets humans set a vague goal (one that does not define specific actions) for the search algorithm, following their instincts. As Participant 10 said, "Sometimes I just wanted to guide the robot to search in a certain direction rather than to move to a specific position". For the AI system, this requires a better capability of understanding human intention, which can further reduce human cognitive workload and achieve more intelligent collaboration.

Implications for Designing Crowd-Powered Systems
Our study has shown that human-AI collaboration is effective in improving source search algorithms. We have learned important lessons from the experimental results about how to better leverage human intelligence. These findings provide important implications for designing general crowd-powered systems.
Personalization. According to the participants' feedback, personalization is one of the most important factors that should be considered during the design phase.
Our experiment shows that the design achieved good efficiency and effectiveness, but did not fully satisfy non-experts' needs. Therefore, we suggest that personalization should play an important role throughout the entire process, since human needs vary considerably with background, education level, and personality.
Learning from Humans. We argue that a future crowd-powered system should be able to learn human behavior and adjust its own problematic actions accordingly. The prototype system successfully improved the effectiveness and efficiency of a state-of-the-art search algorithm. However, our proposed approach still constrains the machine's capability: the machine can only detect and explain problems in the ways the task designers decided, and humans can only help the machine through pre-defined control modes. Human-AI collaboration should be mutually beneficial: machines should help humans in problem-solving while simultaneously learning from humans.
Crowd Computing. The experiment has shown that non-experts can achieve output quality comparable to experts with regard to effectiveness, efficiency, and execution time. This is a rather positive finding, implying the potential of massive deployment using crowd computing techniques and the possibility of leveraging swarm intelligence. Researchers and practitioners in the field of human-AI collaboration should focus on lowering the barrier of interaction, to reach a larger number of users. Crowd computing has provided numerous new opportunities for multiple disciplines, including the AI and HCI communities [50]. Our system could be connected to prevalent crowdsourcing platforms to access more diverse users and acquire faster responses.

Limitations and Future Work
In this study, we only recruited participants from our institute, and all the participants were either full-time researchers or students. We acknowledge the limitation that the participants in our study are not fully representative. Future work could perform a proper power analysis, consider a larger sample size, and recruit participants from online freelancing/crowdsourcing marketplaces to obtain more general findings. We also acknowledge that the problem detection and explanation in the prototype system are rather simple, and that only two control modes were implemented. We consider this reasonable, as this is a first step in studying crowd-powered methods and human-AI collaboration for improving source search algorithms. As the results showed the feasibility of human-AI collaboration, we suggest that future work use more recent and advanced AI-based techniques to achieve more intelligent interaction.

Conclusions
In this work, we posed a research question investigating the effects of using crowd-powered approaches in source search algorithms. To answer it, we designed a framework enabling human-AI collaboration for improving existing source search algorithms and carried out an experiment in which experts and non-experts completed 200 source search scenarios (704 crowdsourcing tasks generated in total). The experimental results showed the feasibility of crowd-powered source search for improving both effectiveness and efficiency. Furthermore, we explicitly asked the participants to report their perceived usability, cognitive workload, and feedback, to deeply understand their needs. Finally, we provided design implications for future human-AI collaboration and crowd computing research.

Fig. 2
Elements to be presented while explaining the source search algorithm.

Fig. 3
Screenshots of crowdsourcing tasks generated by the prototype system. (a) The initial state of source searching. (b) A human is solving a problem. (c) The source is successfully found.

Fig. 5
Histograms of the questions answered by 7-point Likert scales.

Table 1
The questions used in the post-task interview.

Table 2
Demographic information of the participants of this study.

Table 3
Results of the source search experiment.

Table 4
Results of two-way ANOVA for the efficiency (# of steps) of source search.

Table 5
Results of the system usability. FC means the full control mode (humans take over the machine), while AC means the aided control mode (humans aid the machine).

Table 6
Results of the cognitive workload. FC means the full control mode (humans take over the machine), while AC means the aided control mode (humans aid the machine).