Supply-Demand Matching in Non-Cooperative Social Networks

The complex supply-demand matching problem is a social service computing problem that arises in the coordinated production of products and the supply of services. In this scenario, a demander needs several suppliers to provide services or products to complete a given task. The key to solving the problem is to build a supply network that covers the requester's requirements. Traditional work on collaboration in social networks has focused mainly on the "team formation problem", that is, building a team that covers all the skills required for a task. However, because of the complex characteristics of supply-demand matching in social service applications, team formation methods are limited and inefficient here, and no dedicated solution exists for complex supply-demand matching in social networks. This paper proposes a general framework for the complex supply-demand matching problem. Under non-cooperative constraints, social networks are used to build supply networks with low communication loss, and unnecessary cost is then reduced through cooperation.


I. INTRODUCTION
This paper studies the complex supply-demand matching problem in non-cooperative social networks (SNs). The demander wants to build a supply network that covers the task demand in both type and quantity. Unlike the earlier "team formation problem" [1]-[11], the supply-demand matching problem requires a redesigned solution because of the following characteristics: (1) the same service or product requirement in a task can be split among multiple providers that supply it collaboratively; (2) unlike contributors of skills, suppliers of services or products may face capacity caps; (3) a single supplier can supply multiple products and services to multiple demanders simultaneously. In addition, because members are selfish in practice [12], [13], the scheme must be feasible under non-cooperative constraints. At the same time, since individuals with closer social relations are more likely to be closer geographically and linguistically and to trust each other more [14], [15], SNs are used to select suppliers with closer social relations to the employer, which reduces the cost of communication [16]-[20].
The associate editor coordinating the review of this manuscript and approving it for publication was Leyi Wei.
Although there are many ways to use SNs to form professional collaborative teams [4], [21], [22], these methods have some defects and include practices that do not apply to this supply-demand problem: (1) they ignore the differences in social relationship quality caused by differences in SN structure; for example, an SN that forms a complete graph and an SN that is only a minimum spanning tree of that graph cannot be treated as equal in quality; (2) privacy policy leads to an imbalance of network information, which hinders a better employment plan (one that reduces the total cost of each task) across the whole SN; (3) unlike a professional cooperative team, many participants in the same supply network have no cooperative relationship with one another and supply products to the demander independently. Therefore, an effective supply-demand matching method should evaluate the social relationship between the supplier and the demander, expand the supply network by using some information about members' neighbors, and pay attention to global optimization.
In this context, we design a complete supply-demand matching method, which includes (1) a distributed negotiation-based supply network formation algorithm, which allows the supplier and the demander to decide whether to cooperate and to agree on a quotation in line with the interests of both parties, so as to initially build a supply network covering the task demand; (2) a preference algorithm, which uses the relationships between nodes in SNs to describe the impact of trust and communication issues on task cost; (3) a coordination algorithm, which is used after the initial construction of the supply network to further reduce unnecessary cost and pursue global optimization.
The theoretical analysis shows that the proposed method can solve the complex supply-demand problem while respecting members' selfishness and accounting for communication cost, and that it optimizes the cost of individuals and of the group at the same time. Finally, a series of experiments verifies the improvement of this method over the traditional Contract Net (CN) method; the parameters are varied to simulate different task scenarios and to observe how changing conditions affect the performance of the algorithm. The experimental results show that (1) compared with the traditional CN method, the proposed method effectively reduces cost; (2) the proposed method performs better in scenarios with low supply cost and small demand for a single product.

II. RELATED WORK

A. TEAM FORMATION PROBLEM
As mentioned above, the "supply-demand matching problem" differs from the "team formation problem", but current research on the latter can provide references for the former. The goal of the "team formation problem" is to form a team of experts covering all the skills required for the task. Considering the evolution, emergent behavior, operational independence, and management independence of the systems, Lim and Ncube [22] developed a team building method based on stakeholders' recommendations and proposed using social networks and crowdsourcing to identify and prioritize the stakeholders of systems projects. Taking into account the impact of social relations on team cooperation, Wang et al. [21] used the social neighborhood information of members to expand the connectivity graph to build a team and designed a mechanism based on distributed negotiation to improve social welfare. Instead of searching for experts in the whole SN, Sun et al. [8] proposed a team formation model that outsources tasks to social networks and selects a list of experts chosen by centrality as the seed, so as to reduce the communication cost of the team and narrow the search space.
Instead of targeting individuals, Chamberlain [2] first proposed the concept of Groupsourcing. In Groupsourcing, tasks are assigned to a group of people with different expertise who are connected through social networks. Compared with other methods, Groupsourcing offers a high-precision, data-driven, and low-cost method. In Groupsourcing, a new team is not formed from scratch to satisfy the skill requirements each time a complex task is published. Based on Chamberlain's research, Jiang et al. [3] formally defined the context-aware task allocation problem in group-oriented crowdsourcing, proposed a heuristic context-aware task allocation approach, and proposed a modeling method for natural worker groups in crowdsourcing, including groups with and without leadership. Besides, instead of solving the problem from the requester's perspective, Lykourentzou et al. [6] explored a "team dating" strategy, which is a self-organized team formation method. In this method, employees try and evaluate different candidate partners. Rokicki et al. [7] explored a cohesive strategy that includes self-organization. In this strategy, workers (initially in the form of one-person teams) can decide which team they want to join or who can join their team.

B. TEAM REVENUE OPTIMIZATION
Common methods to improve team profitability include reducing personnel and communication costs, improving the quality of team members, and making task allocation more rational.
Complex tasks can be decomposed into smaller subtasks, which can be executed either sequentially or in parallel by workers. In order to build a high-quality team by rationalizing task requirements, Jiang and Matsubara [23] demonstrated the superiority of vertical task decomposition over horizontal task decomposition in improving the quality of the task's solution, and clearly explained the optimal vertical task decomposition strategies under two revenue sharing schemes, which maximized the quality of task solutions.
Tran-Thanh et al. [9] studied how to hire higher-quality experts on a limited budget and redesigned the classic multi-armed bandit (MAB) model to solve this problem. They proposed an algorithm called bounded ε-first, which uses the first εB of its total budget B to derive estimates of the workers' quality characteristics (exploration), while the remaining (1 − ε)B is used to maximize the total utility based on those estimates (exploitation). Tran-Thanh also developed another algorithm, BudgetFix [24], which determines the number of interdependent micro-tasks and the price to pay for each task given budget constraints. Moreover, BudgetFix provides quality guarantees on the accuracy of the output of each phase of a given workflow.
Wolf et al. [25] were the first to realize that SNs play an important role in teamwork and that social connections among individuals might represent collaboration relationships (e.g., earlier collaboration on common tasks). The advantage of using these SNs is that individuals who have worked together previously are expected to work effectively as a team without much coordination overhead [16], [17]. Therefore, scholars consider establishing a cooperative team in SNs [18]-[20], so that the members of the team form a connected graph and work together effectively. Considering the selfishness of the members of SNs [12], [13], Wang et al. [21] further explored a team-building approach adapted to non-cooperative constraints. They model each individual as a selfish entity and use a negotiation mechanism to cut costs and improve social welfare.
VOLUME 8, 2020

III. PROBLEM DESCRIPTION
First, the social network $SN = \langle A, E \rangle$ is an unweighted undirected graph, where $A = \{a_1, a_2, \ldots, a_m\}$ is the set of all member nodes in the graph and $(a_i, a_j) \in E$ indicates that there is a social relationship between nodes $a_i$ and $a_j$. Each $a_i \in A$ is defined by a 4-tuple $\langle G(a_i), M(a_i), C(a_i), N(a_i) \rangle$, where $G(a_i) = \{g_1, g_2, \ldots\}$ is the set of product types that $a_i$ can supply; $M(a_i) = \{\max(a_i, g_1), \ldots, \max(a_i, g_{|G(a_i)|})\}$ gives the maximum quantity of each $g_j \in G(a_i)$ that $a_i$ can supply; $C(a_i)$ gives the unit cost $c(a_i, g_j)$ of each $g_j \in G(a_i)$ supplied by $a_i$; and $N(a_i)$ is the set of neighbors of $a_i$ in SN.
Task $t$ is defined by a 4-tuple $\langle I_t, G(t), R(t), E(t) \rangle$, where $I_t$ is the requester of task $t$; $G(t) = \{g_1, g_2, \ldots\}$ is the set of product types needed to complete task $t$; $R(t) = \{r(t, g_1), \ldots, r(t, g_{|G(t)|})\}$ gives the quantity of each product that task $t$ demands; and $E(t) = \sum_{j=1}^{|G(t)|} e(t, g_j)$ is the total value (not profit) that the product supplies in task $t$ bring to $I_t$.
The supply network $N_t$ for task $t$ is defined by a 4-tuple consisting of the task $t$, the set of employed members of $N_t$, the order set $O(t)$, and the unmet requirements $Us(t)$. Here $O(t) = \{(t, a_i, g_j, q(a_i, g_j, t), p(a_i, g_j, t), \eta(I_t, a_i)), \ldots, (t, a_p, g_q, q(a_p, g_q, t), p(a_p, g_q, t), \eta(I_t, a_p))\}$ is the set of order details contained in $N_t$, where $q(a_i, g_j, t)$ is the quantity of $g_j$ supplied by $a_i$ for task $t$ in the order, $p(a_i, g_j, t) = q(a_i, g_j, t) \cdot c(a_i, g_j)$ is the payment that $a_i$ receives for supplying $g_j$ for task $t$ (i.e., $a_i$'s salary), and $\eta(I_t, a_i)$ is the communication cost coefficient between $I_t$ and $a_i$. $Us(t) = \{us(t, g_j, q_{us}(t, g_j)), \ldots, us(t, g_q, q_{us}(t, g_q))\}$ is the unmet requirement of task $t$, where $q_{us}(t, g_j) = r(t, g_j) - \sum_{(t, a_i, g_j, \cdot, \cdot, \cdot) \in O(t)} q(a_i, g_j, t)$ is the unmet requirement of $g_j$ in $t$.
Some derived quantities are used many times in this paper. Although they can be expressed with the symbols defined above, they are given their own symbols for convenience. $\mu(t, a_i, g_j) = q(a_i, g_j, t) / r(t, g_j)$, with $\mu(t, a_i, g_j) \in (0, 1]$, is the ratio of the quantity of $g_j$ supplied by $a_i$ to $r(t, g_j)$ in the contract of task $t$. $\lambda(t, g_j)$ is the ratio of the demand for $g_j$ in $t$ that is still unallocated to $r(t, g_j)$, that is, $\lambda(t, g_j) = 1 - \sum_{(t, a_i, g_j, \cdot, \cdot, \cdot) \in O(t)} q(a_i, g_j, t) / r(t, g_j)$, with $\lambda(t, g_j) \in [0, 1]$. $th(t, g_j) = E(t) / r(t, g_j)$ is the threshold used to decide whether employing a node to supply $g_j$ for task $t$ is in $I_t$'s interest.
The problem is as follows: given task $t$ and $SN = \langle A, E \rangle$, the task initiator $I_t$ wants to build a supply network $N_t$ with no unmet requirement ($Us(t) = \emptyset$), such that (1) task $t$ is completely covered by $N_t$ in both the type and the quantity of its requirements; (2) $N_t$ contains no redundant members, and each member provides at least one product supply; and (3) the cost is reduced as much as possible in order to increase the revenue $Pro(t) = E(t) - \sum_{(t, a_i, g_j, \cdot, \cdot, \cdot) \in O(t)} q(a_i, g_j, t) \cdot (1 + \eta(I_t, a_i)) \cdot c(a_i, g_j)$. The above symbol definitions are given in Table 1.
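The bookkeeping behind these definitions can be illustrated with a minimal Python sketch. The `Order` record, the function names, and the numbers are assumptions made for this example only; they do not come from the paper. The sketch computes the unmet requirement, the unallocated ratio, and the revenue Pro(t) for a small order set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical record for one entry of O(t)."""
    supplier: str
    product: str
    quantity: float   # q(a_i, g_j, t)
    unit_cost: float  # c(a_i, g_j)
    eta: float        # communication cost coefficient eta(I_t, a_i)


def unmet(requirement, orders):
    """q_us(t, g_j): demand r(t, g_j) minus the ordered quantities of g_j."""
    left = dict(requirement)
    for o in orders:
        left[o.product] -= o.quantity
    return left


def remaining_ratio(requirement, orders):
    """lambda(t, g_j): share of r(t, g_j) that is still unallocated."""
    left = unmet(requirement, orders)
    return {g: left[g] / requirement[g] for g in requirement}


def revenue(total_value, orders):
    """Pro(t) = E(t) - sum of q * (1 + eta) * c over all orders."""
    return total_value - sum(o.quantity * (1 + o.eta) * o.unit_cost for o in orders)


# Illustrative numbers only.
requirement = {"g1": 10, "g2": 4}
orders = [
    Order("a1", "g1", 6, 2.0, 0.5),
    Order("a2", "g1", 4, 3.0, 0.0),
    Order("a2", "g2", 2, 5.0, 0.0),
]
print(unmet(requirement, orders))
print(remaining_ratio(requirement, orders))
print(revenue(100.0, orders))
```

In this example g1 is fully covered while half of g2 is still open, so the network is not yet a valid solution even though the revenue can already be evaluated.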

IV. SUPPLY-DEMAND MATCHING IN NON-COOPERATIVE SOCIAL NETWORKS
In this paper, a distributed negotiation-based mechanism lets the supply side and the demand side make interest-driven decisions on an equal footing, and a supply network covering the task demand is constructed. In this process, the social relations exposed by the expansion of the supply network are quantified to evaluate the cost of communication between SN nodes, which affects the employment results. Finally, after the initial construction of the supply network, the requester coordinates with other task requesters in the SN and exchanges some members of the supply network when doing so serves the interests of both sides, which improves both individual and group benefits. These steps are carried out by three algorithms, whose relationship is shown in Figure 1. We describe the three algorithms in turn.

A. SUPPLY NETWORK CONSTRUCTION ALGORITHM
The algorithm takes SN and the demand of $I_t$ as input and outputs a preliminary supply network that is in line with the interests of both the suppliers and the demander. Before describing the algorithm, several roles must be defined.
Definition 1 (Freelancer, Contractor, Supplier): For a product $g_j$, if a node $a_i$ with $g_j$ supply capability holds no $g_j$ order at a certain time, then $a_i$ is called a freelancer; if $a_i$ holds $g_j$ orders, but $N_t$ has not been completed and $g_j$ supply has not yet started, $a_i$ is called a contractor; from the start of supply until the end of supply, $a_i$ is called a supplier.

1) EXPANSION ALGORITHM
This algorithm drives the expansion of the supply network.

Algorithm 1 (Expand the Supply Network)
1. Initialize the member set of $N_t$ to $\{I_t\}$
2. $\forall g_j \in G(t)$, $q_{us}(t, g_j) = r(t, g_j)$
3. For each member $a_i$ of $N_t$
4.   For $a_x \in N(a_i) \cup \{a_i\}$
5.     setOpre($a_i$, $a_x$)
6.     execute the decide algorithm on $a_x$
7.     update $N_t$ and $a_x$
8.     If all requirements of $t$ are met
9.       Terminate this Algorithm
10.   End for
11. End for

Steps 1-2 initialize $N_t$ so that it contains only $I_t$ and no orders, and the remaining unmet demand equals the complete initial demand of $t$. The algorithm then traverses each member already included in $N_t$ together with its neighbor nodes (steps 3-4), executes the decide algorithm on each visited node (step 6), and updates $N_t$ according to the information returned by the decide algorithm (step 7) until all the requirements of $t$ are met (steps 8-9). The method called in step 5 is given later in the "preference algorithm".
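The expansion loop can be sketched in Python under strong simplifications. The sketch below is not the paper's implementation: the `decide` callback stands in for the three-stage negotiation and simply accepts as much as a node's capacity allows, the optimal-precursor bookkeeping of step 5 and the cost threshold are omitted, and the names `expand` and `make_decide` are hypothetical.

```python
def make_decide(capacity):
    """Build a toy decide callback; capacity is {node: {product: quantity}}."""
    def decide(node, unmet):
        accepted = {}
        for product, cap in capacity.get(node, {}).items():
            qty = min(cap, unmet.get(product, 0))
            if qty > 0:
                accepted[product] = qty
                capacity[node][product] -= qty  # capacity is consumed once
        return accepted
    return decide


def expand(requester, neighbors, requirement, decide):
    """Grow the supply network from the requester until demand is covered."""
    members = [requester]      # step 1: the network starts with I_t only
    unmet = dict(requirement)  # step 2: nothing is allocated yet
    orders = []
    for a_i in members:        # step 3: the list may grow while we iterate
        for a_x in [a_i] + list(neighbors.get(a_i, [])):  # step 4
            for product, qty in decide(a_x, unmet).items():  # step 6
                orders.append((a_x, product, qty))           # step 7
                unmet[product] -= qty
                if a_x not in members:
                    members.append(a_x)
            if all(q <= 0 for q in unmet.values()):          # steps 8-9
                return members, orders, unmet
    return members, orders, unmet


neighbors = {"I": ["a1", "a2"], "a1": ["a3"], "a2": [], "a3": []}
capacity = {"a1": {"g1": 3}, "a3": {"g1": 5}}
members, orders, left = expand("I", neighbors, {"g1": 6}, make_decide(capacity))
print(members, orders, left)
```

Here a3 is only reachable through a1, so it can be hired only after a1 has joined the network, which is the neighbor-by-neighbor growth that the algorithm describes.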

2) DECIDE ALGORITHM
This algorithm describes the details of step 6 in Algorithm 1 and completes the decision-making of both the supplier and the demander through a three-stage distributed negotiation.

a: OFFER STAGE
In this stage, $I_t$ issues an order offer to the nodes that meet the requirements.

Algorithm 2 (Decide-Offer Algorithm)
/* $Q_{temp}$ is the task amount of $a_i$ in the negotiation; $P_{temp}$ is $a_i$'s order compensation in the negotiation */
1. initialize $th(t, g_j) = E(t) / r(t, g_j)$
2. for each $g_j \in G(a_i)$
3.   if $c(a_i, g_j) \cdot (1 + \eta(I_t, a_i)) \le th(t, g_j)$
4.     if $\lambda(t, g_j) = 1$
5.       …
6.       …
7.       $P_{temp} = Q_{temp} \cdot c(a_i, g_j)$
8.       send offer($t$, $a_i$, $g_j$, $Q_{temp}$, $P_{temp}$)
9.     …
10.       …
11.       …
12.       $P_{temp} = Q_{temp} \cdot c(a_i, g_j)$
13.       send offer($t$, $a_i$, $g_j$, $Q_{temp}$, $P_{temp}$)
14. end for

Step 1 defines the threshold $th$ used to decide whether an employment is in $I_t$'s interest. The algorithm traverses each product $g_j$ that $a_i$ can supply and checks whether its unit cost is no greater than $th$ (steps 2-3); unless stated otherwise, "cost" in this paper includes both product cost and communication cost. If so, $I_t$ sends $a_i$ an order offer for $g_j$ in a "saturated" way: (1) when none of the $g_j$ demand of task $t$ has been allocated, the order quantity in the offer is the entire $g_j$ demand (steps 4-8); (2) when part of the $g_j$ demand has been allocated, the offer first covers all of the remaining unallocated $g_j$ demand and then replaces all allocated orders in the current $N_t$ whose cost is higher than that of $a_i$ (steps 9-13).
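The "saturated" offer can be illustrated with a short Python sketch. It treats both cases uniformly by offering the unallocated demand plus the quantity held by dearer orders; this unification, the tuple layout, and the function name `offers_for` are assumptions of the example rather than the paper's listing.

```python
def offers_for(candidate, costs, eta, thresholds, requirement, orders):
    """Return saturated offers (candidate, product, Q_temp, P_temp).

    costs:      {product: c(a_i, g_j)} for the candidate a_i
    eta:        communication cost coefficient eta(I_t, a_i)
    thresholds: {product: th(t, g_j)}
    orders:     existing orders as (supplier, product, quantity, effective_unit_cost)
    """
    offers = []
    for product, unit_cost in costs.items():
        if product not in requirement:
            continue  # the task does not need this product
        effective = unit_cost * (1 + eta)
        if effective > thresholds[product]:
            continue  # hiring this node would not be in I_t's interest
        allocated = [o for o in orders if o[1] == product]
        unallocated = requirement[product] - sum(o[2] for o in allocated)
        dearer = sum(o[2] for o in allocated if o[3] > effective)
        q_temp = unallocated + dearer  # equals r(t, g_j) when nothing is allocated
        if q_temp > 0:
            offers.append((candidate, product, q_temp, q_temp * unit_cost))
    return offers


existing = [("a1", "g1", 4, 4.0), ("a2", "g1", 3, 2.0)]
result = offers_for("a3", {"g1": 2.0, "g2": 1.0}, 0.5, {"g1": 5.0}, {"g1": 10}, existing)
print(result)
```

With an effective unit cost of 3.0, a3 is offered the 3 unallocated units plus the 4 units currently held by the dearer a1, while the cheaper a2 is left untouched.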

b: RESPONSE STAGE
In this stage, $a_i$ responds to $I_t$'s offer according to its own situation. Several concepts of capacity should be clarified first.
Definition 2 (Free Capacity, Locked Capacity, Forbidden Capacity): For product $g_j$, the surplus $g_j$ capacity owned by a freelancer is called free capacity; the $g_j$ capacity of a contractor is called locked capacity; the $g_j$ capacity of a supplier is called forbidden capacity. Because a supplier can supply a particular product to only one demander at a time, only free capacity can be offered freely to any demander.
After receiving an offer, $a_i$ responds according to its $g_j$ capacity type:
• Free capacity. $a_i$ sets the order quantity $Q_{temp}$ to the smaller of its remaining $g_j$ supply capacity $\max(a_i, g_j)$ and the order quantity $Q_{temp}$ suggested by $I_t$ in the offer, and then makes a positive response, agreement($t$, $a_i$, $g_j$, $Q_{temp}$, $P_{temp}$), to accept the offer;
• Locked capacity. As an important basis for the subsequent "coordination algorithm", $a_i$ responds to $I_t$ with mark($t$, $a_i$, $g_j$, $Q_{temp}$, $P_{temp}$, $t_2$, $q(a_i, g_j, t_2)$) to declare the quantity of $I_t$'s order that $a_i$ could have accepted if it had not been employed by $I_{t_2}$, that is, $\min[Q_{temp}, \max(a_i, g_j) + q(a_i, g_j, t_2)]$;
• Forbidden capacity. $a_i$ makes a negative response to refuse the offer.
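The three response rules can be written as a small Python sketch. The enum, the function name `respond`, and the tuple returned are hypothetical, and recomputing the compensation from the adjusted quantity is an assumption of this example.

```python
from enum import Enum


class Capacity(Enum):
    FREE = "free"            # capacity of a freelancer
    LOCKED = "locked"        # capacity of a contractor
    FORBIDDEN = "forbidden"  # capacity of a supplier


def respond(kind, q_temp, unit_cost, remaining_capacity, locked_quantity=0):
    """Return (response, quantity, compensation) for one offer.

    remaining_capacity plays the role of max(a_i, g_j);
    locked_quantity plays the role of q(a_i, g_j, t_2).
    """
    if kind is Capacity.FREE:
        qty = min(q_temp, remaining_capacity)
        return ("agreement", qty, qty * unit_cost)
    if kind is Capacity.LOCKED:
        # quantity a_i could have accepted had it not been hired for t_2
        qty = min(q_temp, remaining_capacity + locked_quantity)
        return ("mark", qty, qty * unit_cost)
    return ("refusal", 0, 0.0)


print(respond(Capacity.FREE, 7, 2.0, 5))
print(respond(Capacity.LOCKED, 7, 2.0, 1, locked_quantity=4))
print(respond(Capacity.FORBIDDEN, 7, 2.0, 0))
```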

c: CONFIRM STAGE
At this stage, $I_t$ determines the result of this decision according to the content of $a_i$'s response:
• If the response is agreement($t$, $a_i$, $g_j$, $Q_{temp}$, $P_{temp}$), $a_i$ is first allocated to supply the unmet $g_j$ demand $\lambda(t, g_j)$ in $N_t$; if $Q_{temp}$ covers $\lambda(t, g_j)$ and a surplus remains, $a_i$ goes on to replace (or partially replace) the high-cost $g_j$ orders in the current $N_t$, proceeding from high-cost orders to low-cost orders;
• If the response is mark($t$, $a_i$, $g_j$, $Q_{temp}$, $P_{temp}$, $t_2$, $q(a_i, g_j, t_2)$), $I_t$ records this mark for the subsequent "coordination algorithm";
• If the response is a refusal, the employment of $a_i$ to supply $g_j$ is abandoned.
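How an agreement is applied can be sketched in Python for a single product. The function name `confirm`, the tuple layout, and the rule that only orders dearer than the newcomer are replaced are assumptions made for this illustration.

```python
def confirm(orders, unmet, new_supplier, q_temp, new_cost):
    """Apply an agreement for one product.

    orders: existing orders as (supplier, quantity, effective_unit_cost)
    Returns (updated_orders, unmet_left).
    """
    take = min(q_temp, unmet)  # cover the unmet demand first
    surplus = q_temp - take
    assigned = take
    updated = []
    # Replace dearer orders, from the highest cost downwards.
    for supplier, qty, cost in sorted(orders, key=lambda o: o[2], reverse=True):
        if surplus > 0 and cost > new_cost:
            moved = min(qty, surplus)
            surplus -= moved
            assigned += moved
            qty -= moved
        if qty > 0:
            updated.append((supplier, qty, cost))
    if assigned > 0:
        updated.append((new_supplier, assigned, new_cost))
    return updated, unmet - take


current = [("a1", 4, 4.0), ("a2", 3, 2.0)]
updated, left = confirm(current, 3, "a3", 5, 3.0)
print(updated, left)
```

In the example a3 first fills the 3 open units and then takes 2 units away from a1, the only order that costs more than a3.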

B. PREFERENCE ALGORITHM
The preference algorithm describes the impact of trust and communication problems on task cost according to the relationships between nodes in SNs. It has three key steps:
• determine the optimal precursor;
• calculate the shortest distance;
• design an appropriate preference function to evaluate the communication loss according to the shortest distance.

Definition 3 (Previous Supply Network, Precursor, Optimal Precursor, Shortest Distance):
The previous supply network ($pre\_N_t$) of task $t$ is a network that includes all the nodes that have been employed during the construction of $N_t$ and their precursors. In $pre\_N_t$, if the demander $I_t$ can access $a_x$, a direct neighbor of $a_i$, through $a_i$, then $a_i$ is called a "precursor" of $a_x$, written $a_i \in pre(t, a_x)$. Among the precursors of $a_x$, the one with the shortest distance from $I_t$ is called the "optimal precursor", $opre(t, a_x)$. In $pre\_N_t$, the shortest distance $Dist(t, a_x)$ between $I_t$ and $a_x$ is $n$ if $a_x$ reaches $I_t$ after iteratively visiting optimal precursors $n$ times.

1) DETERMINE THE OPTIMAL PRECURSOR
Step 5 of Algorithm 1 is the point at which the optimal precursor of a node is determined, using the method setOpre($a_i$, $a_x$).
The method first checks whether the node visited in step 5 of Algorithm 1 is its own precursor, that is, whether $a_i = a_x$; if so, it does nothing (steps 1-2). If not, it records $a_i$ as a precursor of $a_x$ and traverses the precursor set of $a_x$ to update the optimal precursor of $a_x$ in task $t$ (steps 3-11). In this process, if the optimal precursor of $a_x$ changes to $a_i$, the shortest distance between $a_x$ and $I_t$ is recalculated by calDist($t$, $a_x$) (step 8).

Algorithm 3 (Determine the Optimal Precursor, setOpre($a_i$, $a_x$))
1. if $a_i$ == $a_x$
2.   do nothing
3. else
4.   $pre(t, a_x) = pre(t, a_x) \cup \{a_i\}$
5.   for each $a_y$ in $pre(t, a_x)$
6.     …
7.     …
8.     calDist($t$, $a_x$)
9.     …
10.     do nothing
11.   end for

2) CALCULATE THE SHORTEST DISTANCE
Step 8 of Algorithm 3 shows when the shortest distance between a node and $I_t$ is calculated. The calDist($t$, $a_x$) method works as follows: set a pointer at $a_x$ and move it to the optimal precursor of the current node, repeatedly, until it reaches $I_t$. The number of pointer moves is taken as the shortest distance between $a_x$ and $I_t$.
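The two methods can be sketched together in Python. This is an illustration under assumptions: the dictionaries `pre` and `opre`, the snake_case function names, and the choice to re-evaluate all known precursors on every call (rather than following the exact steps of Algorithm 3) are not taken from the paper.

```python
def cal_dist(requester, node, opre):
    """Count pointer moves from node to the requester along optimal precursors."""
    moves = 0
    while node != requester:
        node = opre[node]  # assumes every node on the path already has an optimal precursor
        moves += 1
    return moves


def set_opre(requester, a_i, a_x, pre, opre):
    """Record a_i as a precursor of a_x and keep the precursor closest to the requester."""
    if a_i == a_x or a_x == requester:
        return  # a node is not its own precursor; skipping the requester is an assumption
    pre.setdefault(a_x, set()).add(a_i)
    opre[a_x] = min(pre[a_x], key=lambda a_y: cal_dist(requester, a_y, opre))


pre, opre = {}, {}
visits = [("I", "a1"), ("I", "a2"), ("a1", "a3"), ("a3", "a4"), ("a2", "a4")]
for a_i, a_x in visits:
    set_opre("I", a_i, a_x, pre, opre)
print(opre["a4"], cal_dist("I", "a4", opre))
```

Node a4 is first reached through a3 at distance 3; once the shorter route through a2 is discovered, its optimal precursor switches to a2 and its distance drops to 2.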

Definition 4 (Communication Cost Coefficient, Upper Limit of Communication Cost Coefficient):
The communication cost coefficient $\eta(I_t, a_i)$ is calculated by the preference function and describes the additional cost incurred when network members cooperate. The upper limit of the communication cost coefficient, $\eta_{max}$, is set because the communication loss does not grow without bound as the social distance increases; it must converge to an upper limit, which has to be determined from the actual application. For example, the supply of precision instrument parts, such as aircraft instruments, requires more communication to ensure compliance with specifications and quality requirements, so both the communication cost and $\eta_{max}$ are higher; for orders such as food packaging bags, $\eta_{max}$ is lower because little additional communication is needed.
The preference function determines the communication cost coefficient between nodes according to their social relationship and should meet the following constraints: (1) the function should increase with the shortest distance between nodes; (2) for nodes at a long social distance, a further increase in distance no longer changes the communication cost noticeably, so the growth rate of the function should fall as the distance increases, that is, the derivative of the function decreases monotonically; (3) according to the theory of six degrees of separation, there are no more than six intermediate nodes between two points in a social network, so the communication cost coefficient should be close to the upper limit when the distance is 6.
With these constraints, the function $-e^{-x} + 1$ meets the requirements well: it increases monotonically on $[0, +\infty)$, its derivative gradually decreases to 0, it converges to 1, and it is already close to this limit at $x = 6$ (where $1 - e^{-6} \approx 0.9975$). The function image is shown in Figure 2. In addition, to meet the requirement of the upper limit $\eta_{max}$, the preference function is defined as $\eta(I_t, a_i) = \eta_{max} \cdot (-e^{-x} + 1)$, where $x$ is the shortest distance $Dist(t, a_i)$ between $a_i$ and $I_t$.
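The preference function is simple enough to evaluate directly. The short Python sketch below tabulates it for distances 0 to 6; the function name `eta` and the value chosen for the upper limit are assumptions for illustration.

```python
import math


def eta(distance, eta_max):
    """Preference function: eta_max * (1 - exp(-distance))."""
    return eta_max * (1.0 - math.exp(-distance))


ETA_MAX = 0.5  # hypothetical upper limit, to be set per application
values = [eta(x, ETA_MAX) for x in range(7)]
for x, value in enumerate(values):
    print(x, round(value, 4))
```

The printed values rise quickly for the first few hops and then flatten, which matches the three constraints: monotone growth, shrinking increments, and a value near the upper limit at distance 6.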

C. COORDINATION ALGORITHM
To optimize the high-cost supply networks caused by privacy restrictions in SNs, the coordination algorithm coordinates requesters with unbalanced social resources and lets them exchange their rights to employ contractors, so as to improve individual and group benefits at the same time.
The coordination algorithm addresses the following situation: during the construction of $N_t$, $I_t$ receives $a_1$'s response mark($t$, $a_1$, $g_j$, $Q_{temp}$, $P_{temp}$, $t_2$) and has to give up $a_1$ temporarily. After building $N_t$, $I_t$ finds that employing $a_1$ to supply $g_j$ would save cost $C_1$ compared with the current $N_t$. If $I_{t_2}$, the employer of contractor $a_1$, has alternative suppliers besides $a_1$ that can complete this part of the $g_j$ supply, and the cost increase $C_2$ after the substitution is smaller than $C_1$, there is room for coordination between $I_t$ and $I_{t_2}$.
The coordination algorithm is executed after $I_t$ completes the "supply network construction algorithm". It traverses every mark response that $I_t$ received during the construction of $N_t$ and, for each one, completes the coordination of the two requesters through a three-stage distributed negotiation.

1) REQUEST STAGE
In this stage, $I_t$ decides whether and how to issue a coordination request to $I_{t_2}$ based on its own situation and the content of the mark.
Based on the order quantity that $a_i$ reported in the mark, $I_t$ simulates an update of $N_t$ by "replacing the contractors whose unit supply cost of $g_j$ is higher than that of $a_i$, in order of cost from high to low". The replaced $g_j$ order quantity $Q_{temp}$ is compared with the order quantity $q(a_i, g_j, t_2)$ that $a_i$ holds at $I_{t_2}$. If $Q_{temp} \ge q(a_i, g_j, t_2)$, $I_t$ calculates the cost $C_1$ that the simulated $N_t$ saves compared with the original $N_t$ and issues a coordination request (steps 1-22); otherwise, it gives up the coordination (steps 23-24), because the coordination would reduce the order quantity of $a_i$, which is against the interests of $a_i$ and the principle of non-cooperation.

Algorithm 4 (Cooperate-Request Algorithm)
1. The nodes $a_{exp}$ in $\{a_{exp} \mid c(a_{exp}, g_j) \cdot (1 + \eta(I_t, a_{exp})) > c(a_i, g_j) \cdot (1 + \eta(I_t, a_i)) \text{ and } a_{exp} \text{ is an employed member of } N_t\}$ are arranged in descending order of $c(a_{exp}, g_j) \cdot (1 + \eta(I_t, a_{exp}))$, and the $\mu(t, a_{exp}, g_j)$ of the arranged $a_{exp}$ are filled into list in turn.
2. $\mu'(t, a_i, g_j) = Q_{temp} / r(t, g_j)$
3. $x = 0$
4. $C_{pre} = 0$
5. if list is empty
6.   do nothing
7. else if $\mu'(t, a_i, g_j) \le list(x)$
8.   $C_1 = Q_{temp} \cdot (c(a_{exp}, g_j) \cdot (1 + \eta(I_t, a_{exp})) - c(a_i, g_j) \cdot (1 + \eta(I_t, a_i)))$ /* $a_{exp}$ is the node with $\mu(t, a_{exp}, g_j)$ == list($x$) */
9. …
   do nothing
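The simulated replacement and the saving $C_1$ can be sketched in Python. The sketch works on absolute quantities rather than on the ratios $\mu$ used in the listing, and the function names and tuple layout are hypothetical; it is meant to show the idea, not to reproduce the lost steps of the algorithm.

```python
def simulated_saving(orders, q_temp, new_cost):
    """Replace dearer orders from the highest cost down, up to q_temp units.

    orders: (quantity, effective_unit_cost) pairs for g_j in the current N_t
    Returns (replaced_quantity, saving).
    """
    replaced, saving = 0, 0.0
    for qty, cost in sorted(orders, key=lambda o: o[1], reverse=True):
        if cost <= new_cost or replaced >= q_temp:
            break
        moved = min(qty, q_temp - replaced)
        replaced += moved
        saving += moved * (cost - new_cost)
    return replaced, saving


def coordination_request(orders, q_temp, new_cost, locked_quantity):
    """Return the saving C1 to put in a request, or None to give up."""
    replaced, saving = simulated_saving(orders, q_temp, new_cost)
    # a_i must not end up with a smaller order than it holds at I_t2
    if saving > 0 and replaced >= locked_quantity:
        return saving
    return None


current = [(4, 5.0), (3, 4.0), (3, 2.0)]
print(simulated_saving(current, 5, 3.0))
print(coordination_request(current, 5, 3.0, locked_quantity=4))
print(coordination_request(current, 5, 3.0, locked_quantity=6))
```

In the example, 4 units at cost 5.0 and 1 unit at cost 4.0 are replaced by the contractor at cost 3.0, which saves 9.0; the request is dropped when the contractor already holds a larger order elsewhere.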

2) JUDGE STAGE
When $I_{t_2}$ receives the request $(I_t, I_{t_2}, a_i, g_j, Q_{temp}, C_1)$, it judges whether and how to accept the coordination according to its own situation. The judgment is based on:
• whether $N_{t_2}$ has started to supply;
• whether $I_{t_2}$ can find other nodes with $g_j$ supply capacity to cover the order quantity $q(a_i, g_j, t_2)$ in place of $a_i$;
• whether the cost loss $C_2$ that $I_{t_2}$ incurs by replacing $a_i$ is less than the cost $C_1$ saved by $I_t$.

…
13. Terminate this Algorithm
14. end for
…
agree($I_t$, $I_{t_2}$, $g_j$, $a_i$, Contribution)

If $N_{t_2}$ has started to supply, $I_{t_2}$ directly rejects the coordination request of $I_t$ (steps 3-4); otherwise, it tries to find nodes that can replace $a_i$ in supplying $g_j$ (steps 5-17). The PreDecide method, detailed below, is used to cover the $g_j$ requirement $q(a_i, g_j, t_2)$ (step 10) until all requirements are met (steps 15-16). If $\lambda_{temp}(t, g_j) > 0$, the order of $a_i$ cannot be replaced successfully and the coordination request is rejected (steps 18-19). Otherwise, the cost $C_2$ lost by replacing $a_i$ is calculated (step 21). When $C_2 \ge C_1$, the coordination request is rejected. When $C_2 < C_1$, the coordination request is accepted and a contribution in $[C_2, C_1)$ is requested (steps 22-25).
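The decision logic of the judge stage reduces to three checks, which the following Python sketch makes explicit. The function name `judge`, its arguments, and the returned tuple are hypothetical; the search for replacement nodes (PreDecide) is assumed to have run already and is summarized by the quantity it failed to cover.

```python
def judge(supply_started, uncovered_quantity, c1, c2):
    """Decide on a coordination request from the point of view of I_t2.

    supply_started:     True if N_t2 has already started to supply
    uncovered_quantity: part of q(a_i, g_j, t_2) that no replacement node could cover
    c1:                 cost that I_t would save by hiring a_i
    c2:                 extra cost that I_t2 would pay after replacing a_i
    Returns ("agree", (low, high)) with the contribution range [c2, c1),
    or ("refuse", None).
    """
    if supply_started:
        return ("refuse", None)
    if uncovered_quantity > 0:
        return ("refuse", None)  # a_i cannot be fully replaced
    if c2 >= c1:
        return ("refuse", None)  # the loss outweighs the saving
    return ("agree", (c2, c1))


print(judge(False, 0, 9.0, 4.0))
print(judge(False, 0, 9.0, 9.0))
print(judge(True, 0, 9.0, 4.0))
```

Any contribution in the returned range leaves both requesters better off: $I_{t_2}$ is compensated for at least its loss, and $I_t$ pays less than it saves.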
The PreDecide method in step 10 is similar to the decide algorithm in the "supply network construction algorithm": the demander sends out an offer, and the node that receives the offer responds. However, it differs in the following ways:
• It is called "PreDecide" because it only looks for alternative nodes and does not include the actual hiring step, that is, the confirm stage.
• Since a contribution can be requested from $I_t$ to cover losses, PreDecide ignores cost and does not apply the threshold $th$.
• Since "locked capacity" and "forbidden capacity" cannot meet the coordination needs immediately, the response stage of PreDecide does not need to distinguish between these two types of capacity; only "free capacity" can be used for replacement.

3) CONFIRM STAGE
At this stage, $I_t$ receives the judge result from $I_{t_2}$ for its coordination request and acts according to the result.
• If the result is agree($I_t$, $I_{t_2}$, $g_j$, $a_i$, Contribution): $I_{t_2}$ cancels all $g_j$ orders of $a_i$, allocates the canceled orders to the nodes that responded positively in $I_{t_2}$'s PreDecide algorithm, and thereby completes the change of supply network $N_{t_2}$. $I_t$ obtains the right to employ $a_i$, replaces the high-cost $g_j$ orders of $N_t$ in algorithm 8, hires $a_i$ to provide $g_j$ (with supply quantity $Q_{temp}$), and pays $I_{t_2}$ the contribution, ConMoney, as compensation for the coordination.
• If $I_t$ receives a refusal from $I_{t_2}$: the coordination is abandoned.

V. VERIFICATION AND CONCLUSION
The performance difference between the proposed algorithm and the traditional CN model is verified by a series of experiments with task cost as a measurement index.

A. DATA SET
To observe the performance of the algorithm under different conditions, specific parameter combinations are used instead of an actual data set. Three groups of experiments were set up, and each group was executed 100 times. The parameter distribution is shown in Table 2. Value1 to Value3 simulate the normal scenario, the high-threshold scenario, and the high-demand scenario, respectively.
The parameters of each independent experiment were drawn from a normal distribution. The values in Table 2 are the expectations, and the standard deviation is 1.

B. COMPARISON MODEL AND EVALUATION INDEX
• The algorithm in this paper.
• Traditional CN method: during supply network expansion, $I_t$ employs $a_i$ to supply $g_j$ only if the $g_j$ demand of $N_t$ has not yet been met and the $g_j$ cost of $a_i$ is lower than the threshold $th$. Compared with this method, our model can continually update the supply network to select the lowest-cost suppliers and can coordinate requesters to optimize the supply network.

C. EVALUATION CRITERIA
The cost of task $t$ is taken as the measurement index of the algorithm.

D. EXPERIMENTAL RESULT
The experimental results are shown in Figures 3-5. They show that the proposed method performs better than the traditional CN model. Although the performance difference between the two algorithms is not stable, the average cost in the three groups of experiments with different parameter settings was reduced by 5.60%, 8.26% and 2.57%, respectively, compared with the CN model. The following conclusions can be drawn:
• With the increase of the threshold value, the compensation cost of both models increases, but the advantage of the algorithm in this paper becomes more significant.
• With the increase of task demand, the salary cost of both models grows, but the cost difference between the two models remains unchanged in absolute value, so the performance advantage of the algorithm is no longer obvious.
• The algorithm proposed in this paper performs better in scenarios with a high measurement threshold and low demand for a single kind of product.
SHAO-JIE ZHANG was born in 1998. He received the bachelor's degree from the School of Software, Shandong University, in 2020. He is currently pursuing the master's degree in artificial intelligence with The University of Manchester. He has participated in four national, provincial, and ministerial level projects. His research interests include multiagent systems, machine learning, and reinforcement learning. He is a member of CCF.
XU-DONG LU received the master's degree from the Department of Computer Science and Technology, Shandong University, in 2001, and the Ph.D. degree in engineering from Shandong University, in 2010. He is currently a Lecturer with the School of Software, Shandong University. He has participated in more than ten national, provincial, and ministerial level projects. He has published more than ten articles in important journals or conferences. He holds more than ten patents for invention in application field. His main research interests include big data technology and intelligent data analysis. He received one First Class Prize for the Ministry of Education Teaching Achievement Award and one Second Class Prize for the Teaching Achievement Award of Shandong Province.
SHI-PENG WANG was born in 1995. He received the bachelor's degree from the School of Computer Science and Technology, Shandong University, in 2017, where he is currently pursuing the Ph.D. degree. His research interests include reinforcement learning, multiagent systems, and mental model.
WEI GUO was born in 1978. He received the master's degree from the Department of Computer Science and Technology, Shandong University, in 2005, and the Ph.D. degree in engineering from Shandong University, in 2015. He is currently an Engineer and a master's supervisor. He has participated in more than ten national, provincial, and ministerial level projects. He has published more than 20 articles in important journals or conferences. He has published an academic monograph. He holds more than 10 patents for invention in application field, of which seven are first invention applicants. His current research interests include big data technology and intelligent data analysis. He is with the Committee of CCF TCSC and CCF TCCC. He was a recipient of several prizes for the Scientific and Technology Progress (STP), including one Second Class Prize for the STP of State Education Ministry of China, one First Class Prize, and two Second Class Prizes for the STP of Shandong Province. He was also a recipient of one First Class Prize for the Ministry of Education Teaching Achievement Award and one Second Class Prize for the Teaching Achievement Award of Shandong Province.