A Binary Particle Swarm Optimizer With Priority Planning and Hierarchical Learning for Networked Epidemic Control

The control of epidemics taking place in complex networks has been an increasingly active topic in public health management. In this article, we propose an efficient networked epidemic control system, where a modified susceptible-exposed-infected-vigilant (SEIV) model is first built to simulate epidemic spreading. Then, different from existing continuous resource models, which abstractly map resources to parameters of epidemic models, a concrete resource description model is built to simulate real-world goods/services and their allocation. Based on the two models, a cost-constrained subset selection problem in epidemic control is identified. To solve the problem, a swarm-based stochastic optimization policy is proposed, where each particle in the swarm can determine its own solutions according to the guidance of its superior peers and the historical searching experience of the whole swarm, without extra problem-specific information. A theoretical proof of the system equilibrium is provided, which is consistent with experimental observations. The competitive performance of the proposed optimizer is validated by theoretical analysis and comparison experiments. Finally, an application case is provided to illustrate its practicability.


I. INTRODUCTION
INFECTIOUS diseases, having brought great economic loss [1] and public health burden in past centuries, are still spreading around the world. Mathematical simulations [2], [3] and networked control systems [4] contribute to the prediction, mitigation, and eradication of epidemic spreading. In the past decades, numerous studies on different epidemics suggested that resources play a significant role in epidemic control [5]-[8]. The allocation of these resources has become a fundamental problem in computational epidemiology.
Epidemic models are the basis of resource allocation problems, as they provide the context of epidemic spreading. Recently, nonlinear epidemic models based on mean-field theory have become well-known techniques [9]-[11], which can be classified into four categories: 1) susceptible-infectious (SI) models [12]-[14]; 2) SI-susceptible (SIS) models [15], [16]; 3) SI-recovered (SIR) models [17]-[20]; and 4) SIR-susceptible (SIRS) models [21]-[23]. Among them, SI models can be used to simulate unrecoverable epidemics such as HIV/AIDS [14]. They are also popular in simulating the spread of information [12], [13] due to their simplicity and low computational complexity. SIS models facilitate the simulation of recurrent epidemics, such as influenza [17]. SIR models specialize in vaccine-preventable epidemics that confer lifelong or transient immunity on recovered individuals, such as measles and smallpox. SIRS models have attracted much attention in recent years for their good simulation of unrecoverable, recurrent, and vaccine-preventable epidemics [22]. One representative variant of SIRS models is the susceptible-exposed-infected-vigilant (SEIV) model [5], [24], [25], which can be reduced to the SI, SIS, SIR, and SIRS models by adjusting model parameters [22], [23]. However, existing SEIV models are mostly designed for simple or small-scale networks; in large-scale complex networks, their robustness needs to be improved.
Based on different epidemic models, there are two complementary research branches for epidemic control.
1) Single resource allocation with greedy heuristics, such as heuristic topology adaptation strategies [26]-[28], zero-determinant vaccination strategies [29], and decentralized protection strategies [29]. They are effective in optimizing one specific category of resources, but it is difficult to combine all the strategies, or to effectively allocate multiple resources with one single strategy. 2) Combinatorial resource allocation with semiconvex/convex optimization [5], [26]-[28], [30]. These methods perform well in scheduling continuous resources for epidemic control, and are the basis of this article. However, they tend to define resources in an abstract way, i.e., only the resource concepts are defined, while the entities are absent. Resource costs are described as nonlinear functions of model parameters, and the resource allocation problems amount to optimizing the parameters of epidemic models. Though this second category of studies facilitates theoretical analysis [6], it does not facilitate engineering application. In the real world, resources are concrete goods or services, such as antibiotics, vaccines, hospital beds, nurses and clinicians, and isolation wards, which are discretely distributed and priced on a per-unit basis. The gap between existing resource models and realistic resources weakens the effect of computational epidemic control, and becomes one focus of this article. Another focus of this article is how to effectively allocate the discrete resources. Since the practical budget is usually limited, discrete resource allocation optimization becomes a typical subset selection problem, which is nondeterministic polynomial hard (NP-hard) [31]. In solving NP-hard problems, approximately optimal policies are promising [26], [32]-[33]. Among them, particle swarm optimizers (PSOs) have proven to be an efficient representative [34]-[36]. So far, however, there have been few attempts to optimize discrete resource allocation in epidemic control using PSOs, because most PSOs with excellent performance are designed for continuous optimization and lose efficiency and effectiveness on discrete optimization, even with the help of certain discretization methods [34].
This article proposes a concrete resource description model to simulate concrete/abstract resources in the real world, which helps identify a subset selection problem in epidemic control. A novel binary PSO is proposed to solve the problem. The major contributions are as follows.
1) The proposed resource description model provides a concrete description of real-world resources by considering their discrete and commodity properties. Five categories of resources are summarized from existing studies and empirical observations, with their allocation formulated by a mathematical matrix. Other resource functions are inspired by cost-utility analysis [37] and cost-effectiveness analysis [38]. Concretely, cost functions relate the unit price and the total number of resources. Utility functions describe the efficacy of a single resource allocation. A global benefit function is defined to evaluate the benefit of the total resource allocation, acting as the target/fitness function in PSOs. With the help of this model, existing continuous resource models can be generalized to their discrete versions. An application case illustrates how to apply the model in practice.
2) The proposed swarm optimizer with priority planning and hierarchical learning (PHSO) can produce high-quality solutions for resource allocation optimization.
In PHSO, after dividing particles into multiple groups, particles in inferior groups can learn from those in superior groups. Then, some high-priority resources can be selected through three hierarchical learning steps, becoming candidate resources for new solutions. Theoretical analysis of PHSO demonstrates its good exploration and exploitation abilities. Finally, we compare PHSO with some state-of-the-art optimizers, and the results show its competitive performance. The model equilibrium state after resource allocation verifies the effectiveness of PHSO in epidemic control. The rest of this article is organized as follows. Section II demonstrates a modified SEIV model and its stability analysis. Section III elaborates on the new resource description model and the problem formulation. The proposed PHSO is shown in Section IV. Section V provides experimental results and analysis. This article is concluded in Section VI.
In the first group [24], [25], [39], [41], state E corresponds to individuals who have been exposed to the epidemic but are not yet infected. State V corresponds to individuals who have been vaccinated and will not be infected again. The infection rates are formulated by bilinear functions such as βSI [41], or by nonlinear formulations, such as βSI(1 + αI^(k−1)) [24], βSI(1 + αI) [25], and βSI/ϕ [39], where β is the probability of being infected when a susceptible individual contacts an infectious neighbor, and α is a positive parameter used to adjust the infection rate. Network topology is usually not considered in the above SEIV models. In the second group [5], [40], state E refers to individuals who have been infected but are still unaware of the infection. State V corresponds to individuals who are vigilant to the infection (including vaccinated, treated, or otherwise well-protected situations). The infection rates consider neighborhood topologies in the networks, such as in [5], where N_i^in represents the adjacency list of node i, and β_i^E and β_i^I, respectively, are the infection rates referring to states E and I. The second group can better characterize some network-based infection processes by considering network topology. Noteworthily, in the relevant literature, the boundary between the above two groups is indistinct, especially regarding state E and the nonlinear infection process. Besides, some improvements on classical SIS models can be used to improve SEIV models. For instance, Trpevski et al. [42] formulated a multiplicative infection rate and considered the state transition order, both of which are logically reasonable. Granell et al. [43] constructed a virtual layer to independently describe awareness diffusion.
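As a quick numeric illustration of the difference between the bilinear and nonlinear forms above, the nonlinear rate βSI(1 + αI) simply amplifies the bilinear rate βSI by a prevalence-dependent factor (the values of β, α, S, and I below are hypothetical, chosen only for illustration):

```python
# Hypothetical parameter values, for illustration only.
beta, alpha = 0.3, 0.5   # infection probability and adjustment parameter
S, I = 0.7, 0.2          # susceptible and infected population densities

bilinear = beta * S * I                       # beta * S * I
nonlinear = beta * S * I * (1 + alpha * I)    # beta * S * I * (1 + alpha*I)
print(bilinear, nonlinear)
```

The nonlinear rate exceeds the bilinear one by the factor (1 + αI), growing with prevalence.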

B. Modified SEIV Model
Existing works present a full picture of epidemic spread simulation, motivating a modified SEIV model. We consider several superior settings, such as a multiplicative infection rate, state transition priorities, and awareness influence, to make the SEIV model more robust.
Given an unweighted and undirected network containing N nodes, let ν and ε denote the node and edge sets, respectively, so that the network can be represented by G = (ν, ε). Each node in the network may be in one of four states: 1) the susceptible state (S); 2) the exposed state (E), in which individuals are asymptomatically infected and unaware of the infection; 3) the infected state (I), with symptomatic infection and awareness of the infection; and 4) the vigilant state (V), in which individuals are not immediately susceptible to the epidemic.
The state transition process of the SEIV model is shown in Fig. 1. Each individual typically progresses along one of three paths: S→E→I→V(→S), S→E→V(→S), or S→V(→S), with the details as below.
1) S↔V: A susceptible individual may acquire certain immunity to resist the epidemic, thus entering the vigilant state, with an immune rate representing the rate of acquiring immunity. Conversely, the vigilant individual may lose its immunity with a constant immunity-loss rate γ_i. 2) S→E: A susceptible individual may be infected by its neighbor j in state E or state I with the corresponding probability. The probabilities of the ith node being in each of the four states can be represented by p_i^S(t), p_i^E(t), p_i^I(t), and p_i^V(t), respectively. With the state transition probabilities, the process in Fig. 1 can be formulated by the differential equations in (2), where the prevalence rate u_i (0 ≤ u_i ≤ 1) denotes the total infection possibility of the ith node by its neighboring nodes.
If each node j in state E or I attempts to infect node i, and each attempt may succeed with probability β_i^E or β_i^I, we can formulate the comprehensive infection rate u_i, which is a multifactorial function of β_i^E and β_i^I. When the prevalence rate of an individual exceeds a specific probability, the individual will raise more awareness of self-protection, and its potential infection risk will decrease due to this awareness. With this observation, we assume that when u_i > 0.5, the potential infection rate β_i^E will be adjusted downward. Besides, as formulated in (2), the susceptible state transitions into the exposed state only when it does not transition into the vigilant state. Similarly, the exposed state may transition into the vigilant state only when it does not transition into the infected state. Then, simplifying (2) with p_i^S(t) = 1 − p_i^E(t) − p_i^I(t) − p_i^V(t) yields system (4).
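A minimal sketch of such a multiplicative prevalence rate, assuming each infection attempt by an E/I neighbor succeeds independently (the exact closed form used in the article's equation (3) may differ; the network and probabilities below are hypothetical):

```python
import numpy as np

def prevalence_rate(i, A, pE, pI, betaE, betaI):
    """u_i: probability that node i is infected by at least one
    neighbor in state E or I, assuming independent attempts."""
    escape = 1.0  # probability that every infection attempt fails
    for j in np.flatnonzero(A[i]):
        escape *= (1.0 - betaE[i] * pE[j]) * (1.0 - betaI[i] * pI[j])
    # Awareness adjustment from the text would lower beta_i^E once
    # u_i > 0.5; it is omitted here for brevity.
    return 1.0 - escape

# Tiny 3-node path graph: node 1 is connected to nodes 0 and 2.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
pE = np.array([0.2, 0.0, 0.0])
pI = np.array([0.1, 0.0, 0.5])
betaE = np.full(3, 0.3)
betaI = np.full(3, 0.4)
u1 = prevalence_rate(1, A, pE, pI, betaE, betaI)
print(u1)
```

Note that u_i stays within [0, 1] by construction, matching the constraint on the prevalence rate.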

C. Global Stability Analysis
The disease-free equilibrium of system (4) is obtained by setting the infection probabilities to zero. Next, let us consider the nontrivial solutions of the global equilibrium. From (4), two of the state probabilities can be expressed in terms of p_i^I; theoretically, substituting the two expressions into dp_i^I/dt = 0 will produce the final solutions. But in fact, it is hard to obtain exact solutions due to the nonlinear infection rate u_i(p_j^E, p_j^I, β_i^E, β_i^I, A). A feasible method is to calculate approximate solutions by finding a linear boundary for the nonlinear system, which results in the following two theorems.
Theorem 1: Assume a degree-uncorrelated network where the degrees of nodes are independent of each other, and consider the mean-field premise where the state probability of one node can be represented by the population density of each state. Let subscript k denote the node connectivity degree k, let ⟨k⟩ be the average degree of the network, and define the threshold condition accordingly. Proof of Theorem 1: Let ρ_k(t) represent the density of the infected nodes p_k^I, and substitute p_k^E and p_k^V of (5) into (3), which yields an evolution equation for ρ_k(t), where a_{kk'} denotes the connection probability between nodes with degrees k and k'. In a degree-uncorrelated network [44], this probability depends only on the degrees. According to the above two formulas, a bounding inequality is generated. Now, let us formulate the right-hand side of (4) as a new linear system. Denote the aggregate infection rate of the node by all its neighboring nodes; the new system can then be written in linear form. Through (4), we can deduce that dρ_k(t)/dt = 0 is the linear boundary of the nonlinear system.
In a degree-uncorrelated network, the two expressions of ρ_k(t) coincide; denoting them by x, an equation of the form xg(x) = 0 is obtained. Obviously, x = 0 is a trivial solution of xg(x) = 0. Since g′(x) > 0, the equation g(x) = 0 has a unique nontrivial solution if and only if g(0) < 0, namely the condition stated in Theorem 1. This completes the proof. Theorem 1 gives a theoretical equilibrium condition for system (2), but it is not precise enough when the system is built on a network with a definite topology. We need to find a more elaborate equilibrium condition to ensure the system convergence.
Now, let us reformulate (4) in the matrix form Ṗ = LP + H, with LP denoting the linear part and H representing the nonlinear auxiliary term, where L is the linear-transformation probability matrix of the states and H = [H_E, 0, 0]^T is the nonlinear-transformation probability matrix. The model parameters are organized as vectors. Since H_E < 0, Ṗ = LP is a linear upper bound of system (6).
Referring to [45, Ths. 4.5, 4.7, 4.11, and 4.15], the maximum real part of the eigenvalues of L represents the exponential decay rate of the epidemic. Only when the leading eigenvalue is no more than 0 will system (6) converge to its disease-free equilibrium point. In matrix L, there are N eigenvalues equal to the eigenvalues of L_33, given by θ_i + γ_i. The other 2N eigenvalues are equal to those of the remaining 2N × 2N block. Hence, the following theorem is established.
Theorem 2: Assume a definite network whose adjacency matrix is A = [a_ij], and the edge-based infection premise. Subscripts i and j represent the indexes of any two nodes in the network. Define R = λ*(L′) + 1, where λ*(L′) is the largest real part of the eigenvalues of L′. If R ≤ 1, the disease-free equilibrium is locally stable. Furthermore, if R ≤ 1 − k for some k > 0, the disease-free equilibrium is globally exponentially stable with the exponential decaying rate upper bounded by k.
The proof of Theorem 2 is omitted, for it is similar to the proof of [5, Proposition III.1].
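The stability test of Theorem 2 reduces to a single eigenvalue computation; a minimal sketch, with a hypothetical 2 × 2 matrix standing in for the (much larger) transition matrix L′:

```python
import numpy as np

def reproduction_like_number(L_prime):
    """R = lambda*(L') + 1, where lambda*(L') is the largest real part
    of the eigenvalues of L'. By Theorem 2, R <= 1 implies local
    stability of the disease-free equilibrium."""
    eigvals = np.linalg.eigvals(np.asarray(L_prime, dtype=float))
    return float(eigvals.real.max()) + 1.0

# Hypothetical 2x2 stand-in whose eigenvalues (-0.3 and -0.6) both
# have negative real parts, i.e., a decaying epidemic.
L_prime = np.array([[-0.4, 0.1],
                    [0.2, -0.5]])
R = reproduction_like_number(L_prime)
print(R, R <= 1.0)
```

Here R = 0.7 ≤ 1, so the sketched system satisfies the stability condition of Theorem 2.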
To sum up, the modified SEIV model is characterized by two healthy states (S and V) and two infectious states (E and I). A multiplicative prevalence rate, state transition priorities, and awareness impacts are considered in the model. With the help of mean-field theory and eigenvalue analysis, we analyze the global equilibrium condition.

III. RESOURCE DESCRIPTION MODEL AND PROBLEM FORMULATION
Existing studies have paved the way for the allocation of continuous/abstract resources [5], [6], yet how to deal with real-world discrete/concrete resources and their allocation remains a fresh topic. In this section, a concrete resource description model is built.
To build the description model, we investigate related studies [5]-[8] and summarize four typical resource categories: 1) the vaccinating resource R1 (e.g., vaccines); 2) the protective resource R2 (e.g., medical protective masks and malaria insecticide); 3) the detective resource R3 (e.g., diagnostic kits and diagnostic instruments); and 4) the curative resource R4 (e.g., hospital wards and equipment, doctors, and medicines). As a special case, we define the detective-and-curative resource R5 to transition an individual from state E to state V, whose effect equals the combined effect of resources R3 and R4. These resources are selected because they are commonly seen in the real world. Virtual resources, such as epidemic-prevention information broadcast by public media, are not easy to measure and evaluate, and are therefore not discussed in this article. The allocation of resources is formulated by a binary 5 × N matrix R, whose elements indicate whether a resource of a given category is allocated to a given node. The allocation of resources in this section is closely connected to the SEIV model proposed in Section II. Concretely, R1 helps transition an individual from state S to V; R2 lowers the infection rate of an individual in state S; R3 promotes the state transition E→I; R4 and R5 accelerate the transitions I→V and E→V, respectively. Moreover, as the state transition priority determines the order of one-to-many state transitions, the allocation of resources needs to be preliminarily optimized to avoid waste. Corresponding to the state transition priority, we assume that if R1 and R2 are allocated together to one node, only R1 is allocated. Similarly, if R3 and R5 are both allocated, R3 is retained.
How to evaluate resource cost and effect is another focus. Cost-utility analysis (CUA) [37] and cost-effectiveness analysis (CEA) [38] in health economics are two inspiring methods. CEA compares the relative cost of the chosen resources with their effectiveness (outcome) in the gain of health. CUA estimates the ratio between the cost of a resource and the benefit it produces in terms of specific health objectives. CUA can be considered a special case of CEA when the two objectives coincide. Drawing on the ideas of CUA and CEA, we define the following functions.
1) Cost Functions: There is a premise assumption: resources can only be allocated to individuals in definite states; for individuals in other states, the allocation will be invalid. The cost functions are defined per resource category. 2) Utility Functions: Resources help individuals to prevent the epidemic. Such help is reflected by changes of model parameters [5], [6]. Concretely, the vaccinating resource R1 improves individual immunity (increasing θ_i). R2 protects susceptible individuals from infection (reducing β_i^E and β_i^I). R3 is used to detect the exposed state (increasing ξ_i). R4 heals infectious individuals who are symptomatically infected (increasing δ_i^I). The utility of resource R5 is a combination of the utilities of R3 and R4, healing the exposed individuals (increasing δ_i^E). The values of these parameters do not vary within a continuous range, but switch between two values. 3) Effectiveness Functions: Different from the utility of a single resource, the effectiveness of resources measures the overall effect of all allocated resources. For instance, a vaccine transforms an individual from the susceptible state to the vigilant state, while the effectiveness of the vaccines describes the global influence, such as the degree to which the epidemic is suppressed. The epidemic decaying rate (or increasing rate) mentioned in Theorem 2 describes the effectiveness of all allocated resources, expressed in terms of a positive constant c and λ(L′) ≥ 0 representing the speed of epidemic development, with + characterizing the increasing rate and − the decaying rate. Based on Theorem 1 and the concrete resource description model, a resource allocation problem with a cost constraint and a nonlinear optimization objective can be formulated, which is a typical subset selection problem.
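Assuming the simple linear per-unit pricing adopted later in the experiments, the total cost of a binary allocation matrix can be sketched as follows (the unit prices and the 5 × 3 allocation below are hypothetical):

```python
import numpy as np

def total_cost(R, unit_price):
    """Total cost of a binary 5-by-N allocation matrix R, assuming
    linear per-unit pricing: unit price of each resource category
    times the number of allocated units of that category."""
    R = np.asarray(R)
    return float(np.asarray(unit_price) @ R.sum(axis=1))

R = np.array([[1, 0, 1],   # R1 (vaccinating) for nodes 0 and 2
              [0, 1, 0],   # R2 (protective) for node 1
              [0, 0, 0],   # R3 (detective): none
              [0, 0, 1],   # R4 (curative) for node 2
              [0, 0, 0]])  # R5 (detective-and-curative): none
price = [5.0, 1.0, 2.0, 8.0, 9.0]   # hypothetical unit prices
cost = total_cost(R, price)
print(cost)
```

The budget constraint of the allocation problem is then simply total_cost(R, price) ≤ C.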
Concrete Resource Allocation Problem (CRAP): Under a fixed budget C, the objective is to find the optimal allocation that results in the fastest epidemic decaying rate. Mathematically, the problem is to minimize λ(L′), the leading eigenvalue of L′, subject to the cost constraint. The concrete resources and the concrete resource description model are embodied in the formulations of matrix R, Cost(R), and Effectiveness(R). The transformation matrix L′ is derived from the modified SEIV model.
As a subset selection optimization, CRAP is NP-hard, and it is difficult to solve in polynomial time. Traditional deterministic algorithms also lose efficacy in this situation. Therefore, we propose a swarm-based stochastic optimizer to alleviate the above concerns.

IV. SWARM OPTIMIZER
In order to decouple the algorithm from the problem, we define a general framework for such problems. Let f(R) represent the fitness function, and h(R) ≤ C represent the constraint condition. A smaller fitness value corresponds to better solution quality. Then, the problem CRAP in Section III can be restated as a minimization problem with 5 × N dimensionality: min f(R) s.t. h(R) ≤ C.

A. Existing Particle Swarm Optimizers
As a popular evolutionary computation technique, PSO was first proposed by Eberhart and Kennedy in 1995 [46]. It was inspired by observations of the collective behavior of bird flocking, fish schooling, and other biological phenomena. The basic principles of PSO are as follows. Given a D-dimensional problem and a swarm with NP particles, the position and velocity of the kth particle can be represented by X_k = [x_k^d] and V_k = [v_k^d], where d = 1, 2, ..., D and k = 1, 2, ..., NP. The updating rules of PSO are v_k^d = w·v_k^d + c_1·r_1·(pbest_k^d − x_k^d) + c_2·r_2·(gbest^d − x_k^d) and x_k^d = x_k^d + v_k^d, where Pbest_k = [pbest_k^d] is the historical best solution of the kth particle (self-cognition), Gbest = [gbest^d] is the global best solution of the whole swarm (social-cognition), w is the inertia coefficient that decides how much historical experience is retained in this generation, c_1 and c_2 are two acceleration coefficients, and r_1 and r_2 are two random parameters uniformly distributed within [0, 1]. The original PSO is efficient and simple, but it easily becomes trapped in local optima when dealing with complicated problems [35].
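The canonical update rules above can be sketched as follows (a textbook implementation of the 1995 PSO, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_step(X, V, Pbest, Gbest, w=0.7, c1=1.5, c2=1.5):
    """One velocity/position update of the original continuous PSO.
    X, V, Pbest: (NP, D) arrays; Gbest: (D,) array."""
    NP, D = X.shape
    r1 = rng.random((NP, D))   # fresh uniform randoms per dimension
    r2 = rng.random((NP, D))
    V = w * V + c1 * r1 * (Pbest - X) + c2 * r2 * (Gbest - X)
    X = X + V
    return X, V

# Sanity check: if the swarm already sits on Pbest = Gbest with zero
# velocity, the update leaves it unchanged.
X = np.zeros((4, 3)); V = np.zeros((4, 3))
Pbest = np.zeros((4, 3)); Gbest = np.zeros(3)
X2, V2 = pso_step(X, V, Pbest, Gbest)
print(np.allclose(X2, 0.0), np.allclose(V2, 0.0))
```

In practice the loop alternates pso_step with fitness evaluation and Pbest/Gbest bookkeeping.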
Some variants of PSO were developed to better deal with complicated continuous optimization problems. A comprehensive learning PSO (CLPSO) [36] was proposed by Liang et al. in 2006, which used the other particles' historical best positions to update the velocity of a particle. To solve large-scale optimization problems, a competitive swarm optimizer (CSO) [47] was developed by Cheng and Jin in 2015, which compared pairs of particles and let the loser update its position by learning from the position of the winner. Inspired by CSO, Yang et al. further proposed a level-based learning swarm optimizer (LLSO) [35], in which the learning objects were not a single particle, but two superior particles selected from the two higher levels. However, the original PSO and its variants were predominantly used to solve problems in continuous space, and their performance in solving discrete optimization problems still needs to be improved [36].
To solve problems in discrete space, Kennedy and Eberhart [48] developed the binary version of PSO (BPSO, also named BPSO-S1 in this article) in 1997. In BPSO, each element in the position of a particle is randomly assigned 0 or 1. The velocity is updated as in the continuous PSO, while each position bit is set to 1 when rand(0, 1) < sigmoid(v), and to 0 otherwise, where sigmoid(v) = 1/(1 + e^(−v)) and rand(0, 1) generates a random number within [0, 1]. Later, other S-shaped functions and the V-shaped function family were used to replace the sigmoid function in BPSO, resulting in BPSO-S2, BPSO-S3 and BPSO-V1, BPSO-V2, BPSO-V3; their specific formulations can be found in [49]. Clerc designed several further algorithms (see the source codes at http://clerc.maurice.free.fr/pso/binary_pso/, 2010) to enrich the BPSO family. After preliminary testing, two strategies show promising performance in solving CRAP: the modified Kennedy-Eberhart BPSO (termed BPSO-S4), and the algorithm with a majority-voting strategy (termed MV-BPSO).
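The sigmoid-based position rule of BPSO-S1 can be sketched as follows (a minimal illustration, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def bpso_position_update(V):
    """BPSO-S1 position rule: each bit is set to 1 with probability
    sigmoid(v_d), independently of its previous value."""
    return (rng.random(V.shape) < sigmoid(V)).astype(int)

# Strongly negative velocities map to 0 almost surely, strongly
# positive ones to 1; mid-range velocities stay stochastic.
V = np.array([-50.0, 0.0, 50.0])
x = bpso_position_update(V)
print(x)
```

Because the new bit ignores the old position, heuristic information is easily lost, which is the weakness the hierarchical learning of PHSO targets.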

Algorithm 1 Framework of PHSO
For each particle X_(g,r) in the swarm, update NEW_X_(g,r) by (9), (10), and (11). The framework of PHSO is shown in Algorithm 1. In the algorithm, the velocity updating formula in the first step is learned from LLSO [35], which was originally designed for continuous optimization. Here, we modify and adapt it to discrete optimization, with the following principles. First sort the particles in ascending order of fitness, then divide the sorted particles into NG groups, with the better particles placed in the lower-ranked groups. Let NP_g be the number of particles in the gth group, which equals floor(NP/NG) for the top (NG − 1) groups and NP − (NG − 1)·floor(NP/NG) for the last group. Let X_(g,r) = [x_(g,r)^d] and V_(g,r) = [v_(g,r)^d] (g = 1, 2, ..., NG; r = 1, 2, ..., NP_g; d = 1, 2, ..., D) represent the position and velocity of the rth particle in the gth group. Similar to the original BPSO, to update V_(g,r), we need two exemplars to learn from. The exemplars are located in two superior groups g′ and g″ (1 ≤ g″ ≤ g′ ≤ g − 1), and are the r′th and r″th particles, respectively. In other words, exemplars for the second group are selected from the first group, and exemplars for the third and higher-numbered groups are selected randomly from two lower-numbered groups. Finally, V_(g,r) is updated by (8), where w represents the extent of the impact of past velocities on current velocities, c is a constant, and r_1 and r_2 are two random variables ranging within [0, 1]. Note that the position X_(g,r) is a solution of the problem. The final global best position of the swarm (the final Gbest) is the optimal solution found by PHSO. Elements in a position denote whether a resource exists; therefore, adding a resource means the corresponding element is assigned 1. To update a position, we first initialize a zero vector NEW_X_(g,r) and then add resources one by one until the constraint condition is satisfied, resulting in the final X_(g,r). Theoretically, when all the resources are added into NEW_X_(g,r), the constraint condition will surely be satisfied. A hierarchical learning mechanism is designed to construct NEW_X_(g,r), including three steps.
1) H1-Learning: Construct NEW_X_(g,r) by using the resources with the first priority. These resources are collected from a crisp set of the velocity, Cut_V_(g,r) = [cut_v_(g,r)^d]. A threshold parameter τ ranging within [0, 1] adjusts the number of resources in Cut_V_(g,r). Cut_V_(g,r) is updated by (9), where S(·) is the sigmoid function, φ(·) is the binarization function, and "⊕" means adding a resource, namely changing the given element in the vector to 1.
2) H2-Learning: Construct NEW_X_(g,r) by using the resources with the second priority. Figuratively speaking, the resources with the second priority are voted for by Pbest_(g,r) and Gbest. More precisely, we first estimate the selection probability of each resource, and then put the resources into a vector Cut_X_(g,r) = [cut_x_(g,r)^d] according to these probabilities. Cut_X_(g,r) is constructed by (10). 3) H3-Learning: Construct NEW_X_(g,r) by adding random resources not already in NEW_X_(g,r). Let the vector Other_X_(g,r) = [other_x_(g,r)^d] be the opposite of NEW_X_(g,r). H3-learning aims at adding resources from Other_X_(g,r) to NEW_X_(g,r) as long as the constraint condition is not exceeded. A repairing function is shown in Algorithm 2. Since the resource allocation problem has strict constraints, it is necessary to generate eligible solutions. There are two commonly used methods: the constructive method, which gradually adds resources to an empty vector until the constraint would be violated, and the repairing method, which first generates random solutions and then removes resources until the constraint is satisfied. The former is time-consuming, and the latter is not very refined. We learn from their advantages and avoid their deficiencies, using "striding forward" to accelerate the constructive process and "walking back" to refine the repairing process. This repairing function is also used in Algorithm 1. To facilitate comparisons, it is simultaneously used in the competing swarm optimizers in the experiments section.
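The "striding forward / walking back" idea can be sketched as follows (a sketch of the idea behind Algorithm 2, not the authors' exact procedure; the cost function and budget are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

def repair(x, cost, C, stride=4):
    """Make the binary solution x feasible, i.e., cost(x) <= C.

    'Striding forward' adds several random absent resources at a time
    while the budget still holds; 'walking back' then removes
    resources one by one until the budget is respected again."""
    x = x.copy()
    while cost(x) <= C:                      # stride forward
        absent = np.flatnonzero(x == 0)
        if absent.size == 0:
            break
        pick = rng.choice(absent, size=min(stride, absent.size),
                          replace=False)
        x[pick] = 1
    while cost(x) > C:                       # walk back
        present = np.flatnonzero(x == 1)
        x[rng.choice(present)] = 0
    return x

# Twenty hypothetical unit-cost resources under budget C = 6.
price = np.ones(20)
x0 = np.zeros(20, dtype=int)
x = repair(x0, lambda v: float(price @ v), 6.0)
print(x.sum())
```

Striding adds resources in batches (fast but coarse), while walking back removes them one at a time (slow but precise), combining the strengths of the constructive and repairing methods described above.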

C. Theoretical Analysis
The theoretical foundation of PSOs lags far behind their practical success [50]. This is not a problem specific to PSOs, but a common challenge for most EAs. Existing studies [35], [51], [52] tended to analyze EAs from the perspective of computer science, viewing them as algorithmic techniques rather than random processes. Following these studies, we develop a comparative analysis and a time complexity analysis for PHSO.
1) Comparative Analysis: Comparative analysis usually focuses on the exploration and exploitation abilities of PSOs [35]. Good exploration increases solution diversity, and good exploitation enhances solution accuracy. Diversity and accuracy can be measured by the distance and the fitness difference between new solutions and the updated solutions, respectively. Since increasing solution accuracy usually comes at the cost of less diversity, it remains a challenge to balance exploration and exploitation.
For PHSO formulated by (8)-(11), let φ_1(·), φ_2(·), and φ_3(·) denote the key binarization functions of H1-learning, H2-learning, and H3-learning, respectively. a) Exploration ability: From (12) and (14), we find that in BPSO, only Pbest_k and Gbest can become the exemplars of a particle, i.e., the exemplar size of a particle is fixed at 2. In PHSO, on the other hand, any superior individual can be selected as the exemplar of a particle, i.e., the average number of superior individuals of a particle is about NP/2, which is significantly larger than 2. The larger the exemplar size, the higher the solution diversity. A higher solution diversity further extends the exploration space and helps the algorithm jump out of local optima.
b) Exploitation ability: In (13), the sigmoid function is used in BPSO to generate binary solutions. This indiscriminate way causes a great loss of heuristic information. As a result, the position updating process retains high uncertainty, and the convergence speed of the algorithm decreases correspondingly. In (15), to avoid blind searching, PHSO includes a threshold τ to filter out the first-priority particles with the most promising search directions, and a voting mechanism to filter out the second-priority particles voted for by Gbest and Pbest_k. Meanwhile, the carefully designed randomness in picking the above elite particles can avoid premature convergence.
c) Adaptability: PHSO can adapt to more situations by adjusting parameters. If τ = 0, PHSO degrades to a binary version of LLSO, whose solution diversity is increased but convergence speed is decreased. If τ = 1, the algorithm becomes a modified version of BPSO that learns from the local best and global best positions, whose convergence is accelerated early on but may stagnate. In contrast, BPSO has a fixed usage pattern, and its application scenarios are therefore limited.
2) Time Complexity Analysis: Velocity and position updating are the two major processes in PSOs. For both, BPSO and PHSO share the same time complexity O(NP × D), with NP representing the swarm size and D denoting the problem dimensionality. Next, let us discuss the differences from the perspective of vector operations. Regarding each vector updating process as a loop, we can estimate the algorithm execution time by counting loops. BPSO, formulated by (7), has time complexity O(2 × NP × D), owing to the updating of 2 × NP vectors: X_k and V_k with k = 1 to NP. PHSO, formulated by (8)-(11), updates several additional vectors per particle (e.g., Cut_V, Cut_X, Other_X, and NEW_X). Though PHSO takes more time than BPSO, the extra expenditure is still acceptable, for it remains within a linear range. Moreover, an iteration of BPSO or PHSO is timed in seconds, while the epidemic evolves over days/weeks. This indicates that it is the solution quality, rather than the run time, that deserves more attention.

A. Experimental Configurations
The experiments are performed on 1) four types of artificial networks: the regular network [denoted as RG(N, k)] [42], the Erdős–Rényi random network [termed ER(N, p)] [53], the Barabási–Albert scale-free network [termed BA(N, k)], and the Watts–Strogatz small-world network [marked WS(N, k, p)] [54], with N representing the number of nodes in the network, k denoting the average degree of the generated networks, and p representing the random probability in generating links; and 2) two types of practical networks built from two sets of human contact data: the Primary School Temporal Network Data in BMC Infectious Diseases 2014 (Ps-contact), sourced from http://www.sociopatterns.org/datasets/primary-school-temporal-network-data/, and the Face-to-Face Behavior Data of People during the exhibition Infectious in 2009 at the Science Gallery in Dublin (Ex-contact), sourced from http://konect.uni-koblenz.de/networks/sociopatterns-infectious.
The constraint C is set as C = ratio × C_max, where C_max = Cost(1_{5×N}) is the upper bound of the resource cost and refers only to Prob_i(t). To simplify the problem, we use some simple linear functions to formulate the cost functions. The boundary parameters in the SEIV model are initialized as follows. Here, ξ_i is a Gaussian random number and may be changed later under the influence of resource R_3, while γ_i is a constant, initialized by a Gaussian random number. Empirically, we use three sets of parameters in the SEIV model to simulate three types of epidemic. First, conventional infectious diseases with contact-based infection (CIDC), such as blood-borne diseases, usually spread among populations with specific contacts or relationships. Individuals infected by CIDC usually have limited candidates to infect (low β_i^E and β_i^I), a relatively low self-healing ability (low δ_i^E and δ_i^I), and a low self-immune ability (low θ_i); we also assume that individuals recovered from CIDC have a relatively high ability to retain their immunity (low γ_i). Second, conventional infectious diseases with medium-based infection (CIDM), such as influenza, are spread to others by coughing or sneezing. CIDM usually has more potential candidates to infect than CIDC in the real world (high β_i^E and β_i^I), a high self-healing ability (high δ_i^E and δ_i^I), a higher self-immune ability (high θ_i), and a low ability to retain immunity (high γ_i). Third, emerging infectious diseases (EIDs), such as severe acute respiratory syndrome (SARS), are characterized by high infection rates, low self-recovery rates, low self-healing ability, and a high ability to retain immunity; therefore, we set their parameters as a compromise between CIDC and CIDM. The network and problem configurations are shown in Table S-I in the supplementary material, where we take EID as an example to illustrate the environment of the problem. The problem configurations of CIDC
and CIDM can be deduced in the following steps. At time t, we use PE = (Σ_{i=1}^{N} p_i^E(t))/N and PI = (Σ_{i=1}^{N} p_i^I(t))/N to represent the average probabilities of exposed and infectious nodes, respectively. A baseline is set to help determine the time point at which resources are allocated: y = 0.1 for CIDC, y = 0.3 for CIDM, and y = 0.2 for EID. The crossing point of the baseline with the curve y = PE + PI gives the time point for starting resource allocation optimization. For example, in CIDC, the baseline y = 0.1 and y = PE + PI have a crossing point at (14, 0.3046), so we optimize the resource allocation at time t = 14. To facilitate understanding, we plot the curve y = PE + PI of CIDC, CIDM, and EID in Fig. S1(a)-(c) in the supplementary material, respectively. In Fig. S1(d)-(f) in the supplementary material, we plot the dynamics of the average prevalence rate AvgUi = (Σ_{i=1}^{N} u_i(t))/N. By comparing the former three panels with the latter three panels of Fig. S1 in the supplementary material, it can be observed that y = AvgUi and y = PE + PI are positively correlated, and the degree of correlation depends on the network topology. Note that the results in Fig. S1 in the supplementary material are obtained by averaging over 20 independent runs, with the SEIV model evolving for 200 time steps in each run.
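The baseline-crossing rule for locating the intervention time point can be sketched as follows (the trajectory values are hypothetical; in the experiments the curve y = PE + PI comes from the SEIV simulation):

```python
def intervention_time(prevalence, baseline):
    # Return the first time step t at which the averaged prevalence
    # y = PE + PI reaches the baseline; resource allocation starts there.
    for t, y in enumerate(prevalence):
        if y >= baseline:
            return t
    return None  # the epidemic never crosses the baseline

# Hypothetical CIDC-like trajectory; baseline y = 0.1 as in the text.
traj = [0.01, 0.02, 0.04, 0.07, 0.12, 0.20]
print(intervention_time(traj, 0.1))  # -> 4
```

Because the model evolves in discrete time steps, the prevalence at the crossing step can already be well above the baseline, as with the (14, 0.3046) point cited above.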
Additionally, all the experiments are conducted on a PC cluster, with 4 Intel Core i5-3470 3.20-GHz CPUs, 8-GB memory, and the Ubuntu 12.04 LTS 64-bit system for each PC. The programming language is Python.

B. Comparisons of PHSO and Other Optimizers
In this section, the performance of the proposed PHSO in suppressing EID is tested against the existing swarm optimizers mentioned in Section V-A. In our preliminary experiments conducted on the seven networks, the same conclusions are drawn about the ranking of the comparison algorithms. Since it has been proven that the WS network better characterizes the small-world phenomena found in real-world biological, technological, and social networks [55], we choose the WS network as an exemplar to display the conclusion. The convergence curves of the comparison algorithms are plotted in Fig. 2. Results are obtained by averaging over 50 independent runs for each algorithm. We use the same color for the same family of algorithms, with solid and bold lines representing the best family members. The discretization method used for each continuous algorithm was tested to be better than the alternatives. In the figure, most binary swarm optimizers [Fig. 2(a)] show better performance than the continuous swarm optimizers with different discretization methods [Fig. 2(b)]. Noteworthily, among the discrete swarm optimizers, BPSO-S4, MV-BPSO, and PHSO significantly outperform the other algorithms. Among the continuous swarm optimizers, MV-LLSO and PHSO show relatively better performance than the others. More detailed solutions are shown in Table S-II in the supplementary material, where the Wilcoxon rank-sum test [56] is performed at a significance level of α = 0.05. From the table, we observe that BPSO-S4 and PHSO provide high-quality solutions.
Then, we compare the performance of BPSO-S4, BPSO-V2, MV-BPSO, MV-LLSO, and PHSO on the seven networks. The optimizers other than PHSO are selected for their relatively good performance within their respective optimizer families. Results are shown in Table S-III in the supplementary material, in which the instances conducted on WS(500, 6, 0.1) are obtained by averaging over 30 independent runs, and the other instances over 50 independent runs. We find that PHSO and BPSO-S4 are significantly better than the other algorithms on solution quality, and that PHSO outperforms BPSO-S4 on six of the seven networks.
In short, PHSO preserves a good convergence speed and achieves better solution quality than most existing swarm optimizers in solving RCAP. It also adapts well to different network topologies.

C. Parameter Settings of PHSO
Generally, there are three commonly used parameters in swarm optimizers: 1) the number of particles in the swarm NP; 2) the inertia coefficient w; and 3) the acceleration coefficient c. We discuss the influence of these three parameters, with results shown in Fig. 3(a)-(c). The results are obtained by averaging over 50 independent runs on a fixed WS network. The fixed network is an exemplar chosen from 500 network realizations, whose average clustering coefficient and average shortest path length are closest to the mean values of the 500 networks. We use Evaluations to denote the total number of fitness calculations and Iterations to denote the total number of particle-updating generations. In Fig. 3(a), the value of Evaluations is fixed, with Evaluations = NP × Iterations. The algorithm with NP ∈ {10, 20, 40} obtains better solution quality than with NP ∈ {60, 80, 100}, and when NP = 40, PHSO has the fastest convergence speed. Fig. 3(b) demonstrates that PHSO is not sensitive to the parameter c, so we simply set c = 2 in all experiments. Next, as shown in Fig. 3(c), PHSO is sensitive to the parameter w. As w increases from 0.4 to 1.0, the convergence speed and the solution quality are greatly improved: with a larger w, historically effective resources are more likely to be retained in the next generation, and as the resources with better effects are gathered together, the solution quality improves greatly. But as w increases from 1.0 to 1.2, the solution quality decreases significantly, because the particles easily become trapped in local optima under the exaggerated votes. Therefore, w = 1 is the best setting.
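The fairness of the NP comparison in Fig. 3(a) rests on the fixed-budget relation Evaluations = NP × Iterations, which can be sketched as follows (the budget value is illustrative, not taken from the experiments):

```python
def iterations_for(evaluations, NP):
    # Under a fixed fitness-evaluation budget, larger swarms are granted
    # fewer updating generations: Evaluations = NP x Iterations.
    return evaluations // NP

budget = 40_000  # hypothetical total number of fitness evaluations
for NP in (10, 20, 40, 60, 80, 100):
    print(NP, iterations_for(budget, NP))  # e.g., NP = 40 -> 1000 iterations
```

This is why a large NP can hurt under a fixed budget: each extra particle trades away generations of refinement.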
Specially, two extra parameters are highlighted in PHSO: 1) the number of groups NG and 2) the parameter threshold, which decides the percentage of resources in Cut_V being filtered out. Fig. 3(d) demonstrates that PHSO is not sensitive to NG when NG > 2. As shown in Fig. 3(e), a large threshold (threshold = 0.5) helps accelerate the convergence and generates better solutions than small values. A fixed value of threshold proves better than a random value in solving RCAP. Finally, a fixed threshold within [0.5, 0.9] and NG > 2 are recommended.
Moreover, we develop several variants of PHSO and compare them with the original PHSO in Fig. 3(f). The PHSO variants are formulated as follows: 1) PHSO-ex-updater, which exchanges the roles of (X_{g1,rg1}, X_{g2,rg2}) and (Pbest_i, Gbest), using the former to determine Cut_X_i for the second-priority resources and the latter to determine Cut_V_i for the first-priority resources; 2) PHSO-del-cutV, in which New_X_i directly learns from sigmoid(V_i) in the first hierarchical learning process, without Cut_V_i; and 3) PHSO-del-cutX, in which New_X_i does not learn from Cut_X_i again.
The comparison results are as follows. First, PHSO-ex-updater has the slowest convergence speed in the first 650 iterations. Then, if Cut_V_i is not considered, the algorithm easily becomes trapped in local optima, leading to worse solution quality. PHSO-del-cutX is promising in accelerating the convergence, especially at the early stage of the iterations, but its final solution quality is still slightly worse than that of PHSO. To summarize, only the original PHSO shows a competitive convergence speed in the early stage and obtains the highest-quality solutions in the end. The mechanisms of priority planning and hierarchical learning in PHSO thus promote both the exploration and the exploitation of the algorithm.

D. Resource Distribution and Intervention Results
The detailed resource allocation results are investigated in this section, including the resource distribution proportions under increasing resource budgets (shown in Fig. S2 in the supplementary material) and under different epidemic types (shown in Fig. S3 in the supplementary material). Experiments are conducted on six different networks, with 20 independent runs for each instance.
In Fig. S2 in the supplementary material, since the artificial networks have the same average degree k = 4, the proportion of R_1 increases significantly as the constraint cost grows from 0.1 × C_max to 0.9 × C_max, while the proportion of R_5 decreases significantly. This illustrates that in a network with a relatively low degree and a severely limited budget, the curing resources for the exposed individuals should receive most of the allocation, followed by the detective resource R_3 and the vaccinating resource R_1. If these low-degree networks are provided with larger budgets, the vaccinating resource deserves the most investment, followed by the protective resource R_2. Besides, a larger budget yields a better epidemic control effect. In short, prevention is better than cure, but it costs more.
Different from the above conclusions, in the Ps-contact and Ex-contact networks, R_1 is always the most popular resource regardless of the cost. The reasons lie in two aspects. The Ps-contact network has a very high average degree k = 34.37 and a very low average shortest path length of 1.73, which leads to fast epidemic spreading: in a short time, nearly all nodes in the network are at high infection risk. In this situation, allocating vaccinating resources directly lowers the infection probability of individuals, and R_2 has a similar effect. As for Ex-contact, the infectious individuals make up only a small proportion of the population, but the upper bound of the cost C_max is very high. For example, in the WS network with 100 nodes, C_max = 13.350957, whereas in the Ex-contact network with 410 nodes, C_max = 78.459651. As the maximum cost increases, the percentage of curing resources decreases.
In Fig. S3 in the supplementary material, given C = 0.3 × C_max, we investigate the resource distribution proportions of CIDC, CIDM, and EID spreading on the six networks. Several phenomena are noticeable. The vaccinating resource R_1 is without doubt the most popular resource, and the protective resource R_2 is the second most popular. For CIDC on RG(100, 4), ER(100, 0.04), WS(100, 4, 0.1), and Ex-contact, the two curing resources R_4 and R_5 share a similar number of resources; but because the unit price of R_5 is slightly higher than that of R_4, the total cost of R_5 appears larger. For CIDM on the BA(100, 4) network, investment in R_3, R_4, and R_5 seems unnecessary. For EID on RG(100, 4), ER(100, 0.04), BA(100, 4), and WS(100, 4, 0.1), the costs invested in the detective resource R_3 are much higher than for CIDC and CIDM.
Resource intervention results on the BA and WS networks are shown in Fig. 4. Significantly, PHSO shows clear advantages over BPSO, the random resource allocation strategy (termed "Random"), and no resource intervention (termed "None"). With the solutions provided by PHSO, the system finally converges to the disease-free equilibrium. The reason can be traced back to Fig. 2(a) and Theorem 2. Theorem 2 provides a sufficient condition for the disease-free equilibrium: R = λ*(L̃) + 1 ≤ 1, namely, λ*(L̃) ≤ 0. From Fig. 2(a), we find that only PHSO and BPSO-S4 can produce solutions satisfying this condition. Specifically, at the final equilibrium point, the number of infectious individuals under BPSO-S4 is around 1.99E-5 for BA and 1.156E-4 for WS, while both values for PHSO are zero. This is strong evidence that the theoretical optimization results directly influence the final system equilibrium. We can therefore summarize an important conclusion: if an algorithm wins the fitness competition, it also wins in the final system equilibrium state.
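The stability check behind Theorem 2 can be sketched numerically as follows (the 2×2 matrix is a hypothetical stand-in for the network-sized matrix whose largest eigenvalue λ* appears in the condition; a real check would use the full matrix):

```python
import math

def largest_eigenvalue_2x2(a, b, c, d):
    # Closed-form eigenvalues of [[a, b], [c, d]], assuming they are real;
    # returns the larger of the two.
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0

def reaches_disease_free(lmax):
    # Sufficient condition of Theorem 2: R = lambda* + 1 <= 1, i.e.,
    # lambda* <= 0, guarantees convergence to the disease-free equilibrium.
    return lmax <= 0.0

lmax = largest_eigenvalue_2x2(-0.5, 0.2, 0.2, -0.5)  # eigenvalues -0.3, -0.7
print(reaches_disease_free(lmax))  # True: the epidemic dies out
```

A solution whose induced matrix passes this check, as the PHSO and BPSO-S4 solutions do in Fig. 2(a), drives the infection level to zero.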

E. Application Case
The ideal scenarios for applying such simulation-based epidemic control are communities where human contacts are regular and periodic, such as schools, villages, or small cities. Simulations of these communities can generate highly realistic decisions to persuade policy makers. A simple application case is built in the following.
We use a high-resolution contact dataset [44] collected from a primary school in Lyon, France, in 2009. Face-to-face contacts among children were recorded by wearable proximity sensors every 20 s. In total, the activities of 232 students within two school days were collected. Gemmetto et al. [57] built a simple periodic repetition model to extend the working life of the two-day data: since children in France at that time went to school only on Monday, Tuesday, Thursday, and Friday, they used a two-round repetition to imitate the weekly contacts on school days. Children at home may be infected by others with a small probability. Our experiments are conducted in this context.
First, let us introduce how the epidemic model is used. In the real world, the individual state at each time is definite. Therefore, the state probabilities are transformed into binary states corresponding to the health conditions of the children: for each element p in Prob_i(t), Binarization(p) = 0 if Rand(0, 1) < p, else Binarization(p) = 1. When the proportion of infectious individuals is larger than 10%, resource intervention starts. Noteworthily, the input states of the children are sourced from real-time observations, while the future infection process is simulated through the epidemic model. Next, let us customize specific resource types and epidemic parameters by drawing on domain knowledge or specialist experience, for example, mosquito nets against malaria in Africa, face masks to prevent influenza, and medicines to cure patients. As an exemplar, we set the epidemic as "Influenza," with SEIV model parameters θ_i = 1/4, β_i^E = 0.007, β_i^I = 0.007, δ_i^E = 1/4, δ_i^I = 1/4, ξ_i = 1/2, and γ_i ∼ N(1/4, 1/6). The cost and utility functions are formulated such that the detective resource R_3 is assumed to be negligibly cheap, and the curative resource R_4 for state I is more expensive than the resource R_5 for state E. These settings come from our empirical observations on influenza. We set three levels of resource budget, 0.3C_max, 0.5C_max, and 0.7C_max, to observe whether the algorithm performance is influenced by the cost constraints. As a more intuitive evaluation criterion, we use the number of children in state I [termed N(I)] to evaluate the epidemic control results.
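The binarization step, implementing the rule exactly as quoted above, can be sketched as follows (the probability vector is hypothetical, and the seeded generator only makes the sketch reproducible):

```python
import random

def binarize_states(prob, rng):
    # Transform the state-probability vector Prob_i(t) into definite binary
    # states using the rule stated in the text:
    # Binarization(p) = 0 if Rand(0, 1) < p, else Binarization(p) = 1.
    return [0 if rng.random() < p else 1 for p in prob]

rng = random.Random(42)
# Hypothetical probability vector for one child.
print(binarize_states([0.9, 0.1, 0.0, 1.0], rng))
```

Note the boundary behavior under this rule: an entry with p = 1.0 always binarizes to 0, and an entry with p = 0.0 always binarizes to 1.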
Finally, based on the real-time epidemic model and the customized resource model, PHSO can be used to solve the resource allocation problem, providing an optimized solution that allocates the resources effectively.
Resource intervention results are shown in Fig. 5. The curve of PHSO is distinctly lower than those of BPSO, the random resource allocation strategy (termed Random), and no resource intervention (termed None). As the cost increases from 0.3C_max to 0.7C_max, the advantage of BPSO is gradually lost, with its curve approaching that of the random strategy. This indicates that BPSO has become trapped in local optima and thus loses the chance to find better solutions. Besides, the larger the resource budget, the smaller N(I) is each day. But when the available cost exceeds 0.5C_max, the effect of additional investment depends heavily on a reasonable resource allocation scheme; otherwise the investment is wasted. The competitive performance of PHSO is thus verified.

VI. CONCLUSION
In this article, we use a modified SEIV model to simulate epidemic spreading dynamics and analyze the global stability.Then a concrete resource description model is designed, which helps identify a subset selection problem in epidemic control.A new swarm optimizer with priority planning and hierarchical learning (PHSO) is proposed to solve the problem, and necessary theoretical analysis is provided.Extensive numerical experiments performed on different network topologies show that PHSO performs better than other swarm optimizers in both solution quality and convergence speed.A practical application case illustrates how to use the epidemic control system in real-world epidemic control, whose results also verify the effectiveness of PHSO.
In the future, some important and unsolved problems deserve further study, such as dynamic epidemic control and large-scale resource intervention with cooperatively coevolutionary approaches [58], [59]. Besides, epidemic control in large-scale networks is still a challenge; investigating the influence of community structure [60], [61] on the propagation dynamics may help address this challenge, which is another promising direction.
c_1(·) and c_2(·), respectively, represent the cost functions of R_1 and R_2, which affect susceptible individuals; c_3(·) and c_5(·), respectively, represent the cost functions of R_3 and R_5, which take effect on exposed individuals; and c_4(·) represents the cost function of R_4 with respect to infected individuals. The resource quantity is replaced by the sum of state probabilities under the guidance of mean-field theory. Finally, the inclusive cost of all resources is the sum of the five cost terms.

Fig. 4. Epidemic spreading dynamics after resource intervention. Results are obtained by averaging over 100 algorithm realizations; BPSO and PHSO evolve for 1000 iterations in each realization. (a) BA. (b) WS.

Fig. 5. Number of infected children in one week after resource intervention. Results are obtained by averaging over 100 independent algorithm realizations; in each realization, BPSO and PHSO hold 10 particles and evolve for 100 generations. (a) C = 0.3C_max. (b) C = 0.5C_max. (c) C = 0.7C_max.