A Novel Heterogeneous Sensor-Weapon-Target Cooperative Assignment for Ground-to-Air Defense by Efficient Evolutionary Approaches

Based on the tactics of “intercepting at the furthest and resisting in steps” in ground-to-air defense, an integrated sensor/heterogeneous weapon allocation problem is proposed for the first time. It is formulated as a dynamic sensor/heterogeneous weapon target assignment (DS/HWTA) model in the continuous time domain, including models of threat assessment, sensor detection probability, weapon damage probability, and decision timing, together with a unified optimization objective and constraints. An evolutionary algorithm for dynamic sensor/heterogeneous weapon target assignment (EA-DS/HWTA) is proposed, in which three coding methods are designed to build the solution individuals. The de-constrained initial population is generated based on type I coding. A modified position based crossover (MPBX) based on type II coding is presented to maintain the sensor-weapon synergy of the solutions. Based on a greedy fitness strategy, each type II individual is complemented and transformed to type III coding to calculate the solution fitness. Extensive experiments show that the DS/HWTA model reflects the sensor-weapon cooperation requirements of anti-penetration operations, and that the proposed algorithm has good convergence and real-time performance.


I. INTRODUCTION
Sensors and weapons are two significant operational resources that complement each other in modern warfare, and their cooperative engagement capability (CEC) has a crucial impact on the completion of operational missions. The Missile Defense Agency states that ''sensor resource management has been exhaustively studied when weapons are selected and engagement timeline is known'' and ''individual study has lead to a performance gap with independently optimized weapons and sensors'' [1]. The joint optimization approach has not been investigated for the missile defense arena [2]. Future airstrikes are characterized by multi-level, multi-batch, and multi-directional threats, which not only present new challenges to traditional decision-making issues such as sensor resource management (SRM) and weapon-target assignment (WTA), but also introduce new requirements for integrated sensor/weapon operations.
The associate editor coordinating the review of this manuscript and approving it for publication was Huaqing Li.
VOLUME 8, 2020
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
SRM [3]-[5] and WTA [6]-[9] have developed independently, yet the organic integration of the two problems is crucial for defense operations. However, integrated SRM and WTA has no uniform formulation. Because SRM aims to track targets and guide weapons to intercept them, the joint optimization of sensors and weapons is studied as a variant of the WTA problem, namely Sensor/Weapon-Target Assignment (S/WTA). S/WTA models fall into two categories: independent S/WTA and integrated S/WTA. The two differ in whether the interdependencies of sensors and weapons with respect to targets are considered. The optimization objective of the independent S/WTA model is the sum of the benefits of assigning each sensor to each target and each weapon to each target, and the sensor-target benefit is independent of the weapon-target benefit. For instance, Bogdanowicz et al. [10] established an independent sensor/weapon-target pairing model and proposed the auction algorithm to solve it. Zi-fen et al. [11] presented an improved Swt-opt algorithm to solve the independent S/WTA model. The integrated S/WTA model, in which the damage efficiency of weapons depends on the accuracy and reliability of sensors, is regarded as the future direction of intelligent Command & Control (C2) systems. For instance, Bogdanowicz and Coleman [12] introduced the S/WTA model of sensor-weapon pairs with targets in the time dimension, and designed an exact optimization based on the Swt_opt component of the auction algorithm. Chen et al. [13] considered the weapon-target efficiency depending on the pre-assigned sensor, and proposed a hybrid algorithm incorporating particle swarm optimization (PSO) and a genetic operator.
Jian and Chen [14] presented a modified genetic algorithm to solve the integrated S/WTA model, which can dynamically adjust in three components of the search region.
Xin et al. [15] built an S/WTA model for Network-Centric Warfare (NCW) by modeling the probability of successful interception as the product of the damage probability and the detection probability, and presented a marginal-return-based constructive heuristic (MRBCH) algorithm. Li et al. [16] presented a modified genetic algorithm, which incorporates population initialization and repair operators based on prior knowledge, to solve the S/WTA model. Mu et al. [17] established the S/WTA model for an intelligent minefield and applied the multi-scale quantum harmonic oscillator algorithm to it. The integrated S/WTA model is therefore more in line with the task requirements of future decision support systems. The WTA problem can also be divided into two categories, the static WTA (SWTA) and the dynamic WTA (DWTA); a review of the WTA problem is given in [18]. The difference between SWTA and DWTA is whether time is considered as a dimension [19]. The S/WTA studies mentioned above belong to the SWTA problem. However, it is difficult for a single type of weapon to complete the defense task, and multiple types of weapons are usually engaged in several decision-salvo stages.
In the ground-to-air defense scenario, the WTA problem provides crucial decision support in the Command & Control (C2) system. Li et al. [20] considered the layered Ballistic Missile Defense System (BMDS) of assigning interceptors to multiple waves of incoming ballistic missiles, and proposed a modified particle swarm optimization to solve the resource planning model. Summers et al. [21] formulated a dynamic WTA model of the theater ballistic missile defense (TBMD) problem, and utilized an approximate dynamic programming (ADP) approach to solve it. Kim et al. [22] established a time-dependent SWTA model for high-speed enemy missiles and presented a decentralized decision-making algorithm. Hocaoğlu [23] formulated the allocation of air defense missiles to incoming air targets for land-based air defense systems. Nevertheless, most WTA studies for air defense do not consider the integrated sensor-weapon decision system, even though the CEC of sensors and heterogeneous weapons is critical for defense efficiency.
At present, few studies have focused on the S/WTA problem in ground-to-air defense. Actual ground-to-air anti-penetration operations follow the dynamic tactics of ''interception as far as possible, stepped resistance, and layered defense'', and adopt the configuration of ''high outside and low inside, more outside and less inside.'' To meet the dynamic and heterogeneous requirements of the anti-penetration scenario, this paper establishes a dynamic sensor/heterogeneous weapon-target assignment (DS/HWTA) problem that considers the interdependencies of sensors and heterogeneous weapons, and proposes an evolutionary algorithm to obtain interception schemes for the decision-maker. The contributions of this paper are as follows:
1) For the operational requirements of ground-to-air anti-penetration, the heterogeneous sensor-weapon-target cooperative assignment problem is presented and formulated as a dynamic sensor/heterogeneous weapon target assignment model in the continuous time domain;
2) A novel evolutionary algorithm for DS/HWTA (EA-DS/HWTA) is proposed to obtain the sequential interception schemes. Modified approaches to population initialization, the crossover operator, and environment selection are designed to balance optimality and real-time performance;
3) A simulation framework is established by incorporating the models of two types of missiles, the flight target, and the state transition. Extensive experiments demonstrate that the proposed approaches are effective and promising.
The rest of this paper is organized as follows. Section II gives our motivation and formulates the DS/HWTA problem. Section III establishes the optimization objective and constraints of the DS/HWTA model. Section IV presents the EA-DS/HWTA solver in detail. Section V verifies the proposed algorithm on the DS/HWTA model through experimental studies. The conclusion and future work are summarized in Section VI.
Missile Defense (NMD) as an example, it conducts early warning and classification of incoming missiles based on the space-based infrared system (SBIRS) and upgraded early warning radar (UEWR). The X-band radar (XBR) performs high-precision irradiation and tracking of the targets, and the ground-based interceptors (GBI) are launched for the interception. XBR guides the medium- and long-range interceptors to the upper-level targets with the greatest threat. If the number and coverage of interceptors are limited, the surviving targets fall to the lower layer. The defense has less time to intercept these surviving targets, which have higher speed and threat, and requires close-range interceptors with stronger maneuverability. The battle management, command, control, and communications system (BM/C3) integrates data analysis, scheme evaluation, command control, and coordination by data link throughout the interception process.
In air defense combat, the main ground-based anti-air firepower is composed of anti-air missiles, archibalds, and sensors. The missile-gun integrated weapon system is a complex air defense system composed of short-range air defense missiles, anti-aircraft guns, a missile-gun integrated fire control system, and so on. The tactics of the missile-gun integrated weapon system are as follows. The air defense missile has the advantages of long range and high precision, but it also has the disadvantages of a large lower bound of the attack zone, poor anti-saturation attack capability, and high cost. The archibald has the advantages of rapid response, strong anti-interference ability, intensive firepower, and low cost, but its interception airspace is small. Therefore, joint fire control of the integrated missile-gun system, through unified command and control of anti-air missiles, archibalds, and sensors, can effectively handle the low/mid-altitude defense task and improve the overall operational effectiveness of the defense system.
By refining the characteristics of the above multi-layered defense system and missile-gun integrated system, a heterogeneous sensor-weapon-target cooperative assignment problem is presented: while the targets penetrate from far to near, the defender has multiple weapon launch platforms, such as missile launch vehicles, at the same level of the decision network. Each platform has its own sensor (mainly radar) and several interceptors, which are classified as close-range weapons and medium-range weapons. The two weapon types differ in whether the weapon requires target information from sensors. The close-range weapon may be an uncontrolled interceptor, such as a gun or archibald, or a close-range dogfight missile whose missile-borne seeker can directly lock onto the target without an external radar providing target information, such as Patriot Advanced Capability-2 (PAC-2). The medium-range weapon requires target information from the ground-based or air-based radar during midcourse guidance, and the missile-borne radar turns on to lock onto the target during terminal guidance. Hence the medium-range weapon, such as Patriot Advanced Capability-3 (PAC-3), can only shoot at targets tracked by sensors. In summary, the proposed DS/HWTA problem involves the following two forms of collaboration. The architecture of the proposed decision support system is shown in Figure 1.
1) Coordination of close-range weapons and medium-range weapons. The close-range weapon cannot attack targets at long distance because it is not designed to receive midcourse guidance information. The medium-range weapon has a large interception distance and can maintain a considerable kill probability over a wide range. However, the available overload and maneuverability of the medium-range weapon are weaker than those of the close-range weapon, and the lower boundary of its no-escape area is large. Hence the medium-range weapon has a much lower damage probability than the close-range weapon against close-space targets.
2) Cooperation of sensors and weapons. The medium-range weapon is responsible for long-distance interception. The sensors are required to capture and track targets, and to provide midcourse guidance information for the medium-range weapon. The close-range weapon is either uncontrolled or has a missile-borne seeker that locks onto the target directly, without tracking information from sensors.

B. THREAT ASSESSMENT
In the anti-penetration scenario, the target threat can be assessed by the relative situation between the penetration target and the defender. The threat assessment model can be established by the following track attributes.
(1) Altitude. Due to terrain factors, the lower the target's flight altitude, the lower the probability of detection by the sensor, and the greater the threat to the defender. When the flight altitude is lower than a certain value, the threat is at its maximum; above that value, the threat can be represented by a descending normal distribution function, where $h$ is the current flight altitude; $h_l$ is the highest altitude at which the threat is maximal; $k_h$ is the attenuation parameter.
(2) Velocity. The faster the target flies, the stronger its penetration capability and the higher the threat to the defender. In particular, a hypersonic/supersonic penetration target degrades the accuracy of sensor tracking and weapon damage, thus reducing the interception probability. The velocity threat degree is expressed in terms of the target velocity $v$ and the gain coefficient $k_v$.
(3) Short course. The short course is also an important indicator that reflects the intent and threat of the target. The smaller the short course of the target to the asset, the more significant the attack intention and threat. An intermediate normal distribution can be used to evaluate the course threat, where $c$ is the distance between the asset and the short course; $k_c$ is the attenuation parameter of the target attack range.
(4) Remaining intercept time. Since ground-to-air weapons are limited by a minimum/maximum attack boundary, the remaining intercept time refers to the time the penetration target needs to reach the intercept boundary at its current velocity. Generally, the shorter the remaining intercept time, the lower the interception success probability, that is, the greater the target threat degree. The reduced semi-normal distribution function can be used to express the threat of the target's remaining intercept time:
$$v_t(t) = e^{-k_t t^2} \tag{4}$$
where $t$ is the remaining intercept time; $k_t$ is the attenuation parameter.
In summary, the threat evaluation of penetration targets, namely the interception value model, combines the four threat factors, where $w_h$, $w_v$, $w_c$, and $w_t$ are the weight parameters of the threat factors.
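The four threat factors can be combined as in the following minimal sketch. Only the remaining-time term follows Equation (4) as printed; the altitude, velocity, and course terms and the weighted-sum aggregation are assumed functional forms chosen to match the verbal descriptions (descending normal, saturating gain, intermediate normal, weighted factors), and the function names and parameter dictionary are hypothetical.

```python
import math

def altitude_threat(h, h_l, k_h):
    """ASSUMED form: maximal threat at or below h_l, Gaussian decay above it."""
    if h <= h_l:
        return 1.0
    return math.exp(-k_h * (h - h_l) ** 2)

def velocity_threat(v, k_v):
    """ASSUMED form: threat grows with speed and saturates at 1."""
    return 1.0 - math.exp(-k_v * v)

def course_threat(c, k_c):
    """ASSUMED form: Gaussian decay with the short-course distance c."""
    return math.exp(-k_c * c ** 2)

def time_threat(t, k_t):
    """Equation (4): v_t(t) = exp(-k_t * t^2)."""
    return math.exp(-k_t * t ** 2)

def target_threat(h, v, c, t, params, weights):
    """ASSUMED aggregation: weighted sum of the four factors (w_h, w_v, w_c, w_t)."""
    w_h, w_v, w_c, w_t = weights
    return (w_h * altitude_threat(h, params["h_l"], params["k_h"])
            + w_v * velocity_threat(v, params["k_v"])
            + w_c * course_threat(c, params["k_c"])
            + w_t * time_threat(t, params["k_t"]))

# Illustrative parameter values only; they are not taken from the paper.
params = {"h_l": 100.0, "k_h": 1e-6, "k_v": 0.005, "k_c": 1e-8, "k_t": 1e-3}
score = target_threat(h=1500.0, v=400.0, c=3000.0, t=20.0,
                      params=params, weights=(0.25, 0.25, 0.25, 0.25))
```

With weights that sum to one, every factor lies in [0, 1], so the aggregate threat does as well, which keeps interception values comparable across targets.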

C. DETECTION PROBABILITY OF SENSOR
For a radar sensor, the detection probability is a function of the signal-to-noise ratio (SNR) and the false alarm probability, and it is difficult to evaluate accurately because of the complicated theoretical calculation [24], [25]. When the requirement on detection accuracy is low, an empirical formula can be used to fit a simplified detection probability as a function of distance, where $D$ is the distance between the radar and the target; $k_d$ is the radar descent coefficient; $D_{max}$ is the maximum effective range of radar detection.

D. INTERCEPTION PROBABILITY OF WEAPON
Different types of weapons have different probability distributions for target interception, such as uncontrolled interceptors (anti-aircraft guns), simply controlled interceptors (rocket projectiles), and guided interceptors (missiles). This paper takes medium-range and close-range ground-to-air missiles as the two types of weapons. The interception probability of guided missiles is investigated in [26]-[28]. It can be concluded that the interception probability of ground-to-air missiles is related to the guidance accuracy and the energy consumed during flight. The major evaluation indicators are:
(1) Miss-distance. A smaller miss-distance indicates a higher interception probability, which makes it an essential indicator for evaluating interception efficiency.
(2) Remaining flight time. The remaining flight time is the time required for the weapon to intercept the target. Less remaining flight time means less flight energy consumption, which benefits real-time performance and accuracy.
(3) Line-of-sight (LOS) rate. The guidance laws of most homing weapons achieve interception by driving the LOS rate toward zero, and a smaller LOS rate means less energy lost to normal acceleration.
Therefore, the interception probability of a weapon can be constructed from the miss-distance, remaining flight time, and LOS rate. These three quantities are derived from the weapon guidance geometry of Figure 2 and the guidance system model, where $N$ is the navigation constant. Among the three indicators, the miss-distance evaluates the interception efficiency, while the remaining flight time and LOS rate evaluate the energy consumed during flight. A negative exponential function can be used to construct the interception probability of weapon $i$ against target $j$, where $\delta_S$, $\delta_{T_{go}}$, and $\delta_q$ are obtained from the mean indicators of the operational scenario.
Taking $m$ weapons intercepting $n$ targets as an example, the interception probability of weapon $i$ against target $j$ is obtained by weighting the three indicators, where $\beta_S$, $\beta_{T_{go}}$, and $\beta_q$ are the weight parameters of the three indicators.

III. DYNAMIC SENSOR/HETEROGENEOUS WEAPON-TARGET OPTIMIZATION MODEL
The operational scenario of the proposed DS/HWTA is as follows. In the air-defense environment, the early warning radar captures $n$ incoming enemy targets and detects track information such as position and speed. The defender has $s$ weapon launch platforms, so the number of platform sensors is also $s$, and the number of available weapons of platform $h$ is $m_h$, $h = 1, 2, \ldots, s$. By the type of interception weapon, weapon resources are divided into medium-range and close-range, and the interception process of a medium-range weapon needs the attack guidance of sensors. Based on the current relative situation, the defender should solve the sensor/weapon attack decision to maximize the interception efficiency. The notation employed in this paper is listed in Table 1.

A. DECISION TIMING MODEL
First, an essential variable, the weapon launch state $a^j_{hi}$, is introduced, where $l^j_{hi}$ and $u^j_{hi}$ respectively represent the lower and upper boundary of the attack zone of weapon $i$ of platform $h$ with respect to target $j$; $a^j_{hi} = 1$ indicates that target $j$ is in the attack zone of medium-range weapon $w_{hi}$, which requires sensor cooperation for midcourse guidance; $a^j_{hi} = -1$ indicates that target $j$ is in the attack zone of close-range weapon $w_{hi}$; $a^j_{hi} = 0$ indicates that weapon $w_{hi}$ does not satisfy the attack condition for target $j$. In actual combat, the weapon attack zone is obtained by look-up table interpolation or a fitting method.
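The three-valued launch state can be illustrated with a short sketch. It assumes that the attack zone reduces to a closed distance interval between the lower and upper boundary, which is a simplification of the look-up table mentioned above; the function name and the boolean flag for the weapon type are hypothetical.

```python
def launch_state(distance, lower, upper, needs_sensor_guidance):
    """Return the launch state a of one weapon against one target.

    ASSUMPTION: the attack zone is the closed interval [lower, upper] of
    weapon-target distance. needs_sensor_guidance is True for a medium-range
    weapon and False for a close-range weapon.
    """
    if not (lower <= distance <= upper):
        return 0            # target outside the attack zone
    return 1 if needs_sensor_guidance else -1

# Illustrative values: a medium-range weapon covering 20-80 km and a
# close-range weapon covering 1-10 km.
a_medium = launch_state(30.0, 20.0, 80.0, needs_sensor_guidance=True)
a_close = launch_state(5.0, 1.0, 10.0, needs_sensor_guidance=False)
a_none = launch_state(15.0, 20.0, 80.0, needs_sensor_guidance=True)
```

Encoding the weapon type in the sign of the state lets later expressions distinguish guided from unguided engagements with a single comparison against zero.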
When the weapon resources are not sufficient for a saturation attack, a target may successfully avoid interception by medium-range weapons without yet having entered the attack zone of close-range weapons; such a target is said to be in the firepower vacuum zone. To judge whether any target is in the firepower vacuum zone, the firepower handover judgment variable $F$ is introduced, where $\mathrm{sign}(\cdot)$ is the sign function, whose value is 0 when its argument is negative and 1 otherwise. If $F(t) = n$, every target can be assigned at least one weapon meeting the attack condition at the same time, that is, no target is in the firepower vacuum zone, and a decision request can be initiated immediately. Let the $l$th round of attack decision timing be $t(l)$; the timing model of round $l + 1$ involves $T^g_h$, the remaining flight time of the weapon of platform $h$ at the launch moment; $T_o$, the situation assessment time; and $T_s$, the system response time (allocation decision, command transmission, weapon launch preparation, etc.).
If $F(t) < n$, at least one target is in the firepower vacuum zone. According to the OODA loop theory, when the ''Observe-Orient'' phases find a target in the firepower vacuum zone, the Command & Control system faces two options: (1) directly execute the ''Decide-Act'' phases to intercept as quickly as possible, which is not conducive to obtaining the global optimal solution; (2) continue the Orient phase and wait for the target in the firepower vacuum zone to enter an attack zone, which may delay the optimal launch time of weapons. To balance this ''optimality-rapidity'' dilemma, a delayed decision strategy is introduced, where $t_m$ is the delay time parameter. Equation (15) indicates that if any target is in the vacuum zone after the $l$th round of interception, the $(l + 1)$th decision is delayed by $t_m$. If the target group enters the firepower coverage area within the delay time, the $(l + 1)$th decision is initiated immediately.
In summary, the decision timing model in DS/HWTA is established as follows,
where $t_m$ is the natural time of operation initialization. Decisions are executed according to the above model until one of the termination situations occurs: (1) all targets are successfully intercepted; (2) no weapon is available; (3) any target has penetrated the defense system.
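The firepower vacuum check and the delayed decision rule described in this subsection can be sketched as follows. The sketch does not reproduce the timing equations; it assumes that a target counts as covered when at least one weapon has a non-zero launch state against it, and the function names and the `waited` argument are hypothetical.

```python
def firepower_handover(launch_states):
    """Count the targets that at least one weapon can currently engage.

    launch_states[j] holds the launch states (1, -1 or 0) of all weapons
    against target j. ASSUMPTION: a target is covered if any state is non-zero.
    """
    return sum(1 for states in launch_states if any(a != 0 for a in states))

def should_decide_now(launch_states, waited, t_m):
    """Delayed decision rule: decide at once when no target is in the
    firepower vacuum zone, otherwise only after the delay t_m has elapsed."""
    n = len(launch_states)
    return firepower_handover(launch_states) == n or waited >= t_m

covered = [[1, 0], [0, -1]]     # both targets are inside some attack zone
vacuum = [[1, 0], [0, 0]]       # the second target is in the vacuum zone
```

Here `should_decide_now(vacuum, waited=0.0, t_m=5.0)` holds the decision back, while the same call with `waited=5.0` releases it, mirroring the trade-off between optimality and rapidity.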

B. OPTIMIZATION OBJECTIVE AND CONSTRAINTS
According to the tactics described for the ground-to-air anti-penetration scenario, the objective function of the DS/HWTA model should cover the following situations.
(1) If target $j$ has penetrated into the close space at time $t$ and the defender assigns close-range weapon $i$ of platform $h$, the interception probability of target $j$ is given by Equation (17).
(2) If target $j$ is in the medium space, the close-range weapons cannot directly capture target $j$. The defender needs sensor $k$ to track target $j$ and cooperatively guide medium-range weapon $i$ of platform $h$ to intercept it. In this case, the interception probability of target $j$ is evaluated by Equation (18).
Considering weapon types, launch timing, and sensor cooperation, the optimization objective of damage efficiency is established as Equation (19), where $\sigma$ is a model parameter set to a small positive constant.
In Equation (19), if close-range weapon $w_{hi}$ is assigned to intercept target $j$ and satisfies the launch timing at time $t$, that is, $a^j_{hi}(t) = -1$ and $x^j_{hi}(t) = 1$, the decision return of weapon $w_{hi}$ is equivalent to Equation (17). If medium-range weapon $w_{hi}$ is assigned to target $j$ and meets the launch timing at time $t$, and sensor $s_k$ cooperates to guide it, namely $a^j_{hi}(t) = 1$ and $x^j_{hi}(t) = 1$, the decision return of weapon $w_{hi}$ is equivalent to Equation (18). If weapon $w_{hi}$ does not meet the attack conditions on target $j$, namely $a^j_{hi}(t) = 0$, the decision return of weapon $w_{hi}$ is zero. The objective function (19) therefore reflects the solution fitness of the DS/HWTA model in all of these situations.
Analysis of the DS/HWTA model reveals two model constraints:
(1) Consistency of weapon type and target state. Solutions that violate this consistency assign weapons to targets for which no shoot condition holds, and are therefore infeasible.
(2) Cooperation of sensor and weapon. A solution may assign sensors to a target that is engaged only by close-range weapons. Such solutions waste defense resources and are inefficient.
These constraints make part of the search space redundant for the solving algorithm. The penalty function method is adopted to handle the constraints, which benefits real-time performance and solution diversity.
Given the solution $\{X(t), Y(t)\}$ and the weapon launch state $A(t)$, the penalty term $g_1(t)$ of the weapon-target consistency constraint is designed so that when platform $h$ intercepts target $j$ at time $t$ ($x^j_h = 1$) but has no weapon satisfying the shoot conditions ($a^j_{hi} \le 0\ \forall i = 1, 2, \ldots, m_h$), the value of the penalty term is greater than zero; otherwise $g_1(t) = 0$.
The penalty term $g_2(t)$ of the sensor-weapon cooperation constraint is designed so that when sensor $k$ is employed to capture and track target $j$ at time $t$ ($y^j_k = 1$), and the weapons intercepting target $j$ are close-range and have no guidance requirement ($x^j_{hi} a^j_{hi} \le 0\ \forall h = 1, 2, \ldots, s;\ i = 1, 2, \ldots, m_h$), the value of the cooperation penalty is greater than 0; otherwise $g_2(t) = 0$.
In summary, the proposed DS/HWTA model is given by (22) and (23), where $\alpha_1$ and $\alpha_2$ are the negative weights of weapon-target consistency and sensor-weapon cooperation respectively; $l_x$ and $l_y$ are the upper limits on the numbers of weapons and sensors allocated to a single target in each decision. Since the DS/HWTA model has a larger search space than other WTA models, setting $l_x$ and $l_y$ helps to prune the optimization region, improve real-time performance, and avoid wasting sensor and weapon resources.
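A counting version of the two penalty terms gives a feel for how they behave. This is a sketch under stated assumptions rather than the paper's formulas: each violation contributes one unit, the consistency check follows the verbal description (no weapon of the platform has a non-zero launch state against the target), and the nested-list data layout and function names are hypothetical.

```python
def consistency_penalty(x_platform, a):
    """g1 sketch: count platform-target assignments with no usable weapon.

    x_platform[h][j] = 1 if platform h engages target j.
    a[h][i][j] is the launch state of weapon i of platform h against target j.
    ASSUMPTION: "no weapon satisfying the shoot conditions" means all states are 0.
    """
    g1 = 0
    for h, row in enumerate(x_platform):
        for j, assigned in enumerate(row):
            if assigned and all(weapon[j] == 0 for weapon in a[h]):
                g1 += 1
    return g1

def cooperation_penalty(y, x_weapon, a):
    """g2 sketch: count sensor-target assignments that guide no weapon.

    y[k][j] = 1 if sensor k tracks target j.
    x_weapon[h][i][j] = 1 if weapon i of platform h is fired at target j.
    A sensor is wasted when x * a <= 0 for every weapon, as in the text.
    """
    g2 = 0
    for row in y:
        for j, tracking in enumerate(row):
            if not tracking:
                continue
            guided = any(x_weapon[h][i][j] * a[h][i][j] > 0
                         for h in range(len(a)) for i in range(len(a[h])))
            if not guided:
                g2 += 1
    return g2

# Two platforms with two weapons each, two targets (illustrative values).
a = [[[1, 0], [-1, 0]], [[0, 0], [0, -1]]]
x_platform = [[1, 0], [1, 0]]                    # platform 2 cannot reach target 1
x_weapon = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
y = [[1, 0], [0, 1]]                             # sensor 2 tracks a close-range engagement
g1 = consistency_penalty(x_platform, a)
g2 = cooperation_penalty(y, x_weapon, a)
```

Both penalties are zero for any solution that respects the two constraints, so weighting them negatively in the objective only affects infeasible or wasteful individuals.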

IV. SOLVING ALGORITHM BASED ON EVOLUTIONARY FRAMEWORK
Based on the weapon platform setting, the defender can use up to $s$ radars to track targets and launch up to $s$ weapons to intercept targets during a salvo. In a single decision of the DS/HWTA model, the scale of the sensor-target decision space is $n^s$, and the scale of the decision space of platform $h$ is $m_h n$, so the scale of the overall decision space is $n^{2s} \prod_{h=1}^{s} m_h$. The dynamic decision-making also involves determining multiple decision times, weapon shoot conditions, and the weapon type of each platform in the continuous-time domain. Exact algorithms struggle to handle these characteristics of the DS/HWTA problem and are difficult to extend as the scenario changes. Swarm intelligence algorithms based on heuristic stochastic optimization do not rely on the mathematical characteristics of the objective and can be parallelized, which makes them suitable for solving complex engineering optimization problems [29], [30]. Considering real-time performance and optimality, swarm intelligence algorithms are the mainstream solvers for the S/WTA model [31]-[34]. The evolutionary algorithm, which is the most widely studied and applied swarm intelligence algorithm, is adopted to solve the DS/HWTA model. The algorithm flow of EA-DS/HWTA is shown in Algorithm 1.
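The growth of the single-decision search space can be checked numerically. The sketch below evaluates the product of the sensor-target space $n^s$ and the per-platform spaces $m_h n$; the function name is hypothetical and the scenario sizes are illustrative only.

```python
import math

def decision_space_size(n, weapons_per_platform):
    """Size of the single-decision search space: n^(2s) * prod(m_h).

    n is the number of targets; weapons_per_platform[h] is m_h, so its
    length is the number of platforms s (and of sensors).
    """
    s = len(weapons_per_platform)
    return n ** (2 * s) * math.prod(weapons_per_platform)

small = decision_space_size(3, [2, 2])          # 3 targets, 2 platforms
larger = decision_space_size(8, [4, 4, 4, 4])   # 8 targets, 4 platforms
```

Even the second, still modest, scenario yields a space of several billion candidate assignments per decision, which is the motivation for pruning with $l_x$ and $l_y$ and for a stochastic solver.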

Algorithm 1 Main Loop of EA-DS/HWTA
1: Population initialization: Randomly generate the initial population $P^t$ of size pop by the de-constrained initialization algorithm. Set t = 1;
2: Fitness evaluation: According to the greedy strategy, generate the complement genes of the individuals, and calculate the fitness;
3: Termination condition: Obtain the optimal solution $P^*$ of the tth generation population. If $P^*$ has not improved in the most recent s generations or t = T, output $P^*$ and terminate the algorithm;
4: Elite population generation: Sort the individuals of population $P^t$ in descending order of fitness, and select the first sep individuals to generate the elite population $EP^t$;
5: Population evolution: Generate a random number p ∈ [0, 1], and perform the following evolutionary operations on $EP^t$ to generate the offspring population $Q^t$ of size pop;
6: Crossover operation: If p < $p_c$, randomly select two solutions $P^t_1$ and $P^t_2$ from $EP^t$ as the parent individuals, and perform the MPBX operator to generate two offspring individuals $O_1$ and $O_2$. Set the number of offspring solutions k = k + 2;
7: Mutation operation: If p > $p_c$, randomly select a solution from $EP^t$ as the parent individual, and perform the mutation operator to generate the offspring individual O. Set the number of offspring solutions k = k + 1;
8: Next generation population: Set $P^{t+1} = EP^t \cup Q^t$ and t = t + 1. Go to Step 2.
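The control flow of Algorithm 1 can be sketched as a generic elitist loop. This is not the EA-DS/HWTA implementation: the initializer, fitness, crossover, and mutation are passed in as callables, a toy bit-counting problem with one-point crossover and bit-flip mutation stands in for the DS/HWTA operators, and a fresh random number is drawn for every offspring, which is one possible reading of Step 5. All names are hypothetical.

```python
import random

def evolve(init_population, fitness, crossover, mutate, pop_size, elite_size,
           p_c, max_generations, stall_limit, rng):
    """Elitist loop in the spirit of Algorithm 1 (elite_size must be >= 2)."""
    population = init_population(pop_size, rng)
    best, best_fit, stall = None, float("-inf"), 0
    for _ in range(max_generations):
        ranked = sorted(population, key=fitness, reverse=True)
        top_fit = fitness(ranked[0])
        if top_fit > best_fit:                    # Step 3: track improvement
            best, best_fit, stall = ranked[0], top_fit, 0
        else:
            stall += 1
        if stall >= stall_limit:
            break
        elite = ranked[:elite_size]               # Step 4: elite population
        offspring = []
        while len(offspring) < pop_size:          # Steps 5-7
            if rng.random() < p_c:
                parent_a, parent_b = rng.sample(elite, 2)
                offspring.extend(crossover(parent_a, parent_b, rng))
            else:
                offspring.append(mutate(rng.choice(elite), rng))
        population = elite + offspring[:pop_size]  # Step 8: P = EP union Q
    return best, best_fit

# Toy stand-ins (NOT the paper's operators): maximize the number of ones.
def init_bits(pop_size, rng, length=20):
    return [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

def one_point(p1, p2, rng):
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def flip_one(parent, rng):
    child = list(parent)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child

best, best_fit = evolve(init_bits, sum, one_point, flip_one, pop_size=20,
                        elite_size=6, p_c=0.7, max_generations=60,
                        stall_limit=10, rng=random.Random(0))
```

Because the elite individuals are carried into the next generation unchanged, the best fitness never decreases from one generation to the next, which is what makes the stall counter a meaningful termination test.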

A. SOLUTION BUILDING
The feasibility and optimality of the initial population are essential to the performance of EA-DS/HWTA. The initial sensor-weapon-target solutions should satisfy the following criteria as efficiently as possible.
1) At the decision moment, only one interceptor weapon of each platform can be used;
2) Each target that is intercepted by medium-range weapons is assigned at least one sensor for tracking guidance;
3) ''Target proximity situation - interception weapon type - sensor cooperation'' is consistent;
4) In the initial solutions, the numbers of sensors and weapons assigned to each target are uniformly distributed.
According to the above criteria, a hierarchical de-constrained algorithm based on ''target proximity state - interception platform condition - sensor cooperation requirement'' is proposed to generate an initial population in the type I solution coding.
In the type I coding of Figure 3, the columns correspond to targets 1 to $n$, and $\{d_1, d_2, \cdots, d_n\}$ denote the target proximity state (position, speed, etc.) detected in the current situation. The chromosome coding of the solution is composed of sensor genes and platform genes. The sensor genes $Y^I_j = \{y^I_{1j}, y^I_{2j}, \ldots, y^I_{l_y j}\}$ represent the sensor set that is assigned to capture and track target $j$:
$$y^I_{kj} = \begin{cases} 0, & \text{no sensor is assigned to target } j \text{ at the } k\text{th gene} \\ l, & \text{sensor } l \text{ is assigned to target } j \text{ at the } k\text{th gene} \end{cases} \tag{24}$$
The platform genes $X^I_j = \{x^I_{1j}, x^I_{2j}, \ldots, x^I_{l_x j}\}$ represent the platform set that is assigned to intercept target $j$:
$$x^I_{kj} = \begin{cases} 0, & \text{no platform is assigned to target } j \text{ at the } k\text{th gene} \\ h, & \text{platform } h \text{ is assigned to target } j \text{ at the } k\text{th gene} \end{cases} \tag{25}$$
The consistency logic of the chromosome coding starts from the platform genes $X^I_j$. If a platform in genes $X^I_j$ has a medium-range weapon meeting the shoot condition for target $j$, a sensor must be used for cooperative guidance, namely $\exists\, y^I_{\cdot j} = k$, $k \in \{1, 2, \cdots, s\}$, in $Y^I_j = \{y^I_{1j}, y^I_{2j}, \ldots, y^I_{l_y j}\}$. If the platforms represented by genes $X^I_j$ have only close-range weapons, all genes of $Y^I_j$ are 0.
From the consistency logic of type I coding, the de-constrained population initialization algorithm proceeds as follows. Obtain the current weapon configuration, platform position $O$, and target state $U$, $V$, and solve the weapon launch state $A$ from the launch condition model. First, the available platform set $W_j$, $j = 1, 2, \ldots, n$, of each target is obtained from the launch state $A$. From the available platform sets $W_j$, the type I platform genes $X^I = \{X^I_1, X^I_2, \cdots, X^I_n\}$ are randomly generated subject to constraints, the second of which requires that each platform intercepts at most one target in each decision.
The sensor requirement variable $B = [b_j]_{1 \times n}$ can be obtained from the platform genes $X^I$, where $b_j = 1$ indicates that the platforms intercepting target $j$ include a medium-range weapon, so a sensor should be assigned for cooperation. $b_j \le 0$ denotes that no platform satisfies the shoot condition for target $j$, or that only close-range weapons of the assigned platforms satisfy the shoot condition, so no sensor is required. According to the sensor requirement vector $B$, the sensor genes are randomly generated subject to constraints, the fourth of which requires that each sensor tracks at most one target in each decision. In summary, the type I solutions generated by the de-constrained population initialization algorithm ensure the consistency of ''target proximity state - interception platform condition - sensor cooperation requirement'' and meet the consistency constraints of the DS/HWTA model. The pseudocode of the de-constrained population initialization algorithm is shown in Algorithm 2.
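The hierarchical logic of the de-constrained initialization (available platforms, then platform genes, then sensor requirement, then sensor genes) can be sketched as follows. This is a simplified illustration and not Algorithm 2: the launch state is aggregated per platform rather than per weapon, spare sensors are added with an arbitrary probability of 0.5, and the function name and data layout are hypothetical.

```python
import random

def init_type1(state, l_x, l_y, rng):
    """Generate one type I individual (platform genes X, sensor genes Y).

    state[h][j] is an ASSUMED per-platform summary of the launch state:
    1 if platform h can fire a medium-range weapon at target j, -1 if only a
    close-range weapon, 0 if it cannot engage. Requires l_x >= 1 and l_y >= 1.
    Indices stored in the genes are 1-based; 0 marks an empty gene.
    """
    s, n = len(state), len(state[0])
    X = [[] for _ in range(n)]
    platforms = list(range(s))
    rng.shuffle(platforms)
    for h in platforms:                  # each platform engages at most one target
        feasible = [j for j in range(n) if state[h][j] != 0 and len(X[j]) < l_x]
        if feasible:
            X[rng.choice(feasible)].append(h + 1)
    # b_j = 1 when an assigned platform uses a medium-range weapon on target j
    needs_sensor = [j for j in range(n)
                    if any(state[h - 1][j] == 1 for h in X[j])]
    Y = [[] for _ in range(n)]
    sensors = list(range(1, s + 1))
    rng.shuffle(sensors)
    for j in needs_sensor:               # at least one sensor per guided target
        Y[j].append(sensors.pop())
    for k in sensors:                    # spare sensors, each used at most once
        open_targets = [j for j in needs_sensor if len(Y[j]) < l_y]
        if open_targets and rng.random() < 0.5:
            Y[rng.choice(open_targets)].append(k)

    def pad(genes, length):
        return genes + [0] * (length - len(genes))

    return [pad(x, l_x) for x in X], [pad(y, l_y) for y in Y]

state = [[1, 0], [-1, -1], [0, 1]]       # 3 platforms, 2 targets (illustrative)
X, Y = init_type1(state, l_x=2, l_y=1, rng=random.Random(1))
```

Because every platform engages at most one target, the number of targets that need a sensor never exceeds the number of platforms, so the first sensor pass cannot run out of sensors.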

B. MODIFIED EVOLUTIONARY OPERATION
Although the type I solutions have met the consistency constraints, the following infeasible genes are easy to generate when the evolutionary operations (crossover, mutation) are directly performed on the type I solutions.
• A single platform intercepts more than one target at the same decision;
• A single sensor captures more than one target at the same time;
• Platform genes conflict with the target proximity state, that is, the weapon in the platform assigned to a target does not satisfy the launch condition;
• Sensor genes conflict with platform genes: a sensor captures a target while the platforms assigned to this target have only close-range weapons remaining.

Algorithm 2 De-Constrained Population Initialization Algorithm
...
5: while min{c} < $l_x$ do
6: if $W_j$ > 0 and $c_j$ < $l_x$ then
7: $c_j = c_j + 1$
8: ...
29: if j > n then
30: j = 1
31: end if
32: end while
33: end for

After the initial population is generated, the type I solutions based on the target index are transformed into a type II coding based on the platform index. The modified evolutionary operators are then designed to control the infeasible genes in offspring individuals. The type II chromosome coding of solutions is shown in Figure 4.
In Figure 4, the columns correspond to platforms 1 to s, respectively. The capture genes $P_y = \{y^{II}_1, y^{II}_2, \cdots, y^{II}_s\}$ represent the intended capture target of each sensor: $y^{II}_k = j$ represents that sensor k is employed to capture target j, and $y^{II}_k = 0$ represents that sensor k is not used. The interception genes $P_x = \{x^{II}_1, x^{II}_2, \cdots, x^{II}_s\}$ represent the target to be intercepted by each platform: $x^{II}_h = j$ represents that platform h is to intercept target j, and $x^{II}_h = 0$ represents that platform h is not employed. The transcoding algorithm of type I to type II is shown in Algorithm 3.

Algorithm 3 Transcoding Algorithm of Type I to Type II
1: ... , s
2: for j = 1 to n do
3: for i = 1 to $l_y$ do
4: $y^{II}_{y^I_{ij}} = j$
5: end for
6: for i = 1 to $l_x$ do
7: $x^{II}_{x^I_{ij}} = j$
8: end for
9: end for

To avoid the infeasible region of the decision space in the optimization process, a modified position based crossover (MPBX) is presented based on the shared feasibility. The MPBX makes the offspring individuals inherit the feasibility from the parent individuals, and has the advantage of diversity.
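As a concrete illustration of the transcoding from the target-indexed type I coding to the platform-indexed type II coding, a minimal Python sketch is given below. The function name `type1_to_type2` is hypothetical, and the explicit skipping of empty genes (value 0) is an assumption added so that the example runs on sparse individuals.

```python
def type1_to_type2(X1, Y1, s):
    """Sketch of the type I -> type II transcoding (hypothetical helper).

    X1[j-1], Y1[j-1] : platform / sensor genes of target j (1-based ids, 0 = empty)
    s                : number of platforms, each carrying one sensor
    Returns (x2, y2): x2[h-1] is the target intercepted by platform h,
                      y2[k-1] is the target captured by sensor k, 0 = unused.
    """
    x2 = [0] * s
    y2 = [0] * s
    for j, (x_genes, y_genes) in enumerate(zip(X1, Y1), start=1):
        for k in y_genes:
            if k:                     # assumption: empty genes are skipped
                y2[k - 1] = j
        for h in x_genes:
            if h:
                x2[h - 1] = j
    return x2, y2

if __name__ == "__main__":
    # Target 1 is intercepted by platforms 1 and 3 and tracked by sensor 2;
    # target 2 is intercepted by platform 2 without sensor support.
    print(type1_to_type2([[1, 3], [2, 0]], [[2], [0]], s=4))
```

Since every platform and every sensor owns exactly one position in the type II vectors, the "at most one target per platform" and "at most one target per sensor" constraints hold automatically after transcoding.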
The design idea of the MPBX operator is as follows. Compared with the distance from the incoming targets to the defensive position, the platforms in the defensive position are usually close to each other, so the feasibility of weapon types tends to be inherited among them. Therefore, the crossover strategies of capture genes and interception genes in type II encoding are as follows:
1) To limit the infeasible genes, the intended capture targets of offspring individuals should be reorganized based on the capture targets of parent individuals;
2) In the interception genes of type II coding, the platform gene positions can be divided into three independent subsets by the remaining weapon status: (1) the platform set S with only close-range weapons; (2) the platform set L with only medium-range weapons; (3) the platform set E with both types of weapons. Theoretically, if the S genes of offspring individuals are generated by the crossover of the S ∪ E genes of parent individuals, they carry more feasible and optimal information than those produced by random crossover. The same holds for the L genes.

Algorithm 4 Algorithm of MPBX Operator
Based on the above strategy, the MPBX operator execution process is shown in Algorithm 4, and the example figure is shown in Figure 5.
Figure 5 takes the solutions of nine platforms and six targets as an example. Analyzing this example, the MPBX operator has the following advantages:
1) The MPBX operator makes the sensors only cross the medium-range targets identified by the parent individuals, and the close/medium-range platforms mainly cross the close/medium-range targets identified by the parent individuals. This mechanism makes the offspring individuals inherit the consistency information of type I coding solutions, and can effectively control the constraint violation of population individuals;
2) Offspring individuals have different differentiation directions, which is conducive to population diversity and improves search efficiency;
3) Combined with the penalty function items in Equation (22), there is no need to perform detection and repair operations, which significantly reduces the algorithm complexity and improves real-time performance.
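The idea of restricting crossover to platforms with compatible weapon types can be illustrated with a short Python sketch. It is a loose interpretation of the strategy described above and should not be read as the authors' Algorithm 4: the function name `group_restricted_crossover`, the dictionary layout of an individual, and the rule of drawing replacement genes at random from the second parent's pool are all assumptions. Only the crossover counts (half of the capture genes, half of each group) follow the parameter setting given in Section V-A.

```python
import random

def group_restricted_crossover(pa, pb, S, L, E, rng):
    """Sketch of a weapon-type-aware crossover on type II individuals.

    pa, pb  : parents, dicts with 'x' (interception genes) and 'y' (capture genes)
    S, L, E : 0-based platform positions with only close-range, only
              medium-range, and both types of weapons, respectively
    The child starts as a copy of pa; genes in E are inherited unchanged.
    """
    child = {"x": list(pa["x"]), "y": list(pa["y"])}
    for group in (S, L):
        # Targets that pb assigns to platforms of a compatible weapon type.
        pool = [pb["x"][p] for p in list(group) + list(E)]
        for p in rng.sample(list(group), len(group) // 2):
            child["x"][p] = rng.choice(pool)
    # Capture genes are recombined only with targets that pb's sensors capture.
    pool_y = [t for t in pb["y"] if t != 0]
    if pool_y:
        for p in rng.sample(range(len(child["y"])), len(child["y"]) // 2):
            child["y"][p] = rng.choice(pool_y)
    return child

if __name__ == "__main__":
    rng = random.Random(3)
    pa = {"x": [1, 2, 3, 4, 5, 6], "y": [1, 0, 2, 0, 3, 0]}
    pb = {"x": [6, 5, 4, 3, 2, 1], "y": [0, 4, 0, 5, 0, 6]}
    print(group_restricted_crossover(pa, pb, S=[0, 1], L=[2, 3], E=[4, 5], rng=rng))
```

In this sketch a gene at an S position can only receive a target that the other parent already assigns to an S or E platform, which is one simple way to keep offspring near the feasible region without a repair step.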
The mutation operator has less influence than the crossover operator in the evolutionary process, so the single-point mutation operator is adopted: one capture gene and one interception gene are randomly selected, and a new gene code is generated for each of them from the set of its other executable targets.
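A minimal Python sketch of such a single-point mutation on a type II individual is given below. The function name `single_point_mutation` and the inputs `feasible_x` and `feasible_y`, which list the executable targets of each platform and each sensor, are assumptions introduced for the example; how these sets are obtained from the launch condition model is not shown.

```python
import random

def single_point_mutation(x2, y2, feasible_x, feasible_y, rng):
    """Sketch of single-point mutation on type II genes (hypothetical helper).

    x2, y2        : interception and capture genes (0 = unused)
    feasible_x[h] : targets that platform h can currently engage
    feasible_y[k] : targets that sensor k can currently capture
    One randomly chosen gene of each kind is replaced by a different
    executable target; a gene without alternatives is left unchanged.
    """
    x2, y2 = list(x2), list(y2)
    for genes, feasible in ((x2, feasible_x), (y2, feasible_y)):
        p = rng.randrange(len(genes))
        options = [t for t in feasible[p] if t != genes[p]]
        if options:
            genes[p] = rng.choice(options)
    return x2, y2

if __name__ == "__main__":
    rng = random.Random(7)
    feasible = [[1, 2], [2, 3], [1, 3]]
    print(single_point_mutation([1, 2, 3], [1, 0, 0], feasible, feasible, rng))
```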

C. ENVIRONMENT SELECTION BASED ON GREEDY FITNESS
The purpose of the elite selection algorithm is to give individuals with better fitness a higher probability of being retained in the next generation population, which is an essential step in the iterative optimization of swarm intelligence algorithms. In evolutionary algorithms, the common selection methods include roulette wheel, tournament, and elitist selection. Here the binary tournament selection method is used, that is, two individuals are randomly selected for fitness evaluation, and the better one is moved to the elite population.
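The binary tournament described above can be sketched as follows. The sketch assumes that a larger fitness value is better and that the winner is removed from the population once it has been moved to the elite population; the function name `binary_tournament` and the callable `fitness` are illustrative and not taken from the paper.

```python
import random

def binary_tournament(population, fitness, ep_size, rng):
    """Sketch of binary tournament selection into an elite population.

    population : list of individuals
    fitness    : callable returning the fitness of one individual
                 (assumption: larger is better)
    ep_size    : size of the elite population, e.g. half the population
    """
    pool = list(population)
    elite = []
    while len(elite) < ep_size and len(pool) >= 2:
        i, j = rng.sample(range(len(pool)), 2)      # two distinct competitors
        winner = i if fitness(pool[i]) >= fitness(pool[j]) else j
        elite.append(pool.pop(winner))              # move the winner out of the pool
    return elite

if __name__ == "__main__":
    rng = random.Random(1)
    individuals = list(range(10))                   # toy individuals, fitness = value
    print(binary_tournament(individuals, lambda v: v, ep_size=5, rng=rng))
```

Because the worst individual can never win a tournament, it is never moved to the elite population, while better individuals are retained with higher probability rather than with certainty.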
In EA-DS/HWTA, the selection algorithm for the elite population is shown as Algorithm 5.

Algorithm 5 Selection Algorithm for Elite Population
...
4: Generate the weapon complement codes $X^c_i$ and $X^c_j$ of the individuals $P_i$ and $P_j$, respectively, by a greedy strategy;
5: Transform the completed codes ...
6: Calculate the objective fitness $J(X^{III}_i, Y^{III}_i)$ and $J(X^{III}_j, Y^{III}_j)$ of solutions $P_i$ and $P_j$;
7: ...
8: Move the solution $P_i$ from population P to elite population EP;
9: else
10: Move the solution $P_j$ from population P to elite population EP;
11: end if
12: ...

In Algorithm 5, lines 4∼6 denote the fitness calculation method for a type II coding solution. As described in Section IV-B, the population evolution of EA-DS/HWTA operates on type II coding. Type II coding inherits the feasibility of type I coding and is conducive to population diversity. However, type II coding does not determine the specific weapon used by each platform, and so cannot be used directly for solution fitness evaluation. A complement coding method based on the greedy strategy is therefore presented to obtain the optimal fitness of a type II coding solution. The calculation steps of the greedy fitness are:
1) Based on the type II coding solution, generate the weapon complement code for each platform by maximizing the damage return;
2) Convert the completed type II codes into the 0-1 sparse decision matrices $X^{III}$ and $Y^{III}$, which are the decision variables X and Y in Equation (22);
3) Calculate the solution fitness by Equation (22).

The specific calculation method of individual fitness is as follows. First, the weapon complement codes $X^c$ of the type II coding are defined. The greedy strategy of complement coding is the local optimum of weapon efficiency under the type II coding solution. Therefore, the weapon complement codes are generated by Equation (32) for $h = 1, 2, \ldots, s$, where the first item denotes that when platform h has at least one weapon satisfying the shoot condition for target $x^{II}_h$, namely $\exists\, a^{x^{II}_h}_{hi} = 0$, the weapon with the highest interception probability is used; the second item indicates that when no weapon in platform h meets the shoot condition for target $x^{II}_h$, no weapon is allocated.
To calculate the solution fitness directly, $\{X^{II}, X^c\}$ is converted into type III coding, that is, the 0-1 sparse decision matrices $\{X^{III}, Y^{III}\}$, which are equivalent to the decision variables $\{X, Y\}$ in Equation (22) and can be used to evaluate individuals directly.
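The greedy complement coding and the conversion of the interception genes to a sparse 0-1 decision array can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' Algorithm 6: the function name `greedy_complement`, the boolean array `feasible` (standing in for the launch state), and the array `p_kill` (standing in for the interception probabilities) are hypothetical, and remaining-weapon availability is assumed to be folded into `feasible`.

```python
def greedy_complement(x2, feasible, p_kill):
    """Sketch of greedy weapon complement coding for type II genes.

    x2[h]             : target (1-based) assigned to platform h, 0 = idle
    feasible[h][i][j] : True if weapon i of platform h can launch at target j
    p_kill[h][i][j]   : interception probability of that weapon-target pair
    Returns (xc, X3): xc[h] is the chosen weapon (1-based, 0 = none) and
    X3[h][i][j] is the 0-1 sparse decision array.
    """
    xc = [0] * len(x2)
    X3 = [[[0] * len(row) for row in platform] for platform in p_kill]
    for h, target in enumerate(x2):
        if target == 0:
            continue
        j = target - 1
        candidates = [i for i in range(len(p_kill[h])) if feasible[h][i][j]]
        if candidates:
            # Greedy choice: the feasible weapon with the highest probability.
            best = max(candidates, key=lambda i: p_kill[h][i][j])
            xc[h] = best + 1
            X3[h][best][j] = 1
    return xc, X3

if __name__ == "__main__":
    p_kill = [[[0.9, 0.1], [0.5, 0.8]], [[0.3, 0.3], [0.6, 0.7]]]
    feasible = [[[True, True], [True, True]], [[False, False], [False, False]]]
    print(greedy_complement([2, 1], feasible, p_kill))
```

In the usage example, platform 1 is assigned target 2 and receives its second weapon, while platform 2 has no weapon that can launch at its target and is therefore left without an allocation.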
In summary, the algorithm of complement codes and fitness calculation is shown in Algorithm 6, and the flow diagram of EA-DS/HWTA is shown in Figure 6.

V. EXPERIMENTAL STUDIES
Since the DS/HWTA model is proposed here for the first time, there is no existing algorithm against which EA-DS/HWTA can be compared directly. This section therefore verifies the effectiveness of the proposed approaches by simulation analysis. All experiments are performed in Matlab 2019b on a computer with a 2.5GHz i5 CPU and 16GB of memory.

Algorithm 6 Algorithm of Complement Codes and Fitness Calculation
...
10: $i = x^c_h$
11: $x^{III}_{hij} = 1$
12: end if
13: end for
14: Calculate the fitness $J(P, Q, A, X^{III}, Y^{III}, t)$ of solution $\{X, Y\}$ by Equation (22).

A. OPERATIONAL SCENARIO AND PARAMETER SETTING
According to the ground-to-air anti-penetration scenario, the following initial situation is randomly generated. In the three-dimensional coordinate system, the Y-axis represents the altitude coordinates, and the X-Z plane represents the horizontal plane coordinates.
(1) Parameter setting in weapon platform. The system response time $T_s$ = 0.15s; the situation assessment time is 0.2s. The maximum detection range $D_{max}$ of the radar is set to 100km, and the descent coefficient $k_d$ = 1.23. The weights of interception probability are $\beta_S$ = 0.5 and $\beta_{T_{go}}$ = $\beta_q$ = 0.25. The performance parameters of the close-range and medium-range weapons are set as in our previous work [35] and [36], respectively. The performance parameters of the close-range missile are set as: $m_m(0)$ = 100kg, $P_m(0)$ = 15.6kN, $t_p$ = 5.2s, $\tau_m$ = 0.2s, $N_m$ = 4, $n_{m\,max}$ = 50, $t_{max}$ = 30s, $v_{m\,min}$ = 400m/s, $r_d$ = 12m, and t = 0.02s. The performance parameters of the medium-range missile ... In each salvo, the state transition of a target follows the Bernoulli distribution with the interception probability [37], which can be calculated by Equation (19). The performance parameters of the fighter target are also set as in [35]: ... target are initialized as ...

(3) Parameter setting in EA-DS/HWTA. The population size pop = 200; the maximum generation maxgen = 100; the crossover probability $p_c$ = 0.8; the mutation probability $p_m$ = 0.2; the elite population size ep = 0.5pop; the constraint weight of weapon-target consistency $\alpha_1$ = −1; the constraint weight of sensor-weapon cooperation $\alpha_2$ = −1; the crossover number of capture genes $k_y$ = s/2; the crossover number of interception genes $k_s$ = |S|/2, where |S| is the number of platforms with only close-range weapons; the delayed time parameter $t_m$ = 30s.

Based on the above initial setting, the following two cases are randomly generated to study the DS/HWTA problem with different sizes and situations.
Case 1: Let the number of platforms be m = 4, each platform is configured with four guided weapons, including two medium-range weapons and two close-range weapons. Randomly generate the initial platform positions with an altitude of 0km, X-axis of 0∼10km, and Z-axis of 30∼50km. The distance between each platform is not less than 4km. The number of penetration targets is n = 8. Randomly generate the initial target positions with an altitude of 8∼10km, X-axis of 70∼80km, and Z-axis of 20∼60km. The distance between each target is not less than 4km. The initial operational scenario is shown as Figure 7(a).
Case 2: Let the number of platforms be m = 10, each platform is configured with four guided weapons, including two medium-range weapons and two close-range weapons. Randomly generate the initial platform positions with an altitude of 0km, X-axis of 0∼20km, and Z-axis of 20∼80km. The distance between each platform is not less than 8km. The number of penetration targets is n = 20. Randomly generate the initial target positions with an altitude of 8∼12km, X-axis of 60∼80km, and Z-axis of 10∼90km. The distance between each target is not less than 8km. The initial operational scenario is shown as Figure 7(b).
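For readers who wish to generate comparable scenarios, the random placement with a minimum separation can be sketched by simple rejection sampling. The sketch below is not the authors' scenario generator: the function name `sample_positions`, the use of uniform sampling within each coordinate range, and the use of the three-dimensional Euclidean distance for the separation check are assumptions. Coordinates are given as (X, Y, Z) in km, with Y as altitude.

```python
import math
import random

def sample_positions(count, x_range, y_range, z_range, min_dist, rng, max_tries=100000):
    """Rejection-sample `count` points with pairwise distance >= min_dist (km)."""
    points = []
    tries = 0
    while len(points) < count:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place all points; relax min_dist")
        p = (rng.uniform(*x_range), rng.uniform(*y_range), rng.uniform(*z_range))
        if all(math.dist(p, q) >= min_dist for q in points):
            points.append(p)
    return points

if __name__ == "__main__":
    rng = random.Random(2020)
    # Case 1: four platforms on the ground, eight targets at 8 to 10 km altitude.
    platforms = sample_positions(4, (0, 10), (0, 0), (30, 50), 4.0, rng)
    targets = sample_positions(8, (70, 80), (8, 10), (20, 60), 4.0, rng)
    print(platforms)
    print(targets)
```

Case 2 is obtained in the same way by changing the counts, coordinate ranges, and minimum separation to the values listed above.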

B. EXPERIMENTS ON EA-DS/HWTA
The metric for evaluating EA-DS/HWTA is whether the defender can complete the interception task within a certain time window. Hence EA-DS/HWTA is performed on case 1 over 30 independent runs, with the targets adopting randomly maneuvering penetration routes. The statistics of the obtained results are shown in Table 2, and the distribution of each indicator is shown in Figure 8.
According to Table 2 and Figure 8(a), the decision schemes solved by EA-DS/HWTA enable the defender to complete the interception task successfully in all 30 independent runs. As shown in Figure 8(b), the number of decision salvoes ranges from 2 to 6, and in 50% of the runs it does not exceed 3. The distribution of weapon consumption over the salvoes is shown in Figure 8(c), with a mean and standard deviation of 10.93±2.18. Figure 8(d) gives the distribution of the number of intercepted targets in each salvo. It can be seen that if the interception rate increases in the first or second salvo, the number of decision stages tends to decrease. The distribution of salvo time is shown in Figure 8(e). Since the natural time is set to 0s when the simulation starts after the situation data is loaded, the first salvo time is the same in all 30 simulations, namely 1.5s, which is the system response (situation assessment, algorithm decision) time. Each subsequent salvo time is determined by the ballistic flight time of the previous stage, and the second salvo time of each simulation is close to 95s. The computation time of EA-DS/HWTA is stable in each decision, with a mean and standard deviation of 0.9309±0.0143s, as shown in Figure 8(f).
To illustrate the effectiveness of EA-DS/HWTA intuitively, the critical decision stages of simulation 26 are selected to visualize the interception process. The red curves represent the weapon trajectories; the blue curves represent the target flight paths; the green lines represent the timing of the sensors capturing the targets. The black markers represent a target being killed, a weapon missing its target, or a platform being exhausted, as shown in Figure 9.
In Figure 9(a), the situation data are loaded and the natural time is set to t = 0. After the system response time of 1.5s, platforms I∼IV employ their sensors to track targets {3,7,5,6}, respectively, and guide the medium-range weapons {2,1,2,2} to intercept ... Figure 9(c). At t = 190.45s, it is assessed that at least two different weapons satisfy the attack condition for targets 2 and 8. Platform II then selects weapon 3 to intercept target 2 and fails. Platforms III and IV launch close-range weapons {4,3} to intercept target 8 simultaneously. Platform IV damages target 8 first, and platform III loses its target, as shown in Figure 9(d). After assessment and decision, the fourth salvo scheme is that platforms II and IV use close-range weapons {4,4} to intercept target 2 at t = 227.48s, which exhausts the weapons of platforms II and IV. Platform II damages target 2 first, and platform IV loses its target. The interception mission is completed, as shown in Figure 9(e).
Figure 10 gives the dynamic performance of the mean and mean square deviation of operational efficiency and constraint violation during each decision stage when EA-DS/HWTA solves case 1 over 30 independent runs. It can be seen that as the defense phase advances and fewer targets remain to be intercepted, the mean square deviation and the converged value of the damage efficiency in the population decrease, and the constraint violation converges faster than in the previous decisions.

1) EXPERIMENTS ON CASE 2
Similarly, EA-DS/HWTA is executed on case 2 over 30 independent runs, with the targets adopting randomly maneuvering penetration routes. The statistics of the obtained results are shown in Table 3, and the distribution of each indicator is shown in Figure 11. From Table 3 and Figure 11, EA-DS/HWTA successfully intercepts the target group in all 30 independent runs on case 2. The computation time of case 2 is 0.21∼0.24s higher than that of case 1. It is worth noting that the weapon consumption in case 1 is 10.93±2.18 for intercepting eight targets, and the weapon consumption in case 2 is 29.17±3.31 for intercepting 20 targets, so the weapon efficiency-cost ratio of case 1 is higher than that of case 2. However, the number of interception rounds in case 1 is higher than that in case 2: the numbers of interception rounds in cases 1 and 2 are 3.63±1.03 and 3.47±0.63, respectively, and the maximum numbers of interception rounds are 6 and 5, respectively. It is deduced that the average interception rate in each stage of case 1 is lower than that of case 2, and Table 4 gives the weapon consumption and the number of successfully intercepted targets at each interception stage of cases 1 and 2. Analysis of the results shows that case 2 is more complicated than case 1 and has no firepower vacuum zone during the whole interception process. Therefore, in each interception round, the defender can intercept all surviving targets with maximum efficiency. In case 1, the situation enters the firepower vacuum after the second interception round. According to the firepower transfer model, the third interception round cannot be carried out both as soon as possible and with maximum efficiency. Hence case 1 has a lower damage efficiency than case 2 in decision 3, and a fourth or fifth interception round is required, as shown in Table 4.
Figure 12 gives the dynamic performance of the mean and mean square deviation of operational efficiency and constraint violation during each decision stage when EA-DS/HWTA solves case 2 over 30 independent runs. In the first two decision stages, the defender employs the medium-range weapons, most of which meet the launch condition, and the damage efficiency is determined by the cooperation of weapons and sensors. In decisions 4 and 5, the platforms launch the close-range weapons. Therefore, decisions 1 and 2 have a larger mean square deviation of damage efficiency and constraint violation than decisions 4 and 5.
Based on the results of cases 1 and 2, EA-DS/HWTA can effectively support the defender in completing the dynamic autonomous firepower decision. In the early stages of interception, more weapons meet the launch conditions for the targets, and EA-DS/HWTA can assign weapons to intercept as many different targets as possible. In the later stages, with fewer but more threatening penetration targets, EA-DS/HWTA can dynamically coordinate the medium-range and close-range weapons as quickly as possible according to the firepower transfer model. In addition, the numbers of sensors and weapons assigned to a single target do not exceed the upper limits $l_y$ and $l_x$, respectively, which maintains the weapon efficiency-cost ratio.

C. EXPERIMENTS ON DE-CONSTRAINED INITIALIZATION
To verify the proposed de-constrained initialization algorithm, the following population initialization is adopted in the comparison algorithm EA-DS/HWTA-r1: in the individuals encoded by type II chromosomes, each gene position randomly generates the gene code of a surviving target. EA-DS/HWTA and EA-DS/HWTA-r1 are performed on case 2 over 30 independent runs. The statistics of the optimal solutions obtained in each decision are shown in Table 5, and the distribution of damage efficiency and constraint violation is shown in Figure 13. According to Table 5, both EA-DS/HWTA and EA-DS/HWTA-r1 maintain a mission completion rate and target interception rate of 100% in 30 independent simulations. In addition, comparing the weapon consumption and the number of decision salvoes, EA-DS/HWTA has a higher utilization rate of weapon resources and can intercept the targets in fewer salvoes. Considering Table 5 and Figure 13, although both EA-DS/HWTA and EA-DS/HWTA-r1 obtain feasible optimal solutions with a constraint violation of 0, the former is superior to the latter in the damage efficiency of the obtained operational scheme. In terms of real-time performance, since its initialization does not need to compute the weapon-target attack condition and sensor-weapon cooperation, EA-DS/HWTA-r1 is slightly better than EA-DS/HWTA.
To analyze the influence of the de-constrained population initialization algorithm on the evolutionary framework, Figure 14(a)∼Figure 14(c) give the dynamic performance of the population fitness metrics of EA-DS/HWTA and EA-DS/HWTA-r1 in the evolutionary process. In Figure 14(a), since the proportion of infeasible solutions in the initial population of EA-DS/HWTA-r1 is higher, its mean damage efficiency is lower than that of EA-DS/HWTA throughout the evolutionary process, and its mean square deviation is larger. Analyzing Figure 14(b)∼Figure 14(c), the de-constrained population initialization algorithm makes the constraint violation of the weapon-target attack condition and sensor-weapon cooperation equal to 0 in the initial population. The infeasible solutions generated in the second generation population amount to less than 10%, and the constraint violation then decreases to 0 within 40 generations.
In decisions 1 and 2, the close-range weapons do not satisfy the launch condition, and the interception method is sensors guiding medium-range weapons. At this time, each platform has its complete set of weapons and meets the attack condition easily. Hence the random initialization algorithm has less influence on the weapon-target attack constraint, but cannot effectively reduce the constraint violation of sensor-weapon cooperation. The constraint violation in decisions 1 and 2 mainly comes from the sensor-weapon cooperation consistency, as shown in Figure 14(b) and Figure 14(c). In decision 3, both types of weapons are involved in the interception. Compared with decisions 1 and 2, the constraint violation of both algorithms increases, and EA-DS/HWTA converges faster than EA-DS/HWTA-r1, as shown in Figure 14(b) and Figure 14(c). In decisions 4 and 5, most platforms have only close-range weapons remaining, and the launch conditions are strict. Hence the constraint violation of the weapon-target attack condition increases compared with decisions 1 to 3, and the constraint violation of sensor-weapon cooperation remains 0. Owing to the shrinkage of the feasible regions in decisions 4 and 5, the constraint violation of EA-DS/HWTA-r1 converges more slowly than in the previous decision stages. The constraint violation of decision 4 converges to 0 within 70 to 80 generations, and that of decision 5 converges to 0 within 80 to 90 generations, as shown in Figure 14(b) and Figure 14(c).
In conclusion, the de-constrained population initialization algorithm improves the proportion of feasible solutions, suppresses the constraint violation, and enhances the convergence performance. As the interception phase advances, improving the initial population quality contributes more to the search efficiency.

D. EXPERIMENTS ON GREEDY FITNESS
In order to verify the greedy fitness strategy, the following solution building is introduced in the comparison algorithm EA-DS/HWTA-r2: after the type II chromosome code is determined, the complement code is randomly generated from the available weapons in each platform, and the other solution operations are the same as in EA-DS/HWTA. EA-DS/HWTA and EA-DS/HWTA-r2 are performed on case 2 over 30 independent runs. The statistics of the optimal solutions obtained in each decision are shown in Table 6, and the distribution of damage efficiency and constraint violation is shown in Figure 15. Based on Table 6 and Figure 15, at each decision of the 30 independent runs, both EA-DS/HWTA and EA-DS/HWTA-r2 give the optimal solution with a constraint penalty of 0. However, the former always completes the interception mission, whereas the latter has a completion rate of 96.67%, that is, the interception mission fails in one simulation. EA-DS/HWTA-r2 has a lower mean value and a higher mean square deviation of damage efficiency than EA-DS/HWTA in each decision stage. Therefore, the weapon efficiency-cost ratio of EA-DS/HWTA-r2 is lower than that of EA-DS/HWTA, and its number of interception rounds and weapon consumption are higher. The two algorithms are close in computation time, because the greedy fitness strategy and exact fitness strategy have no significant influence on the computational complexity of the algorithm. Figure 16 visualizes the interception failure process of simulation 13 under EA-DS/HWTA-r2. It can be seen that the number of salvoes is five, and the number of successfully penetrating targets is one. Target 6 adopts the strategy of following other targets and penetrating along the boundary, which reduces the number of weapons that meet the launch condition for it and thus improves its penetration probability.
To analyze the influence of the greedy fitness strategy on the evolutionary algorithm specifically, Figure 17(a)∼Figure 17(c) show the dynamic performance of the population fitness metrics of EA-DS/HWTA and EA-DS/HWTA-r2 in the evolutionary process. In Figure 17(a), the mean damage efficiency of EA-DS/HWTA-r2 is lower than that of EA-DS/HWTA, its mean square deviation is larger, and its convergence speed is slower. The reason is that EA-DS/HWTA adopts the greedy strategy to maximize damage efficiency based on type II coding. While ... is 0 in the initial population. The infeasible solutions are generated in the second generation population, and the constraint violation value gradually converges to 0 in the subsequent evolutionary generations. EA-DS/HWTA-r2 is close to EA-DS/HWTA in the mean values of the attack constraint violation $C_1$ and the cooperation constraint violation $C_2$, but its mean square deviation converges markedly worse than that of EA-DS/HWTA. The reason is that the initial population is feasible, with 0 constraint violation, owing to the de-constrained initialization algorithm. The individuals of EA-DS/HWTA are in type II coding, and only the platform gene codes are operated on in crossover and mutation. The offspring individuals can select the feasible weapons with the maximum damage efficiency by the greedy fitness strategy, and the generated weapon complement codes do not violate constraints. However, EA-DS/HWTA-r2 exacts the weapon gene codes accompanied by platform gene codes. Therefore, EA-DS/HWTA-r2 is more likely to violate the attack constraint and the cooperation constraint during the evolutionary generations. In decisions 1 and 2, the defender employs medium-range weapons, and EA-DS/HWTA-r2 is slightly less adaptive to the attack condition constraint than EA-DS/HWTA. In decision 3, the defender uses heterogeneous weapons. Compared with the other decision stages, EA-DS/HWTA-r2 is more likely to produce infeasible solutions than EA-DS/HWTA. In decisions 4 and 5, the targets have penetrated into close range and no longer need sensors to illuminate them and guide the weapons. Therefore, the sensor-weapon cooperation constraint violation of both EA-DS/HWTA and EA-DS/HWTA-r2 is 0, as shown in Figure 17(b) and Figure 17(c). According to the evolutionary mechanism, increasing the population size can bring the optimal fitness of EA-DS/HWTA-r2 close to that of EA-DS/HWTA. In conclusion, the main contribution of the greedy fitness strategy is to enhance the convergence performance effectively.

E. EXPERIMENTS ON MPBX
To verify the MPBX operator, the classic exchange crossover (EX) operator is adopted for comparison. EA-DS/HWTA and the comparison algorithm EA-DS/HWTA-r3 are performed on case 2 over 30 independent runs. The statistics of the optimal solutions obtained in each decision are shown in Table 7, and the distribution of damage efficiency and constraint violation is shown in Figure 18. As shown in Table 7, EA-DS/HWTA-r3 fails the interception mission in eight of the 30 independent simulations. The number of successfully penetrating targets is eight, and the target penetration rate is 1.33%. EA-DS/HWTA-r3 has higher weapon consumption and more decision stages than EA-DS/HWTA, indicating that the weapon efficiency-cost ratio of the interception schemes under the MPBX operator is higher than under the EX operator. Because of its lower complexity, the EX operator is better than the MPBX operator in real-time performance. In Figure 18, both EA-DS/HWTA and EA-DS/HWTA-r3 obtain feasible solutions with a constraint violation value of 0, and the damage efficiency of the solutions under the MPBX operator is higher than under the EX operator.
To analyze the influence of the MPBX operator on the evolutionary framework, Figure 19(a)∼Figure 19(c) give the dynamic performance of the population fitness metrics of EA-DS/HWTA and EA-DS/HWTA-r3 in the evolutionary process. In Figure 19(a), EA-DS/HWTA has a higher mean value of damage efficiency and a faster convergence of the mean square deviation than EA-DS/HWTA-r3. As shown in Figure 19(b), EA-DS/HWTA has a lower mean and mean square deviation of constraint violation than EA-DS/HWTA-r3 in weapon-target consistency. In decisions 1 to 3 of Figure 19(b), the differences in mean and mean square deviation between EA-DS/HWTA and EA-DS/HWTA-r3 increase from 0.01 to 0.05, and they decrease to 0.01 in decisions 4 and 5. As illustrated in Figure 19(c), EA-DS/HWTA is lower than EA-DS/HWTA-r3 in the mean and mean square deviation of the sensor-weapon cooperation constraint violation. In decisions 1 to 3 of Figure 19(c), the difference between the two algorithms' means increases from 0.4 to 0.8, and the difference between their mean square deviations increases from 0.2 to 0.7. In decisions 4 and 5 of Figure 19(c), since the decisions only involve close-range weapons and do not require sensor cooperation, the sensor-weapon cooperation constraint violation of both EA-DS/HWTA and EA-DS/HWTA-r3 is 0.
Analyzing Figure 19(a)∼Figure 19(c), it can be seen that the MPBX operator makes the evolutionary operation more likely to be performed between platforms with similar weapon types, thus effectively reducing the penalty values of the attack constraint $C_1$ and the cooperation constraint $C_2$ in population evolution, and the effect on the cooperation constraint $C_2$ is more obvious than that on the attack constraint $C_1$. Therefore, the proportion of feasible or near-feasible solutions in the EA-DS/HWTA population is greater than that in EA-DS/HWTA-r3. The MPBX operator improves the population fitness and convergence speed, so EA-DS/HWTA has higher search efficiency. As the operational situation evolves, the type of weapon used in decisions 1 to 5 changes from medium-range to close-range. In decision 3, the cooperation of sensors, medium-range weapons, and close-range weapons is more complicated than in the other decision stages, and the optimization effect of the MPBX operator on population fitness is greater than in the other decision stages.

VI. CONCLUSION AND FUTURE WORK
At present, most research on sensor/weapon-target assignment is static and weakly constrained. This paper proposes, for the first time, a dynamic sensor/heterogeneous weapon-target assignment problem by refining the critical factors of typical ground-to-air anti-penetration scenarios. Aiming at the characteristics of the established DS/HWTA model, a solving algorithm based on the evolutionary framework is designed and verified by simulation. Simulation results show that the DS/HWTA model reflects the mission requirements and constraints of ground-to-air anti-penetration operations, and the designed EA-DS/HWTA can obtain the optimal solution for the interception scheme in real time. The de-constrained population initialization algorithm based on type I coding ensures the feasibility of the initial population, which is conducive to the convergence of constraints. The MPBX operator based on type II coding keeps the offspring solutions close to the feasible region, which improves the population quality. The greedy fitness strategy based on type III coding effectively improves individual fitness and enhances search efficiency.
Comparing the simulation results of case 1 and 2, when the firepower vacuum zone exists, the handover of medium-range and close-range weapons is an essential factor affecting the interception times and mission completion. This paper introduces the delayed decision strategy to control the firepower handover. The smaller delay time parameter is not conducive to searching the global optimal solution, and the larger delay time parameter violates the mission requirement of ''intercept as soon as possible.'' How to obtain the decision time in firepower handover is a future challenge for us.