Two Heuristic Algorithms for the Minimum Weighted Connected Vertex Cover Problem Under Greedy Strategy

The Minimum Weighted Connected Vertex Cover problem (MWCVC) is to find a subset <inline-formula> <tex-math notation="LaTeX">$F\subset V(G)$ </tex-math></inline-formula> with minimum weight in a node-weighted graph <inline-formula> <tex-math notation="LaTeX">$G$ </tex-math></inline-formula>, such that, when the set <inline-formula> <tex-math notation="LaTeX">$F$ </tex-math></inline-formula> is removed, the graph induced by the remaining vertices has no edges, and the graph induced by <inline-formula> <tex-math notation="LaTeX">$F$ </tex-math></inline-formula> in <inline-formula> <tex-math notation="LaTeX">$G$ </tex-math></inline-formula> is connected. This problem derives from a classical combinatorial problem in graph theory, the Vertex Cover Problem. A large number of results on algorithms for the MWCVC problem have been reported. In this paper, we propose two heuristic algorithms, denoted VCC and LCVCC, to find a connected vertex cover set in a general weighted graph. The time complexity of both algorithms is less than <inline-formula> <tex-math notation="LaTeX">$O(n^{4})$ </tex-math></inline-formula>. We compare these two algorithms with two known heuristic algorithms, GR and GD (proposed by Dagdeviren in 2021), on connected graphs, and conclude that both VCC and LCVCC perform better than GR and GD. LCVCC is expected to perform better than VCC on dense graphs.


I. INTRODUCTION
All the graphs we consider are simple, without loops or multi-edges. Given a graph G, we denote the vertex set and edge set of G by V(G) and E(G). For v ∈ V(G), we use d(v) to represent the degree of vertex v. Δ is used to denote the maximum degree and δ the minimum degree of the vertices in G. If S ⊂ V, then G[S] is used to represent the graph induced by S, and E(G[S]) is the edge set of G[S]. For a vertex v ∈ V, we use N(v) to represent the neighbor set of v, and for a vertex set S ⊂ V, we define its neighbor set by $N(S) = \bigcup_{v \in S} N(v) \setminus S$.
The Minimum Vertex Cover problem (MVC) is to find a subset VC of V(G) that is as small as possible such that, when VC is removed from the graph, the graph induced by the remaining vertices has no edges, i.e., E(G[V(G) \ VC]) = ∅. Moreover, if the VC set is required to be connected in the original graph, which means that at least one path can be found between any two vertices of the VC set, the problem becomes the Minimum Connected Vertex Cover problem (MCVC), which was first introduced by Garey and Johnson in 1976 [4] and is NP-hard to approximate within $10\sqrt{5} - 21$ [3]. We use CVC to represent the subset we select as the solution. Both problems have applications in wireless sensor networks (WSNs), signal station construction, terminal connection and resource transportation by pipeline.

The associate editor coordinating the review of this manuscript and approving it for publication was Sun-Yuan Hsieh.
A more general form of the MVC problem assigns a weight to every vertex of the graph. This problem is called the Minimum Weighted Vertex Cover problem (MWVC) [5]. In this setting, denoting the vertex cover set by F, if the induced graph G[F] is connected, then F is a feasible solution for the Minimum Weighted Connected Vertex Cover problem (MWCVC). Fujito proved that, for any ε > 0, MWCVC cannot be approximated within (1 − ε) ln n unless NP ⊂ DTIME($n^{O(\log \log n)}$) [6]. Shimizu   problem, Zhang et al. propose a two-stage algorithm that combines a greedy algorithm with a configuration checking method [8]. Dagdeviren gave a hybrid genetic algorithm in 2021 to solve the MWCVC problem [1].
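The two feasibility conditions of MWCVC (every edge is covered, and the chosen set induces a connected subgraph) can be checked directly. The following is a minimal sketch for illustration only; the function names and the edge-list representation are assumptions and do not come from the paper.

```python
from collections import deque

def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)

def is_connected(edges, vertices):
    """True if the subgraph induced by `vertices` is connected (BFS)."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in adj[x] - seen:
            seen.add(y)
            queue.append(y)
    return seen == vertices

def is_connected_vertex_cover(edges, cover):
    """Feasibility test for a candidate MWCVC solution."""
    return is_vertex_cover(edges, cover) and is_connected(edges, cover)

def total_weight(weight, cover):
    """Objective value: sum of the weights of the selected vertices."""
    return sum(weight[v] for v in cover)

# Hypothetical example: the path 1-2-3-4-5.
path_edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
path_weight = {1: 4, 2: 1, 3: 3, 4: 1, 5: 4}
```

On this path, {2, 4} covers every edge but is not connected, while {2, 3, 4} is a connected vertex cover of weight 5.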
In Section II, we introduce two known heuristic algorithms and propose two new algorithms for the MWCVC problem. We also provide examples for these four algorithms. In Section III, we investigate the performance of these algorithms with respect to the number of vertices, the weight of the selected vertices and the running time. Section IV gives a short conclusion and our future work.

II. TWO PROPOSED HEURISTIC ALGORITHMS
In this section, we first introduce two simple greedy heuristic algorithms, GR and GD, proposed in 2021 by Dagdeviren [1]. We then propose two heuristic algorithms for MWCVC, also based on a greedy strategy.

A. TWO KNOWN HEURISTIC ALGORITHMS FOR MWCVC PROBLEM
Both GR and GD are two-stage algorithms based on the algorithms for solving the weighted connected dominating set problem in [2]. In each algorithm, all vertices are initially WHITE. When a vertex is selected into the CVC set, it turns BLACK and all of its WHITE neighbors are colored GRAY. Both algorithms start with a single vertex and then, in the first stage, select GRAY vertices according to a greedy strategy. When there are no WHITE vertices left, they choose additional GRAY vertices of smallest weight until all edges are covered. The difference between them is the greedy strategy. GR chooses the GRAY vertex with the minimum weight ratio, while GD prefers the vertex of largest degree and breaks ties by smaller weight, as the examples below show.

A simple connected weighted graph with 10 vertices is used to show how these two algorithms work; see Figure 1.
$(v_i, w_i)$ denotes the vertex $v_i$ and its weight. GR first chooses $v_{10}$, since its rate $\frac{3}{3+5+19+16+14} \approx 0.0526$ is the minimum among all nodes. Then, among all neighbors of $v_{10}$, $v_1$ has the minimum rate $\frac{14}{50} = \frac{7}{25}$. After that, $v_5$, $v_9$, $v_4$ and $v_1$ are selected step by step. Finally the algorithm stops because the vertices of N(S) are all isolated, which means that $\{\ldots, v_8, v_{10}\}$ is already a connected vertex cover set.
When executing GD (see Figure 2), $v_1$, $v_4$ and $v_{10}$ all have the largest degree, but $v_{10}$ has the minimum weight, so $v_{10}$ is the first vertex to be selected. At the second step, $v_1$ is selected because it has two neighbors, $v_2$ and $v_5$, and its weight is less than that of $v_9$. Then GD chooses $v_3$, $v_4$ and $v_5$. In the second stage, it chooses $v_6$ and $v_8$. In this example, the selected set $\{\ldots, v_8, v_{10}\}$ forms a connected vertex cover set.
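The coloring scheme and the degree-based choice illustrated by the GD example can be sketched as follows. This is a hedged reading, not Dagdeviren's implementation: it assumes that "degree" means the number of WHITE neighbors, that ties are broken by smaller weight and then by smaller label, that the second stage only considers GRAY vertices with an uncovered incident edge, and that the input graph is connected. The name `gd_sketch` and the adjacency-dictionary input are illustrative.

```python
WHITE, GRAY, BLACK = "WHITE", "GRAY", "BLACK"

def gd_sketch(adj, weight):
    """Two-stage degree-based greedy on a connected graph (assumed reading of GD)."""
    color = {v: WHITE for v in adj}

    def white_degree(v):
        # Number of neighbours that are still WHITE.
        return sum(1 for u in adj[v] if color[u] == WHITE)

    def select(v):
        # Selected vertices turn BLACK; their WHITE neighbours turn GRAY.
        color[v] = BLACK
        for u in adj[v]:
            if color[u] == WHITE:
                color[u] = GRAY

    def best(candidates):
        # Most WHITE neighbours first, then smaller weight, then smaller label.
        return min(candidates, key=lambda v: (-white_degree(v), weight[v], v))

    # Stage 1: start anywhere, then grow through GRAY vertices until no WHITE remains.
    select(best(list(adj)))
    while WHITE in color.values():
        select(best([v for v in adj if color[v] == GRAY]))

    # Stage 2: an edge is uncovered exactly when both endpoints are GRAY;
    # add the lightest GRAY endpoint of such an edge until none is left.
    while True:
        open_gray = [v for v in adj if color[v] == GRAY
                     and any(color[u] == GRAY for u in adj[v])]
        if not open_gray:
            break
        select(min(open_gray, key=lambda v: (weight[v], v)))

    return {v for v in adj if color[v] == BLACK}

# Hypothetical 5-vertex example (not the graph of Figure 2).
demo_adj = {1: [2, 3, 4], 2: [1, 3], 3: [1, 2], 4: [1, 5], 5: [4]}
demo_weight = {1: 3, 2: 5, 3: 1, 4: 2, 5: 4}
demo_cover = gd_sketch(demo_adj, demo_weight)
```

Because every GRAY vertex is adjacent to a BLACK one, each selection keeps the BLACK set connected, which is why no separate connectivity repair is needed in this scheme.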

B. THE FIRST PROPOSED HEURISTIC ALGORITHM VCC
We propose a two-stage heuristic algorithm, Vertex Cover and Connectivity (VCC), which computes a connected vertex cover set of relatively small weight by first finding a vertex cover set and then adding more vertices to this set to ensure its connectivity. In the first stage, the algorithm iteratively selects the most cost-efficient vertex until all the edges have been covered. Let S be the selected vertex set. The cost-effectiveness of a vertex u is defined with respect to S, and VCC always chooses the vertex with the smallest value of cost-effectiveness. At the end of the first stage, the selected vertex set S is a vertex cover set. In the second stage, VCC finds the most cost-efficient vertices to ensure the connectivity of S. We use κ(S) to represent the number of components of the graph G[S]. In this stage, the cost-effectiveness is defined in terms of κ(S). The algorithm is described as Algorithm 1. Figure 3 gives an example for the VCC algorithm: $v_{12}$ is the first vertex to be selected because its cost-effectiveness is 0.25, which is the smallest among all vertices.
After $v_{12}$ is selected, all edges incident to it are covered. Then the algorithm updates the cost-effectiveness values and selects $v_5$ (or $v_{11}$). This operation is repeated until all edges are covered. In this graph, the vertex cover set selected in the first stage is already a connected vertex cover set, i.e., κ(S) = 1. So VCC does not need to select more vertices in the second stage, which means the second while loop is not executed on this graph. The solution given by the VCC algorithm is $\{\ldots, v_6, v_{13}, v_{10}, v_1\}$, and the total weight is 109.

The time complexity of the VCC algorithm is $O(n^2)$, where n is the number of vertices. In the worst case, the first while loop may run $n + (n-1) + (n-2) + \cdots + 2 + 1 = O(n^2)$ times, and the second while loop runs $\frac{n}{2} + \frac{n-1}{2} + \cdots = O(n^2)$ times. In fact, these two worst cases cannot occur together, because the more vertices are selected in the first while loop, the better the connectivity of S will be, so the second while loop selects fewer vertices.
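The two stages of VCC can be sketched in a few lines. The exact cost-effectiveness formulas are not reproduced here, so this sketch assumes the usual greedy ratios: in the first stage, the weight of u divided by the number of still-uncovered edges incident to u; in the second stage, the weight of u divided by the reduction in the number of components κ(S) that adding u would cause. The function names, tie-breaking by label, and the adjacency-dictionary input are illustrative assumptions, and the graph is assumed to be connected.

```python
def components(adj, S):
    """Connected components of the subgraph induced by S, as a list of sets."""
    S = set(S)
    comps, seen = [], set()
    for s in S:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y in S and y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        comps.append(comp)
    return comps

def vcc_sketch(adj, weight):
    """Greedy vertex cover followed by greedy connection (assumed ratios)."""
    S = set()
    uncovered = {frozenset((u, v)) for u in adj for v in adj[u]}

    # Stage 1: pick the vertex with the smallest weight per newly covered edge.
    while uncovered:
        def ratio(u):
            gain = sum(1 for v in adj[u] if frozenset((u, v)) in uncovered)
            return weight[u] / gain if gain else float("inf")
        u = min((v for v in adj if v not in S), key=lambda v: (ratio(v), v))
        S.add(u)
        uncovered = {e for e in uncovered if u not in e}

    # Stage 2: pick the vertex with the smallest weight per merged component.
    while True:
        comps = components(adj, S)
        if len(comps) <= 1:
            break
        def merge_ratio(u):
            touched = sum(1 for c in comps if any(v in c for v in adj[u]))
            return weight[u] / (touched - 1) if touched > 1 else float("inf")
        u = min((v for v in adj if v not in S), key=lambda v: (merge_ratio(v), v))
        S.add(u)
    return S

# Hypothetical example: the path 1-2-3-4-5 with cheap vertices 2 and 4.
path_adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
path_w = {1: 10, 2: 1, 3: 10, 4: 1, 5: 10}
vcc_result = vcc_sketch(path_adj, path_w)
```

On this path the first stage returns the disconnected cover {2, 4}, and the second stage adds vertex 3, the only vertex adjacent to both components.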

C. THE SECOND PROPOSED HEURISTIC ALGORITHM LCVCC
The VCC algorithm is a simple greedy algorithm that rarely gives an optimal solution, especially when the number of vertices is large. The Local Connected Vertex Cover and Connectivity (LCVCC) algorithm is designed to improve the VCC algorithm. The main idea is to find, in every iteration, a local connected vertex cover set for a part of the graph by Algorithm 2, rather than to select a single vertex. It begins with a labeled vertex u. If not all vertices in N(u) are isolated in G[N(u)], the LCVCC algorithm finds a vertex cover set in G[N(u)] and labels the vertices in the set. It then iteratively updates the labeled vertex set by adding these newly labeled vertices to the previous vertex cover set. The algorithm searches the neighbors of this labeled vertex set again and labels more vertices, until the neighbor set forms an independent set. After that, LCVCC searches for more vertices to ensure the connectivity of the labeled vertices under the same strategy as VCC, until the labeled vertex set forms a connected vertex cover set.

We show how to find a (local) connected vertex cover set for a labeled set.

Proof: The loop in Algorithm 2 stops only when there is no edge between neighbor vertices of the labeled set, so the labeled set L is a vertex cover set for the graph G[L ∪ N(L)]. Furthermore, every labeled vertex is a neighbor of the labeled set of the previous iteration, so at least one path can be found between a labeled vertex and the initially given vertex set $L_I$. Any two labeled vertices can be linked by two such paths, which means the labeled set is not only a vertex cover set but also a connected vertex cover set.

Proof: We denote by $N_i$ the neighbor vertices searched in the ith iteration, i = 1, 2, · · · , s. By the property of the greedy algorithm, the time complexity of the ith iteration is

Algorithm 2 Algorithm to Find a Local Connected Vertex Cover Set
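The labeling loop described for Algorithm 2 can be sketched as follows. This is an illustrative reconstruction rather than the authors' pseudocode: it assumes that the vertex cover of G[N(L)] is found greedily by weight per newly covered edge, with ties broken by label, and the name `local_cvc_sketch` is hypothetical.

```python
def local_cvc_sketch(adj, weight, initial):
    """Grow a labeled set L until N(L) is an independent set."""
    labeled = set(initial)
    while True:
        # N(L): neighbours of the labeled set that are not labeled themselves.
        nbrs = {v for u in labeled for v in adj[u]} - labeled
        # Edges of G[N(L)]; these are the edges that still need a labeled endpoint.
        edges = {frozenset((u, v)) for u in nbrs for v in adj[u] if v in nbrs}
        if not edges:
            return labeled
        # Greedily cover G[N(L)] (assumed rule: weight per newly covered edge).
        while edges:
            def ratio(u):
                return weight[u] / sum(1 for e in edges if u in e)
            candidates = {u for e in edges for u in e}
            u = min(candidates, key=lambda x: (ratio(x), x))
            labeled.add(u)
            edges = {e for e in edges if u not in e}

# Hypothetical example: a triangle 1-2-3 with a tail 3-4-5.
tail_adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
tail_w = {1: 1, 2: 5, 3: 1, 4: 1, 5: 1}
local_set = local_cvc_sketch(tail_adj, tail_w, {1})
```

Starting from vertex 1, the sketch labels vertex 3 to cover the edge between the two neighbors of 1 and then stops, since the new neighbor set {2, 4} is independent. The edge between 4 and 5 remains uncovered, which illustrates why the result is only a local connected vertex cover of G[L ∪ N(L)].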
We then propose the LCVCC algorithm (Algorithm 3), which uses Algorithm 2 to compute a connected vertex cover set in a graph.
Here we run LCVCC on the example in Figure 4. First, the significant values of all vertices are computed, and $v_1$ is the first one to be chosen. Taking $v_1$ as the first initial vertex, Algorithm 2 labels it and then searches its neighbors.  So the algorithm computes the significant values again and selects the next initial vertex $v_{13}$, and likewise $v_7$. The algorithm thus obtains a vertex cover set $\{v_1, v_{12}, v_6, v_{14}, v_{15}, v_5, v_{11}, v_{13}, v_7\}$; it then selects $v_4$ to ensure connectivity and outputs the solution $\{v_1, v_{12}, v_6, v_{14}, v_{15}, v_5, v_{11}, v_{13}, v_7, v_4\}$ with total weight 101.
Theorem 3: The vertex set S given by LCVCC algorithm is a connected vertex cover set.
Proof: Assume, for contradiction, that S is a solution given by the LCVCC algorithm but is not a vertex cover set for G. Then there must be an edge in G[V \ S], in which case additional vertices in V \ S would be selected, contradicting the fact that S is a solution given by the LCVCC algorithm. Likewise, the second while loop ensures the connectivity of the solution S.
Theorem 4: The time complexity of the LCVCC algorithm is less than $O(n^4)$.
Proof: Assume the first while loop in the LCVCC algorithm runs s times in total, and let $P_i$ denote the vertex set labeled in the ith iteration. To find the specific initial vertex in the ith iteration, the algorithm runs on every remaining vertex, whose corresponding labeled vertex set has size $|P_i^j|$. By Lemma 2, the time cost of each run is $O(|P_i^j|^2)$, where $j = 1, 2, \cdots, n - \sum_{k=1}^{i-1} |P_k|$ indexes the remaining vertices that the algorithm has to operate on. Thus, when selecting the ith initial vertex, the time cost is $\sum_{j=1}^{n - \sum_{k=1}^{i-1} |P_k|} O(|P_i^j|^2)$. The entire time cost of the first while loop is then $\sum_{i=1}^{s} \sum_{j=1}^{n - \sum_{k=1}^{i-1} |P_k|} O(|P_i^j|^2)$, which is less than $O(n^4)$, because in any iteration the number of remaining vertices is less than n, and in each iteration at least one vertex is labeled, so that i < n.

III. PERFORMANCE OF ALGORITHMS
These four algorithms are implemented in MATLAB to test their performance. The graphs used are undirected and connected, with different scales. We compare the four algorithms (GR, GD, VCC and LCVCC) on three aspects: the number of vertices in the graph, the weight of the selected connected vertex set and the time cost. For every plotted point, we run the algorithm 100 times and use the mean as the value and the variance as the error to compare their stability. The graphs are randomly generated. For example, to obtain a graph of order n, we first generate n vertices. Then, for every pair of vertices, we generate a random number between 0 and 1. If the random number is larger than a given number p, we add an edge between the two vertices. Obviously, the smaller p is, the denser the graphs are.
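The generation rule above can be sketched as follows. The experiments were run in MATLAB; this Python version is only an illustration, and the function name, the `seed` parameter and the adjacency-dictionary output are assumptions. How connectivity of the generated graphs is enforced is not stated, so this sketch does not enforce it.

```python
import random
from itertools import combinations

def random_graph(n, p, seed=None):
    """Add each possible edge when a uniform random number exceeds p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() > p:  # smaller p means more edges
            adj[u].add(v)
            adj[v].add(u)
    return adj

def average_degree(adj):
    """Mean vertex degree of an adjacency dictionary."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)

sparse = random_graph(50, 0.86, seed=1)
dense = random_graph(50, 0.5, seed=1)
```

With the same seed, every edge present in `sparse` is also present in `dense`, because both graphs draw the same random numbers and 0.5 is the weaker threshold.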
The weights of the vertices are random numbers between 0 and 1. Figure 5 shows the average degree of the graphs with 50, 100, 150 and 200 vertices when p is 0.5 or 0.86. Figure 5 shows that the average degree increases linearly with n, and that p affects the slope of the line.
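The linear growth has a simple explanation under the generation rule, assuming the random numbers are uniform on [0, 1] and ignoring any effect of keeping only connected graphs (both are assumptions, not statements from the experiments). Each of the n − 1 possible neighbors of a vertex is joined to it with probability 1 − p, so the expected degree is:

```latex
\[
  \mathbb{E}[d(v)] = (1 - p)(n - 1),
  \qquad
  \text{slope} \approx
  \begin{cases}
    0.5,  & p = 0.5,\\
    0.14, & p = 0.86.
  \end{cases}
\]
```

For n = 200 this gives an expected average degree of about 99.5 for p = 0.5 and about 27.9 for p = 0.86.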
As n (the order of the graph) increases, the number and the total weight of the selected vertices also increase. When p = 0.86, denoting the number of selected vertices by ST, it shows that $ST_{GR} = ST_{GD} > ST_{VCC} = ST_{LCVCC}$ considering the …

The selected weight rate WR is defined as $WR = \frac{\sum_{v \in S} w(v)}{\sum_{v \in V(G)} w(v)}$, where S is the connected vertex set selected by the algorithm and w(v) is the weight of vertex v. WR = 1 means the algorithm selected all the vertices. When p = 0.86 (see Figure 8), it holds that $WR_{GD} > WR_{GR} > WR_{LCVCC} = WR_{VCC}$. When p = 0.5, Figure 9 shows that $WR_{GD} = WR_{GR} > WR_{VCC} > WR_{LCVCC}$. Comparing Figures 8 and 9, it can be concluded that VCC and LCVCC perform better than GD and GR, and that LCVCC is expected to perform better than VCC on dense graphs. Notice that the error bars are smaller, which indicates that the solutions are stable.
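The selected weight rate can be computed in one line. The sketch below assumes that WR is the total weight of the selected set divided by the total weight of all vertices, which is consistent with WR = 1 meaning that every vertex was selected; the function name and the dictionary representation are illustrative.

```python
def weight_rate(weight, selected):
    """Fraction of the total vertex weight that lies in `selected`."""
    total = sum(weight.values())
    return sum(weight[v] for v in selected) / total

# Hypothetical weights for a three-vertex graph.
wr_weights = {"a": 2, "b": 3, "c": 5}
wr_partial = weight_rate(wr_weights, {"a", "b"})
wr_full = weight_rate(wr_weights, wr_weights.keys())
```

A lower WR means that the algorithm covers the graph with a lighter set, so among algorithms run on the same graph the one with the smallest WR gives the best solution.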
As for the time cost, VCC is the fastest of the four algorithms; see Figure 10 and Figure 11. It can also be seen that LCVCC performs much better than the other algorithms on dense graphs.
We also investigated the performance of these algorithms on random graphs, Cartesian product graphs, strong product graphs, interval graphs, unit disk graphs and Kneser graphs.
• Given a set of intervals on the real line, an interval graph is an undirected graph with a vertex for each interval and an edge between vertices whose intervals intersect.
• A unit disk graph is the intersection graph of a family of unit disks in the Euclidean plane.
• The Kneser graph K(n, k) is the graph whose vertices correspond to the k-element subsets of a set of n elements, where two vertices are adjacent if and only if the corresponding sets are disjoint.
VOLUME 10, 2022
In our test, for the Cartesian product graph and the strong product graph, we focus on the case where the graphs G and H are both paths. Figure 12 shows the weight of the vertex set selected by VCC and LCVCC compared with GR and GD on random graphs and five special graphs. We can see that VCC and LCVCC reduce the weight of the selected vertex set by 20% to 50% compared with GR and GD.
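Several of the special graph families used in the test can be built in a few lines. The sketch below uses the standard definitions (Kneser graph: k-element subsets, adjacent when disjoint; interval graph: closed intervals, adjacent when they intersect; Cartesian and strong products of two paths); the function names and the adjacency-dictionary output are illustrative assumptions, not the MATLAB code used in the experiments.

```python
from itertools import combinations

def kneser_graph(n, k):
    """K(n, k): vertices are k-subsets of {0..n-1}; disjoint subsets are adjacent."""
    verts = [frozenset(c) for c in combinations(range(n), k)]
    adj = {v: set() for v in verts}
    for a, b in combinations(verts, 2):
        if a.isdisjoint(b):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def interval_graph(intervals):
    """One vertex per closed interval (lo, hi); intersecting intervals are adjacent."""
    adj = {i: set() for i in range(len(intervals))}
    for i, j in combinations(range(len(intervals)), 2):
        (lo1, hi1), (lo2, hi2) = intervals[i], intervals[j]
        if lo1 <= hi2 and lo2 <= hi1:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def path_product(a, b, strong=False):
    """Cartesian (or strong) product of paths with a and b vertices."""
    adj = {(i, j): set() for i in range(a) for j in range(b)}
    for (i, j) in adj:
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                if (di, dj) == (0, 0):
                    continue
                # Diagonal steps exist only in the strong product.
                if not strong and di != 0 and dj != 0:
                    continue
                if (i + di, j + dj) in adj:
                    adj[(i, j)].add((i + di, j + dj))
    return adj

def edge_count(adj):
    """Number of undirected edges in an adjacency dictionary."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2

petersen = kneser_graph(5, 2)
```

For example, K(5, 2) is the Petersen graph, with 10 vertices of degree 3 each.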

IV. CONCLUSION AND FUTURE WORK
In this paper, we introduce two known heuristic algorithms (GR and GD) and propose two new heuristic algorithms (VCC and LCVCC) to solve the MWCVC problem, and then compare their performance. VCC and LCVCC are expected to perform much better. On sparse graphs, VCC and LCVCC perform similarly to each other and much better than GR and GD, and VCC takes the least time among these algorithms. On dense graphs, LCVCC has the best performance. In our tests, VCC and LCVCC always give better solutions than GR and GD. We would like to explore further whether VCC and LCVCC always give better solutions.
To improve the performance of LCVCC on sparse graphs, we would like to try different strategies for choosing the initial vertex set and different cost-effectiveness functions.

She received the bachelor's degree in mathematics and applied from Chang'an University. She participated in the National Training Program of Innovation and Entrepreneurship for Undergraduates, in 2020. After graduation, she will go to the Tongji University to pursue the master's degree in solving the independence test problem. Also, she won the National Second Prize in Chinese Mathematical Contest in Modeling and the Meritorious prize in American Mathematical Contest in Modeling, in 2022.
HONGQIANG WANG was born in Hebei, China, in 2000. He entered Chang'an University, in 2019, and majors in mathematics and applied mathematics. He participated in the National Training Program of Innovation and Entrepreneurship for Undergraduates, in 2020. After graduation, he will go to the Northwestern Polytechnical University to pursue the master's degree in harmonic analysis. He won the National Second Prize in Chinese Mathematical Contest in Modeling, in 2020.