Sparse Control Node Scheduling in Networked Systems Based on Approximate Controllability Metrics


Abstract—This article investigates a novel sparsity-constrained controllability maximization problem for continuous-time linear systems. For controllability metrics, we employ the minimum eigenvalue and the determinant of the controllability Gramian. Unlike the previous problem setting based on the trace of the Gramian, these metrics are not linear functions of the decision variables and are difficult to deal with. To circumvent this issue, we adopt a parallelepiped approximation of the metrics based on their geometric properties. Since these modified optimization problems are highly nonconvex, we introduce a convex relaxation problem for computational tractability. After a reformulation of the problem into an optimal control problem to which Pontryagin's maximum principle is applicable, we give a sufficient condition under which the relaxed problem gives a solution of the main problem.
Index Terms-Convex optimization, networked systems, optimal control, resource-aware control, sparse control.

I. INTRODUCTION
THESE days, control system designs that incorporate a notion of sparsity have attracted much attention in the control community. Such an approach is useful to find a small amount of essential information that is closely related to the control performance of interest. There are mainly two types of penalty costs that enhance sparsity. The first one is the $\ell^0$ norm, which is defined as the number of nonzero components. This cost is widely used in sparse modeling, motivated by the success of compressed sensing, and most of the related works in control systems adopt this type. The second one is the $L^0$ norm, which is defined as the length of the support. This is an extension of the $\ell^0$ norm to function spaces, and it appears in relatively recent works, e.g., [1]–[5]. However, it should be emphasized that, to the best of our knowledge, optimization problems involving both the $\ell^0$ norm and the $L^0$ norm have not been investigated in the area of sparse optimization, except for our recent study [6].

This study investigates an application of sparse optimization to the control node selection problem in large-scale networked systems. The purpose of the node selection problem is to identify a small number of nodes, called control nodes, that should receive exogenous control inputs so that the overall system of interest is effectively guided. This selection problem naturally arises in large-scale networked systems due to physical or financial reasons. For example, consider a rebalancing problem on the mobility network of a sharing system with one-way trips, where the control input is the number of vehicles rebalanced by the staff between stations [7]. Staffing (when, which station, and how much) needs to be decided in advance based on the expected demand, a behavior dynamics model, and human resource constraints. This is certainly a control node scheduling problem.
We also note that the node selection problem is useful to identify leaders in multiagent systems [8], [9]. In recent works, control nodes are chosen based on a metric of controllability. For example, the work [10] considers the minimum set of control nodes that ensures the classical controllability of [11]; the work [12] considers structural controllability; the works [13] and [14] introduce quantities, such as the trace of the controllability Gramian, that evaluate how easy the system is to control. More recently, the work [15] has focused on lattice graphs with linear dynamics consisting of an infinite number of nodes and has given an analytical expression for the minimum control energy. The work [16] considers the minimization of the maximum eigenvalue of the controllability Gramian subject to a Frobenius norm constraint on the input matrix and gives a closed-form expression for the optimal values. The work [17] shows that the controllability Gramian can be expressed as a Hadamard product of two positive-semidefinite matrices when the system matrix is diagonalizable and provides an algorithm for single-input systems that avoids the explicit computation of the Gramian when the determinant of the Gramian is used as the controllability metric.
While the aforementioned works investigate the selection problem in which the set of control nodes is fixed over time, more recent works alternatively consider time-varying (TV) control node selection, which is also referred to as control node scheduling. The node scheduling problem finds not only which but also when nodes should be activated; hence, it is more challenging and more effective for achieving high control performance. Indeed, Zhao et al. [18] and Nozari et al. [19] consider the scheduling problem for discrete-time systems and show its effectiveness over time-invariant (TI) control node selection. Mathematically, all of the aforementioned works consider $\ell^0$-constrained optimization problems, in which the number of nodes selected at the same time is constrained. On the other hand, in networked systems, it is also important to effectively compress control signals and reduce communication traffic. To achieve this, it is desirable to find the best time duration over which controllers should be active. For this reason, we considered the $L^0$ constraint on control inputs and formulated a node scheduling problem for continuous-time systems in [20], based on the controllability metric given by the trace of the Gramian. This scheduling problem is further analyzed in [21], which provides an explicit formula for the optimal solutions and shows that the solutions are obtained by a greedy algorithm. However, these two works on continuous-time systems mainly consider the $L^0$ control cost, and the resulting number of activated control nodes at each time instant (i.e., the $\ell^0$ control cost) is not taken into account. Furthermore, classical controllability is not automatically ensured by the trace metric, since the designed Gramian may have a zero eigenvalue.
In view of this, this article newly proposes an optimal node scheduling method that satisfies both the $\ell^0$ and $L^0$ constraints and ensures the classical controllability for continuous-time linear TI systems. By introducing the two constraints, we can find a small, TV set of control nodes while reducing the support of the control inputs. As the network controllability, we consider two types of metrics: 1) the minimum eigenvalue of the controllability Gramian, which is inversely related to the worst-case control energy required to steer the network state from the origin to any point on the unit sphere in the state space, and 2) the determinant of the controllability Gramian, which is proportional to the volume of the ellipsoid consisting of the states that can be reached from the origin with a unit energy control input. Note that both controllability measures naturally ensure the classical controllability, since any selection that makes the system uncontrollable returns the worst cost value, unlike the trace measure addressed in [20] and [21]. The formulated problem has a combinatorial structure caused by the $L^0$ and $\ell^0$ norms. To circumvent this, we introduce a convex relaxation problem and establish a condition under which the main problem is exactly solved via the convex optimization. For the analysis, we transform the convex relaxation problem into an optimal control problem to which Pontryagin's maximum principle is applicable.
The rest of this article is organized as follows. Section II provides mathematical preliminaries. Section III formulates our node scheduling problem. Section IV introduces a convex relaxation problem and gives a sufficient condition for the main problem to boil down to the convex optimization. Section V illustrates numerical examples of the proposed node scheduling. Finally, Section VI concludes this article.

II. MATHEMATICAL PRELIMINARIES
This section reviews notation that will be used throughout this article. We denote the set of all positive integers by $\mathbb{N}$ and the set of all real numbers by $\mathbb{R}$. Let $m, n \in \mathbb{N}$ and $\Omega \subset \mathbb{R}$. For a vector $a = [a_1, a_2, \ldots, a_m]^\top \in \mathbb{R}^m$, $\mathrm{diag}(a)$ denotes the diagonal matrix whose $(i,i)$-component is given by $a_i$, and $a \in \Omega^m$ means $a_i \in \Omega$ for all $i$. The $\ell^0$ norm and $\ell^1$ norm of $a$ are defined by $\|a\|_0 \triangleq \#\{i \in \{1, 2, \ldots, m\} : a_i \neq 0\}$ and $\|a\|_1 \triangleq \sum_{i=1}^m |a_i|$, respectively, where $\#$ returns the number of elements of a set. We denote the Euclidean norm by $\|a\| \triangleq (\sum_{i=1}^m a_i^2)^{1/2}$. We denote the identity matrix of size $m$ by $I_m$. For any $M \in \mathbb{R}^{m \times n}$, $M^\top$ denotes the transpose of $M$. The intersection of all the convex sets containing a given subset $C$ of $\mathbb{R}^m$ is called the convex hull of $C$ and is denoted by $\mathrm{co}\,C$. Note that the convex hull of a finite subset $\{c_1, c_2, \ldots, c_n\}$ of $\mathbb{R}^m$ consists of all the vectors of the form $\sum_{i=1}^n \nu_i c_i$ with $\nu_i \geq 0$ for all $i$ and $\sum_{i=1}^n \nu_i = 1$.

Let $C$ be a closed subset of $\mathbb{R}^m$ and $a \in C$. A vector $\delta \in \mathbb{R}^m$ is a proximal normal to the set $C$ at the point $a$ if and only if there exists a constant $\sigma \geq 0$ such that $\langle \delta, y - a \rangle \leq \sigma \|y - a\|^2$ for all $y \in C$. The proximal normal cone to $C$ at $a$ is defined as the set of all such $\delta$ and is denoted by $N^P_C(a)$. We denote the limiting normal cone to $C$ at $a$ by $N^L_C(a)$, i.e., the set of all vectors obtained as limits $\delta = \lim_{i \to \infty} \delta_i$ with $\delta_i \in N^P_C(a_i)$, $a_i \in C$, and $a_i \to a$.

For a measurable function $s : [0, T] \to \mathbb{R}$ and $p \in [1, \infty)$, the $L^0$ norm and $L^p$ norm are defined by $\|s\|_{L^0} \triangleq \mu_L(\{t \in [0, T] : s(t) \neq 0\})$ and $\|s\|_{L^p} \triangleq (\int_0^T |s(t)|^p \, dt)^{1/p}$, where $\mu_L$ is the Lebesgue measure on $\mathbb{R}$. We denote the set of all functions $s$ with $\|s\|_{L^p} < \infty$ by $L^p$.

The subgradient of a function $f : \mathbb{R}^n \to \mathbb{R}$ at $x \in \mathbb{R}^n$ is denoted by $\partial f(x)$, i.e., the set of all $\zeta \in \mathbb{R}^n$ satisfying $f(y) \geq f(x) + \langle \zeta, y - x \rangle$ for all $y \in \mathbb{R}^n$. We say that $\zeta \in \mathbb{R}^n$ is a proximal subgradient of $f$ at $x$ if, for some $\sigma \geq 0$ and some neighborhood $X$ of $x$, we have $f(y) \geq f(x) + \langle \zeta, y - x \rangle - \sigma \|y - x\|^2$ for all $y \in X$. The set of all such $\zeta$ is called the proximal subdifferential of $f$ at $x$ and is denoted by $\partial_P f(x)$. The limiting subdifferential $\partial_L f(x)$ of $f$ at $x$ is defined as the set of all limits $\zeta = \lim_{i \to \infty} \zeta_i$ with $\zeta_i \in \partial_P f(x_i)$ and $x_i \to x$. We call a vector-valued function with absolutely continuous components an arc [22, p. 255].
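To make the norm definitions above concrete, the following small numerical sketch (the vector, the grid, and the signal are illustrative, not taken from this article) evaluates the $\ell^0$ and $\ell^1$ norms of a vector and approximates the $L^0$ norm of a piecewise-constant function on a uniform grid:

```python
import numpy as np

# l0 and l1 norms of a vector a in R^m
a = np.array([0.0, -1.5, 0.0, 2.0])
l0 = np.count_nonzero(a)       # number of nonzero components: 2
l1 = np.sum(np.abs(a))         # sum of absolute values: 3.5

# L0 norm of a function s on [0, T]: the Lebesgue measure of its support,
# approximated here on a uniform grid with step dt
T, N = 2.0, 2000
dt = T / N
t = np.linspace(0.0, T, N, endpoint=False)
s = (t < 0.8).astype(float)    # schedule active on [0, 0.8) only
L0 = np.count_nonzero(s) * dt  # approximately 0.8

print(l0, l1, L0)
```

The $L^0$ quantity converges to the exact support length as the grid is refined, which is how the activation-time budgets $\alpha_j$ are interpreted later in the article.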

III. PROBLEM FORMULATION

A. System Description
Let us consider a network model consisting of $n$ nodes and define the overall system by
$$\dot{x}(t) = A x(t) + B S(t) u(t), \qquad S(t) \triangleq \mathrm{diag}(s(t)) \tag{1}$$
where $x(t) = [x_1(t), x_2(t), \ldots, x_n(t)]^\top \in \mathbb{R}^n$ is the state vector consisting of the $n$ nodes, $x_i(t)$ is the state of the $i$th node at time $t$, $u(\cdot) \in \mathbb{R}^m$ is the exogenous control input that influences the network dynamics, $A \in \mathbb{R}^{n \times n}$ is the dynamics matrix that represents the information flow among the nodes, $B = [b_1, b_2, \ldots, b_m] \in \mathbb{R}^{n \times m}$ is a constant matrix whose columns represent the candidates of control nodes, $s(\cdot) \in \{0, 1\}^m$ represents the activation schedule of the control input $u$, and $T > 0$ is the final time of control. Throughout this article, we put the following assumption on $A$.

Assumption 1: The matrix $A$ is diagonalizable, and all of its eigenvalues are real.
The $j$th component $u_j$ of the input $u$, $j = 1, 2, \ldots, m$, is able to affect the system through the vector $b_j$ at time $t$ if and only if $s_j(t) = 1$, and the nodes that receive the inputs are called control nodes. In other words, the control node scheduling problem seeks an optimal variable $s$ over $[0, T]$ based on a given cost function and some constraints. In particular, this article considers the question of which and when control nodes should be activated so that the control energy required to steer the network state from the origin (i.e., $x(0) = 0$) to any target state is as small as possible. We note that, after the optimal variable $s$ is found, the function $S(\cdot) = \mathrm{diag}(s(\cdot))$ is fixed and the minimum energy control is naturally expected to be implemented.
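The switched dynamics described above can be simulated directly. The following sketch uses a forward-Euler discretization; the matrices, the input, and the binary schedule are illustrative stand-ins, not values from the article:

```python
import numpy as np

# Forward-Euler simulation of x'(t) = A x(t) + B S(t) u(t) with a binary
# activation schedule s(t); A, B, u, and the schedule are illustrative.
A = np.array([[-1.0, 0.5],
              [0.5, -1.0]])
B = np.eye(2)                     # both nodes are candidate control nodes
T, N = 2.0, 4000
dt = T / N

def s_of(t):                      # node 1 active on [0, 0.8), node 2 on [0.8, 1.6)
    if t < 0.8:
        return np.array([1.0, 0.0])
    if t < 1.6:
        return np.array([0.0, 1.0])
    return np.zeros(2)

u = np.array([1.0, 1.0])          # a constant exogenous input, for illustration

x = np.zeros(2)                   # x(0) = 0
for k in range(N):
    t = k * dt
    x = x + dt * (A @ x + B @ (s_of(t) * u))
print(x)                          # state reached at t = T
```

Note that at each instant only the active components of $u$ enter the dynamics, which is exactly the role of $S(t) = \mathrm{diag}(s(t))$.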

B. Main Problem
This article is interested in energy-saving node scheduling in networked systems. To quantify the required control energy, several metrics have been proposed (see, e.g., [13] and [14]). Among them, this article considers two metrics: 1) the minimum eigenvalue and 2) the determinant of the controllability Gramian. In short, these metrics are used to design the shape of the reachable set $\mathcal{R}$ with a unit energy control input, where the set $\mathcal{R}$ for the system (1) is defined by
$$\mathcal{R} \triangleq \{x(T) \in \mathbb{R}^n : \dot{x} = Ax + BSu, \; x(0) = 0, \; \|u\|_{L^2} \leq 1\}. \tag{2}$$
We recall that the minimum energy control $\check{u}$, which steers the state from the origin to a target state $x_f$ at time $T$ with minimum $L^2$ norm, satisfies $\|\check{u}\|_{L^2}^2 = x_f^\top G^{-1} x_f$ [23], where $G$ is the controllability Gramian for the linear system (1) defined by
$$G \triangleq \int_0^T e^{A(T-t)} B S(t) S(t)^\top B^\top e^{A^\top (T-t)} \, dt.$$
Hence, the reachable set $\mathcal{R}$ can be rewritten as the ellipsoid $\mathcal{R} = \{x \in \mathbb{R}^n : x^\top G^{-1} x \leq 1\}$. This implies that the directions and the lengths of the axes of $\mathcal{R}$ are given by the eigenvectors of the Gramian and the square roots of the corresponding eigenvalues, respectively. Hence, the minimum eigenvalue of the Gramian, denoted by $\lambda_{\min}(G)$, is related to the minimum length of the axes of $\mathcal{R}$ and is adopted as a controllability measure for worst-case analysis. On the other hand, the determinant of the Gramian, denoted by $\det(G)$, is proportional to the volume of the reachable set $\mathcal{R}$. Precisely,
$$\mathrm{vol}(\mathcal{R}) = \frac{\pi^{n/2}}{\Gamma(n/2 + 1)} \sqrt{\det(G)}$$
where $\Gamma$ is the Gamma function. Thus, the network controllability is enhanced by selecting a function $s(\cdot)$ that makes the metrics $\lambda_{\min}(G)$ and $\det(G)$ large.
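The relation between the scheduled Gramian and the minimum control energy can be verified numerically. In the following sketch, $A$ is taken diagonal so that the matrix exponential is elementwise; the schedule and all numerical values are illustrative:

```python
import numpy as np

# Numerical check that the minimum energy steering the state to x_f equals
# x_f' G^{-1} x_f, with G the Gramian of the scheduled system.
lam = np.array([-1.0, -2.0])       # eigenvalues of the diagonal A
B = np.eye(2)
T, N = 2.0, 20000
dt = T / N

def S_of(t):                       # node 1 active on [0, 1), node 2 on [1, 2)
    return np.diag([1.0, 0.0]) if t < 1.0 else np.diag([0.0, 1.0])

G = np.zeros((2, 2))
for k in range(N):
    t = (k + 0.5) * dt             # midpoint quadrature
    E = np.diag(np.exp(lam * (T - t)))      # e^{A(T-t)}
    M = E @ B @ S_of(t)
    G += M @ M.T * dt              # S(t) S(t)' = S(t) for binary schedules

x_f = np.array([1.0, 1.0])
g = np.linalg.solve(G, x_f)
energy_formula = x_f @ g           # x_f' G^{-1} x_f

# L2 energy of the minimum energy control u(t) = S(t) B' e^{A'(T-t)} G^{-1} x_f
energy_quad = 0.0
for k in range(N):
    t = (k + 0.5) * dt
    E = np.diag(np.exp(lam * (T - t)))
    u = S_of(t) @ B.T @ E.T @ g
    energy_quad += u @ u * dt

print(energy_formula, energy_quad)  # the two values agree
```

The same quadrature also yields $\lambda_{\min}(G)$ and $\det(G)$, the two metrics used in the sequel.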
Here, we provide a geometric property of the reachable set $\mathcal{R}$, which is the counterpart of [24, Th. 4.7] for TV systems.

In what follows, we denote the eigenvalues of $A$ by $\lambda_i \in \mathbb{R}$ and define $\Lambda \triangleq \mathrm{diag}([\lambda_1, \lambda_2, \ldots, \lambda_n]^\top)$. We also denote by $v_i$ the eigenvectors corresponding to the eigenvalues, where the symbol $\check{i}$ denotes that the $i$th vector is removed from the list. Assume $d_i > 0$ for all $i$. The following holds. 1) Fix any $x_f \in \mathcal{R}$ and take $u$ such that the state of (1) is steered to $x_f$ at time $T$. Then, the stated bound holds. Let us consider a system (1) with two nodes.
In Fig. 1, the ellipsoid in solid blue shows the reachable set $\mathcal{R}$ defined by (2). (We recall that $\mathcal{R}$ is the set of all states that can be reached from the origin by using a unit energy control.) The parallelepiped $\mathcal{P}$ defined by (3) is shown in solid green. (Note that, in this example, $\mathcal{P}$ is a rectangle, since the matrix $A$ is symmetric and $v_1$ and $v_2$ are orthogonal to each other.) As shown in Proposition 1, we can see that $\mathcal{P}$ is tangent to $\mathcal{R}$. We also show the axes of the ellipsoid $\mathcal{R}$ by the dashed blue lines, which are given by $\pm\sqrt{\mu_i}\,\rho_i$, where $\mu_i$ and $\rho_i$ are the eigenvalues and eigenvectors of the Gramian $G$, respectively. Motivated by this geometric property, this article considers the design of $\mathcal{P}$ that makes the aforementioned controllability measures large. In other words, we consider the maximization problems of the minimum value of $d_i$ and of the volume of $\mathcal{P}$. Then, we define the cost functions $J_1$ and $J_2$, which represent the minimum length of the axes and the volume of the parallelepiped $\mathcal{P}$ up to a constant, respectively. As the constraints, we introduce the $L^0$ and $\ell^0$ constraints on the inputs to take into account the upper bounds on the total time length of node activation and on the number of activated nodes at each time. Now, we are ready to describe the main problems.
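The semi-axis lengths $d_i$ can be evaluated directly from the schedule. The sketch below assumes the form $d_i = \|e^{\lambda_i(T-\cdot)} S(\cdot) B^\top w_i\|_{L^2}$, suggested by the control $u^{(i)}$ in Appendix A; this form, as well as all numerical data, is an assumption made for illustration only:

```python
import numpy as np

# Sketch of the parallelepiped metrics. For a diagonalizable A = V Λ V^{-1}
# with left eigenvectors w_i (the rows of V^{-1}), the semi-axis lengths are
# taken here in the ASSUMED form d_i = || e^{λ_i(T-·)} S(·) B' w_i ||_{L²};
# V, B, and the schedule are illustrative.
lam = np.array([-0.5, -1.5])
V = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # columns are the eigenvectors v_i
W = np.linalg.inv(V)                # row i is w_i', with w_i' v_j = δ_ij
B = np.eye(2)
T, N = 1.0, 10000
dt = T / N

def s_of(t):                        # one node active at a time (beta = 1)
    return np.array([1.0, 0.0]) if t < 0.5 else np.array([0.0, 1.0])

d2 = np.zeros(2)
for k in range(N):
    t = (k + 0.5) * dt
    for i in range(2):
        d2[i] += np.exp(2 * lam[i] * (T - t)) * np.sum(s_of(t) * (B.T @ W[i]) ** 2) * dt

d = np.sqrt(d2)
J1 = d.min()                        # minimum semi-axis length (worst direction)
J2 = np.sum(np.log(d))              # log volume of P up to an additive constant
print(d, J1, J2)
```

Under this assumed form, each $d_i^2$ is linear in the schedule $s$, which is the property exploited by the convex relaxation in Section IV.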
In this article, we will show that Problems 1 and 2 are exactly solved via equivalent convex optimization problems. Note that two optimization problems are said to be equivalent if the sets of all optimal solutions coincide.
Remark 2: An efficient sparse activation schedule of control inputs is addressed in the related works [2] and [25]. While these works consider the activation schedule after the $B$-matrix is given, our study is interested in the optimal design of the matrix (precisely, $BS(\cdot)$) based on the controllability performance and the sparsity constraints.

IV. ANALYSIS
In this section, we provide an equivalence theorem between the main problems and the corresponding convex relaxation problems.

A. Minimum Eigenvalue of the Gramian
We first consider Problem 1. The convex relaxation problem is described as follows, where the $L^0$ and $\ell^0$ norms are replaced by the $L^1$ and $\ell^1$ norms, respectively.

Problem 3: Given $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $T > 0$, $\beta \in [1, m) \cap \mathbb{N}$, and $\alpha_j \in (0, T]$, $j = 1, 2, \ldots, m$, find a TV matrix $S(\cdot) \triangleq \mathrm{diag}(s(\cdot))$, $s(\cdot) \triangleq [s_1(\cdot), s_2(\cdot), \ldots, s_m(\cdot)]^\top$, which solves the relaxed problem.

The set of all functions that satisfy the constraints of an optimization problem is called the feasible set. Let us denote the feasible sets of Problems 1 and 3 by $S_0$ and $S_1$, respectively. Note that the set $S_0$ is also the feasible set of Problem 2. Note also that $S_0 \subset S_1$, since $\|s_j\|_{L^1} = \|s_j\|_{L^0}$ for all $j$ and $\|s(t)\|_1 = \|s(t)\|_0$ for all $t$ whenever $s$ is binary-valued. The inclusion is proper in general, since the $\ell^1$ and $L^1$ constraints do not automatically guarantee the $\ell^0$ and $L^0$ constraints, and some functions in $S_1$ are obviously not binary-valued. Then, we first show the discreteness of the solutions of Problem 3, which guarantees that the optimal solutions of Problem 3 belong to the set $S_0$. For this, we prepare some lemmas.
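The inclusion $S_0 \subset S_1$ and its properness can be checked numerically. In the sketch below (grid and schedules illustrative), the $L^1$ and $L^0$ quantities coincide for a binary schedule, while a fractional schedule satisfies the same $L^1$ bound with full support:

```python
import numpy as np

# For binary-valued schedules, the l1/L1 quantities of the relaxed
# Problem 3 coincide with the l0/L0 quantities of Problem 1 (hence
# S_0 ⊂ S_1); a fractional schedule shows the inclusion is proper.
T, N = 1.0, 1000
dt = T / N
t = np.linspace(0.0, T, N, endpoint=False)

s_bin = (t < 0.4).astype(float)            # binary: feasible if 0.4 <= alpha_j
L0 = np.count_nonzero(s_bin) * dt
L1 = np.sum(np.abs(s_bin)) * dt
print(L0, L1)                               # equal for binary schedules

s_rel = np.full(N, 0.4)                     # fractional: meets the L1 bound...
L1_rel = np.sum(np.abs(s_rel)) * dt
L0_rel = np.count_nonzero(s_rel) * dt       # ...but is supported on all of [0, T]
print(L1_rel, L0_rel)                       # 0.4 vs 1.0: L1 does not imply L0
```

This is exactly why the discreteness of the optimal solutions of Problem 3 must be established before equivalence can be claimed.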
Lemma 1: The following holds.
1) Define a set $E \triangleq \{a \in \mathbb{R}^{m+n} : a_j \leq \alpha_j, \; j = 1, 2, \ldots, m\}$.
2) Define $f(\xi) \triangleq \max\{\xi_{m+1}, \xi_{m+2}, \ldots, \xi_{m+n}\}$ on $\mathbb{R}^{m+n}$. For any $\xi \in \mathbb{R}^{m+n}$, we have
$$\partial_L f(\xi) = \mathrm{co}\{[0_m^\top, e_k^\top]^\top : k \in K(\xi)\}$$
where $0_m$ is the zero vector in $\mathbb{R}^m$, $e_k$ is the $k$th canonical vector in $\mathbb{R}^n$, and $K(\xi) \triangleq \{k \in \{1, 2, \ldots, n\} : \xi_{m+k} = f(\xi)\}$.

Hereafter, we impose the following assumption; see Remark 3 for its interpretation.
Assumption 2: $A$ is diagonalizable and nonsingular, and all the systems $(A, b_j)$ and $(A, b_i \pm b_j)$ are controllable for all $i, j$ with $i \neq j$.
Remark 3: For an intuitive understanding of Assumption 2, let us consider the case with a scalar state and two nodes. We can observe that the following modifications do not change the cost functions $J_1$ and $J_2$.
These modifications, however, degrade the sparsity. Assumption 2 excludes similar possibilities that the relaxed $L^1$-$\ell^1$ problems have a nonsparse optimizer in general cases.

Lemma 2: Denote by $Q \in \mathbb{R}^{n \times m}$ the matrix whose $(i,j)$-component is given by $q_{ij}$.

Hence, for each $j$, the value $\|s_j\|_{L^1}$ is equal to the final state $y_j(T)$ of the system $\dot{y}_j = s_j$ with $y_j(0) = 0$, and the $i$th metric is equal to the final state $z_i(T)$ of a scalar system with $z_i(0) = 0$. Define $y \triangleq [y_1, \ldots, y_m]^\top \in \mathbb{R}^m$, $z \triangleq [z_1, \ldots, z_n]^\top \in \mathbb{R}^n$, and $\xi \triangleq [y^\top, z^\top]^\top \in \mathbb{R}^{m+n}$. Then, Problem 3 is equivalently expressed as the optimal control problem (9), where $f$ is defined in Lemma 1 and $Q$ is defined in Lemma 2. This is an optimal control problem to which Pontryagin's maximum principle [22, Th. 22.26] is applicable. Let the process $(\xi^*, s^*)$ be a local minimizer of the problem (9), and define the Hamiltonian function $H : \mathbb{R}^{m+n} \times \mathbb{R}^{m+n} \times \mathbb{R}^m \to \mathbb{R}$ associated with the problem (9). Then, it follows from the maximum principle that there exist a constant $\eta$ equal to 0 or 1 and an arc $p : [0, T] \to \mathbb{R}^{m+n}$ satisfying the following conditions: 1) the nontriviality condition (10) for all $t \in [0, T]$: $(\eta, p(t)) \neq 0$; 2) the transversality condition (11), where the set $E$ is defined in Lemma 1; 3) the adjoint equation (12) for almost every $t \in [0, T]$, where $D_\xi H$ is the derivative of the function $H$ with respect to the first variable $\xi$; 4) the maximum condition (13) for almost every $t \in [0, T]$. Note that the supremum in (13) is attained by a point in $S$, since the right-hand side is a continuous function of $s$ and $S$ is a closed set. In other words, (14) holds almost everywhere.

We here claim that $\eta = 1$. Indeed, if $\eta = 0$, then it follows from Lemma 1 and (11) that $p^{(2)}(T) = 0$. From (17), we have $p^{(2)}(0) = 0$ and $p^{(2)}(t) = 0$ on $[0, T]$. Hence, from (10) and (16), there exists $j_0 \in \{1, 2, \ldots, m\}$ such that $p^{(1)}_{j_0}(0) < 0$. Then, $s^*_{j_0}(t) = 0$ almost everywhere by (14). This implies $\xi^*_{j_0}(T) = 0$ by the dynamics $\dot{\xi}_{j_0} = s_{j_0}$ with the initial condition $\xi_{j_0}(0) = 0$.
Then, we have $p^{(1)}_{j_0}(0)\alpha_{j_0} = 0$, which contradicts (15). Thus, $\eta = 1$.
In what follows, we show that (18) and (19) hold almost everywhere under the assumption. By showing this, we find that the optimal schedule takes only binary values almost everywhere.

We first show (18). For this, let us suppose that $\varphi_j(t) = 0$ on a set of positive measure for some $j$. Then, we have $\varphi_j(t) = 0$ on $[0, T]$, since $\varphi_j$ is analytic [26]. Hence, all the derivatives of $\varphi_j$ vanish for $r \in \{1, 2, \ldots, n\}$; in other words, $M_j p^{(2)}(0) = 0$, where the matrix $M_j$ is defined in Lemma 2. From Lemma 2, $M_j$ is nonsingular, and hence, we have $p^{(2)}(0) = 0$, which implies $p^{(2)}(T) = 0$. However, this contradicts (11) by Lemma 1. We next show (19). This can be confirmed similarly. Precisely, let us suppose that $\varphi_i(t) = \varphi_j(t)$ on a set of positive measure for some $i \neq j$. Then, the same argument applied to $\varphi_i - \varphi_j$ for $r \in \{1, 2, \ldots, n\}$ gives $p^{(2)}(0) = 0$ from Lemma 2, which contradicts (11) by Lemma 1. This completes the proof.
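The pointwise maximization in the maximum condition can be made explicit when the Hamiltonian is linear in $s$: over $S = \{s \in [0,1]^m : \sum_j s_j \leq \beta\}$, the maximum is attained by activating the at most $\beta$ components with the largest positive switching values. The following sketch (with an illustrative switching vector) checks this rule against a brute-force enumeration of the vertices of $S$:

```python
import numpy as np
from itertools import product

# Pointwise maximum condition sketch: maximize phi' s over
# S = {s in [0,1]^m : sum_j s_j <= beta}; the maximizer activates the
# (at most beta) components with the largest positive phi_j.
def argmax_schedule(phi, beta):
    s = np.zeros_like(phi)
    idx = np.argsort(phi)[::-1][:beta]   # indices of the beta largest phi_j
    idx = idx[phi[idx] > 0]              # activate only if phi_j > 0
    s[idx] = 1.0
    return s

phi = np.array([0.3, -0.1, 0.7, 0.2])    # illustrative switching values
s_star = argmax_schedule(phi, 2)
print(s_star)                            # binary, at most 2 active components

# brute-force check over all binary schedules with at most beta ones
# (the vertices of S, where the linear maximum is attained)
best = max(phi @ np.array(c) for c in product([0.0, 1.0], repeat=4) if sum(c) <= 2)
print(phi @ s_star, best)                # equal: the rule attains the maximum
```

This bang-bang structure is precisely what the discreteness argument above establishes for the optimal schedules.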
The following theorem is the main result, which shows the equivalence between Problems 1 and 3.
Proof: Take any $\hat{s} \in S_1^*$. By the discreteness shown above, we have $\|\hat{s}_j\|_{L^1} = \|\hat{s}_j\|_{L^0}$ for all $j$. Since $\hat{s} \in S_1$, we have $\|\hat{s}(t)\|_0 \leq \beta$ and $\|\hat{s}_j\|_{L^0} \leq \alpha_j$ for all $t$ and $j$. Thus, $\hat{s} \in S_0$. Then,
$$J_1(\hat{s}) \leq \sup_{s \in S_0} J_1(s) \leq \sup_{s \in S_1} J_1(s) = J_1(\hat{s}) \tag{20}$$
where the first relation follows from $\hat{s} \in S_0$, the second relation follows from $S_0 \subset S_1$, and the last relation follows from $\hat{s} \in S_1^*$. Hence, we have $J_1(\hat{s}) = \sup_{s \in S_0} J_1(s)$, which implies $\hat{s} \in S_0^*$. Hence, $S_1^* \subset S_0^*$ and $S_0^*$ is not empty. Next, take any $\bar{s} \in S_0^*$. Note that $\bar{s} \in S_1$, since $S_0^* \subset S_0 \subset S_1$. In addition, it follows from (20) that $J_1(\bar{s}) = J_1(\hat{s})$. Therefore, $\bar{s} \in S_1^*$, which implies $S_0^* \subset S_1^*$.

The existence of optimal solutions of Problem 3 is assumed in Theorems 1 and 2. We finally show the existence. Proof: Note that the set $S_1$ is not empty, since $\alpha_j$ and $\beta$ are positive for all $j$. Hence, we can define the optimal value $\theta \triangleq \sup_{s \in S_1} J_1(s)$. Then, there exists a sequence $\{s^{(l)}\}_{l \in \mathbb{N}} \subset S_1$ such that $\lim_{l \to \infty} J_1(s^{(l)}) = \theta$. Define $q^{(l)} \triangleq 2 s^{(l)} - 1$. Since the set $\{q \in L^\infty : \|q\|_{L^\infty} \leq 1\}$ is sequentially compact in the weak* topology of $L^\infty$ [27], there exist $q^{(\infty)}$ with $\|q^{(\infty)}\|_{L^\infty} \leq 1$ and a subsequence $\{q^{(l')}\}$ such that each component $q^{(l')}_j$ converges to $q^{(\infty)}_j$ in the weak* topology of $L^\infty$, i.e.,
$$\int_0^T q^{(l')}_j(t) \psi(t) \, dt \to \int_0^T q^{(\infty)}_j(t) \psi(t) \, dt$$
for any $\psi \in L^1$ and $j = 1, 2, \ldots, m$. Define $\bar{s}^{(\infty)} \triangleq (q^{(\infty)} + 1)/2$.

B. Log Determinant of the Gramian
We next consider Problem 2. The convex relaxation problem is described as follows.
2) Problem 4 is equivalent to Problem 2.
Proof: Note that it follows from the assumption that, for any $i \in \{1, 2, \ldots, n\}$, there exists $j \in \{1, 2, \ldots, m\}$ such that $w_i^\top b_j \neq 0$, which is observed from the proof of Lemma 2. Hence, the optimal values are finite in Problems 2 and 4. We first show the existence of an optimal solution of Problem 4. Let us denote the optimal value by $\theta$, and take a sequence $\{s^{(l)}\}$, a subsequence $\{s^{(l')}\}$, and a limit function $\bar{s}^{(\infty)}$, as in the proof of Theorem 3. Then, as seen in the proof, we have $\bar{s}^{(\infty)} \in S_1$. In addition, we have $J_2(\bar{s}^{(\infty)}) = \theta$ from the continuity of the logarithmic function. This implies the optimality of $\bar{s}^{(\infty)}$ in Problem 4.
We next show the equivalence between Problems 4 and 2. Note that, for all $i \in \{1, 2, \ldots, n\}$, the corresponding metric is equal to the final state $z_i(T)$ of a scalar system with $z_i(0) = 0$. Hence, Problem 4 can be written as an optimal control problem with terminal cost $g(\xi) \triangleq -\sum_{i=m+1}^{m+n} \log \xi_i$. By applying Pontryagin's maximum principle to this optimal control problem, we can see that any optimal solution takes only values in $\{0, 1\}$ almost everywhere, in a similar way to the proof of Theorem 1. Finally, the equivalence follows from the proof of Theorem 2.
Remark 4: The sparse control node scheduling problems, as in Problems 1 and 2, can also be formulated for discrete-time systems. However, we cannot show an equivalence result for the corresponding convex relaxations similar to Problems 3 and 4. In fact, through numerical simulations, we have confirmed that the optimal solution of the relaxed convex problem is frequently not sparse.

V. NUMERICAL EXAMPLES

A. Example 1
This section illustrates our node scheduling with numerical examples. We first consider a network model (1) consisting of two nodes. For this network, we simulated our node scheduling method with $T = 2$, $\alpha_1 = \alpha_2 = 0.8$, and $\beta = 1$. In this example, every node is a candidate control node since $B$ is the identity matrix, but the $L^0$ and $\ell^0$ constraints require us to select at most one control node at each time and to provide a control input to each node for at most 0.8 s. Note that this example satisfies the assumption in Theorems 2 and 4. Hence, Problems 1 and 2 are equivalent to Problems 3 and 4, respectively. Then, we applied CVX [29] in MATLAB, a software package for convex optimization, to Problems 3 and 4.
Figs. 2 and 3 show the resulting time series of the control node on $[0, T]$ and the corresponding reachable set. Certainly, we can see that the set of control nodes depends on the time and satisfies both the $L^0$ and $\ell^0$ constraints. Thus, we can find a small number of essential nodes at each time and an essential time interval over which to provide control inputs, based on a given controllability measure. For comparison, we also simulated the TI node selection [14], where the matrix $S$ is constant and the control node is fixed on $[0, T]$. From $\beta = 1$, one node can be activated on $[0, T]$; that is, we consider the problem (25) for each $k = 1, 2$. Note that the space sparsity is taken into account in this TI case, while the time length of activation is necessarily 2 s, which is greater than $\sum_{j=1}^2 \alpha_j$, and the time sparsity is not considered. The optimal solution $v^*$ of the problem (25) is $v^* = [0, 1]^\top$ for each $k \in \{1, 2\}$, which implies that node 2 should be activated on $[0, T]$. The minimum eigenvalue of the resulting controllability Gramian and the volume of the resulting reachable set are 0.0301 and 0.8116, respectively. In our scheduling method, these values are 0.6471 and 3.0457, which are increased by about 20.6 times and 3.8 times, respectively. The reachable sets for the TI selection are shown in Fig. 4.
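Since the example matrix $A$ is not reproduced above, the following sketch repeats the comparison in spirit with an arbitrary stand-in $A$: the time axis is discretized into slots, all feasible TV schedules are enumerated by brute force, and the resulting $\lambda_{\min}(G)$ is compared with the TI selections that fix one node on $[0, T]$:

```python
import numpy as np
from itertools import product

# Brute-force illustration of the setting of Example 1 (T = 2, beta = 1,
# alpha_1 = alpha_2 = 0.8): time is split into K slots, each activating
# node 1, node 2, or no node, with a per-node slot budget. The matrix A
# is an arbitrary stand-in, not the article's example matrix.
A = np.array([[-1.0, 0.8],
              [0.8, -1.0]])
B = np.eye(2)
T, K = 2.0, 5                      # 5 slots of 0.4 s
h, slots = T / K, 2                # 0.8 s budget = 2 slots per node

w, U = np.linalg.eigh(A)           # A is symmetric here

# Gramian contribution of each (slot, node) pair, by midpoint quadrature
q = 20
C = np.zeros((K, 2, 2, 2))
for k in range(K):
    for n in range(q):
        t = k * h + (n + 0.5) * h / q
        E = U @ np.diag(np.exp(w * (T - t))) @ U.T     # e^{A(T-t)}
        for j in range(2):
            col = E @ B[:, j]
            C[k, j] += np.outer(col, col) * (h / q)

def lam_min(sched):                # sched[k] in {0: idle, 1: node 1, 2: node 2}
    G = np.zeros((2, 2))
    for k in range(K):
        if sched[k]:
            G = G + C[k, sched[k] - 1]
    return np.linalg.eigvalsh(G)[0]

best_tv = max(lam_min(s) for s in product((0, 1, 2), repeat=K)
              if s.count(1) <= slots and s.count(2) <= slots)
best_ti = max(lam_min((j,) * K) for j in (1, 2))   # one node fixed on [0, T]
print(best_tv, best_ti)
```

As in the article, the TI baseline ignores the per-node activation budget (it uses the full 2 s), while the TV schedules respect both sparsity constraints.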

B. Example 2
We next simulate our proposed method for three kinds of networks (i.e., the Erdős–Rényi [30], Barabási–Albert [31], and Watts–Strogatz [32] models) with $n = 100$ nodes. For each model, we randomly generate 50 networks, take their adjacency matrices as $A$, and then solve Problem 3. Through the 50 simulations for each model, we compute the average of the minimum eigenvalue of the optimal controllability Gramian. In this example, we assume that the first 50 nodes are the candidates for control nodes, i.e., $m = n/2$ and $B = [I_m, 0_{m \times m}]^\top$. For the sparsity constraints, we take $\beta \in \{m/5, 2m/5, \ldots, m\}$ and $\alpha_j = T\beta/m$ for all $j$, where $T = 1$ is fixed. The parameters of each model are as follows. In the Erdős–Rényi (ER) model, we first define the $n$ nodes and then connect each pair of nodes with probability 0.3. In the Barabási–Albert (BA) model, starting from five nodes, new nodes, each with five edges, are added to the network one at a time. In the Watts–Strogatz (WS) model, after constructing a regular ring lattice with the $n$ nodes connected to four neighbors, edges are rewired with probability 0.2. Fig. 5 shows the obtained results (on a $\log_{10}$ scale), where our optimal solution is colored in red and the worst of the minimum eigenvalues obtained by the proposed method through the 50 simulations is shown in yellow. For comparison, we also simulate a TI selection method based on a greedy algorithm presented in [14] (in blue) and a TI selection method that randomly selects $\beta$ fixed nodes (in green). We note that, in this example, for $\beta = m$, the TI selections do not satisfy the sparsity constraint in the proposed optimization, since $\alpha_j < T$ for all $j$. The line in black shows the average of the minimum eigenvalue of the Gramian when all the candidate nodes are selected on $[0, T]$, i.e., $\beta = m$ and $\alpha_j = T$ for all $j$. (Hence, this line shows an upper bound on the control performance.)

We first note that, taking the upper bound shown in black into account, sufficiently large eigenvalues are obtained with relatively small $\beta$ and $\alpha_j$ by the proposed method, compared to the TI methods. Furthermore, as the yellow line indicates, our proposed method stably provides a good minimum eigenvalue. From this observation, we can expect a reduction of node activations by the proposed TV method while keeping the control performance.

We next note that our proposed method outperforms the TI methods particularly for small $\beta$ and $\alpha_j$. For example, the TI methods for the BA and WS models with $\beta/m = 0.2$ provide quite small eigenvalues. This illustrates the difficulty of making the system controllable with a fixed small number of control nodes and the advantage of the proposed TV method. On the other hand, for large $\beta$, the advantage of the TV method over the TI method tends to be small or zero. This is because, when a large number of control nodes can be selected at a time, the number of candidate control nodes that can be newly selected at the next time is necessarily small even if the TV method is applied, and the improvement becomes less pronounced. In addition, in this setting, our optimal solution can activate nodes for a total time of at most $T\beta$, and its time sparsity becomes higher than that of the TI methods, especially for large $\beta$. Qualitative investigations of the advantages of the TV method over the TI method and statistical characterizations will be addressed in future work.
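The random network models can be generated with the stated parameters; the sketch below constructs ER and WS adjacency matrices with plain NumPy (the BA model is omitted, and $n$ is reduced, for brevity):

```python
import numpy as np

# Adjacency matrices with the parameters stated above: ER with edge
# probability 0.3, and WS with a ring lattice of four neighbors and
# rewiring probability 0.2; n is reduced here for a quick illustration.
rng = np.random.default_rng(0)
n = 20

# Erdos-Renyi: connect each unordered pair independently with probability 0.3
A_er = np.triu(rng.random((n, n)) < 0.3, k=1).astype(float)
A_er = A_er + A_er.T

# Watts-Strogatz: ring lattice with each node linked to its 4 nearest
# neighbors (2 on each side), then every lattice edge rewired w.p. 0.2
A_ws = np.zeros((n, n))
for i in range(n):
    for d in (1, 2):
        j = (i + d) % n
        A_ws[i, j] = A_ws[j, i] = 1.0
for i in range(n):
    for d in (1, 2):
        j = (i + d) % n
        if A_ws[i, j] == 1.0 and rng.random() < 0.2:
            free = [k for k in range(n) if k != i and A_ws[i, k] == 0.0]
            if free:
                k = int(rng.choice(free))
                A_ws[i, j] = A_ws[j, i] = 0.0
                A_ws[i, k] = A_ws[k, i] = 1.0
print(int(A_er.sum()) // 2, int(A_ws.sum()) // 2)   # undirected edge counts
```

The rewiring step removes one edge and adds one, so the WS edge count stays at $2n$; both matrices are symmetric with zero diagonal, as required for an undirected adjacency matrix $A$.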

VI. CONCLUSION
This article analyzed two node scheduling problems that are related to the minimum eigenvalue of the controllability Gramian and the volume of the reachable set. This framework enables us to find an activation schedule of control inputs that steers the system while saving energy. Taking the number of control nodes and the time length of activation into account, our optimization problem newly includes two types of sparsity constraints. We showed a sufficient condition under which our sparse optimization problems can be solved by convex optimization. With numerical examples, we illustrated the advantage of the proposed method over the conventional TI method. In this article, we introduced the approximated controllability metrics $J_1$ and $J_2$ to mathematically prove the equivalence of the relaxed problems. From an optimization perspective, the maximization of the exact cost function $\log\det(G)$, which is concave in the decision variable, is tractable once the $L^0$/$\ell^0$ constraints are replaced by $L^1$/$\ell^1$ ones (see also Remark 4). Theoretical analysis of these relaxations is currently under investigation. In another direction, while this article assumes that the network topology among the nodes is given and fixed, the design of a TV topology based on the multiple sparsity constraints would be included in future work.

APPENDIX A
PROOF OF PROPOSITION 1
The relation (4) follows by direct computation. In addition, it follows from the Schwarz inequality and $\|u\|_{L^2} \leq 1$ that the corresponding bound holds. There exists a set of real numbers $\{c_1, \ldots, c_{i-1}, c_{i+1}, \ldots, c_n\}$ such that the expansion in the basis $\{v_k\}_{k \neq i}$ holds. On the other hand, take any $x \in \mathbb{R}^n$ such that $w_i^\top x = d_i$. Since $\{v_1, v_2, \ldots, v_n\}$ is a basis, there uniquely exists a set of real numbers $\{c_1, c_2, \ldots, c_n\}$ such that $x = \sum_{k=1}^n c_k v_k$. Therefore, the hyperplane $H_i^+$ is given by $H_i^+ = \{x \in \mathbb{R}^n : w_i^\top x = d_i\}$. Let us denote by $x^{(i)}$ the state at time $T$ corresponding to the control $u^{(i)}(t) \triangleq d_i^{-1} e^{\lambda_i (T-t)} S(t) B^\top w_i$. Then, we have $x^{(i)} \in H_i^+ \cap \mathcal{R}$ from (4) and statement 2). It also follows from (4) and statement 2) that $w_i^\top x \leq d_i$ for any $x \in \mathcal{R}$. Thus, $H_i^+$ is a supporting hyperplane to the convex set $\mathcal{R}$ at $x^{(i)} \in \mathcal{R}$. Since the convex function $x^\top G^{-1} x : \mathbb{R}^n \to \mathbb{R}$ is differentiable, such a supporting hyperplane is unique [33] and gives the tangent hyperplane [34].

APPENDIX C
PROOF OF LEMMA 2
Fix any vector $b \in \mathbb{R}^n$. Then, there exists a unique vector $\rho = [\rho_1, \rho_2, \ldots, \rho_n]^\top \in \mathbb{R}^n$ such that $b = V\rho = \sum_{k=1}^n \rho_k v_k$, since $\{v_1, v_2, \ldots, v_n\}$ is a basis. Similarly, for each $j$, there exists a unique vector $\rho^{(j)} = [\rho^{(j)}_1, \ldots, \rho^{(j)}_n]^\top \in \mathbb{R}^n$ such that $b_j = V \rho^{(j)}$. It follows from the observation above that if the system $(A, b_j)$ is controllable, then $\lambda_k \neq \lambda_l$ for all $k, l$ with $k \neq l$.
Note that $w_k^\top b_j = \rho^{(j)}_k$.