Size Constrained Clustering With MILP Formulation

Clustering is one of the essential tools for data mining since it reveals the natural structures of unlabeled data. Many clustering algorithms have been proposed in the last decades. However, few of them are designed to exploit prior knowledge that is available in many real applications, such as the sizes of clusters. In this paper, we propose a novel iterative clustering algorithm that can impose constraints on the sizes of clusters. Given an unordered set of cluster size constraints, the proposed method minimizes the mean squared error (MSE) while satisfying the size constraints. Each iteration of the proposed method consists of two steps, namely an assignment step and an update step. In the assignment step, the observations are assigned to clusters under the size constraints. The assignment task is modeled as an integer linear programming (ILP) problem. We prove that part of the constraint matrix of this ILP problem is totally unimodular. Therefore, the integer constraints on most of the variables can be omitted, so that the problem becomes a mixed integer linear programming (MILP) problem, which is much easier to solve. In the update step, the cluster centroids are updated as the centers of the observations in the corresponding clusters. Experiments on UCI data sets indicate that (1) imposing the size constraints as proposed could improve the clustering performance; (2) compared with the state-of-the-art size constrained clustering methods, the proposed method could efficiently derive better clustering results.


I. INTRODUCTION
Clustering is one of the most fundamental unsupervised learning methods that has been employed in many disciplines [1]- [4]. Many clustering algorithms have been proposed [5], such as k-means [6], spectral clustering [7], hierarchical clustering [8], fuzzy c-means [9], clustering ensemble methods [10]- [12], etc. The algorithms are intended to partition observations into k homogeneous and well-separated clusters so that observations in a cluster are similar to one another, yet dissimilar to observations in other clusters. Although traditional clustering algorithms have achieved decent performance in wide applications, the solution naturally found from a set of data by using a fully unsupervised clustering algorithm may not always be close to the one that users seek.
The associate editor coordinating the review of this manuscript and approving it for publication was Abdullah Iliyasu .
Fortunately, in many real applications such as gene clustering [13], [14], face clustering in videos [15], the facility location problem [16], the automatic lane detection problem [17] and the customer segmentation problem [18], there exists some background knowledge about the data which can be obtained beforehand. Such background knowledge usually takes the form of user-specified constraints, which can be classified into two types [19], namely cluster-level constraints [16], [18], specifying requirements on the clusters, and instance-level constraints [13]-[15], [17], specifying requirements on pairs of observations. In the last decades, many studies have been done in the field of constrained clustering [20]. However, most of them focus on the instance-level constraints, and little attention has been paid to the cluster-level constraints. In this paper, our focus is on the cluster-level constraints, specifically, cluster size constraints.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
One of the motivations for introducing the size constraints into clustering is to improve the clustering results. As shown in [21]-[24], introducing the size constraints could prevent the formation of tiny or even empty clusters. Moreover, the studies in [25]-[27] indicate that imposing the actual sizes of the clusters as constraints could improve the clustering performance. The size constraints driven by the need to improve results are mostly soft, i.e., the size constraints do not have to be strictly satisfied. Another motivation for the research of size constrained clustering is the application requirements. For example, [28] proposes to impose upper bounds on the sizes of clusters so as to maximize the lifetime of the wireless sensor network. In the article clustering problem [29], the articles are clustered under the equality constraints that there are a specific number of articles in each session. The authors in [27] believe that the size constrained clustering in equality form could be applied to job scheduling, where the jobs are assigned to machines with different capacities. [30] claims that the task of resource allocation can be modeled as the size constrained clustering problem, where the sizes of clusters equal the fixed resource capacities. More examples can be found in image searching [31] and customer segmentation [18], where balanced size constraints are imposed on the clustering tasks. The size constraints driven by the application requirements are mostly hard, i.e., the size constraints have to be strictly met.
A common strategy in the field of clustering is to choose k centroids and then minimize the averaged squared distance between the observations and the corresponding cluster centroids. This quantity is the mean squared error (MSE) [32], which is one of the most popular cost functions used in clustering [24], [33]-[35]. By optimizing the MSE, similar observations are put into the same cluster, while dissimilar observations are placed in different clusters. Furthermore, the MSE can be optimized by a mature iterative solution (k-means) that converges rapidly. In this paper, we adopt a similar iterative strategy in our size constrained clustering method. The high efficiency of the iterative strategy makes it possible for us to evaluate the proposed method on large data sets.
In this paper, we propose a novel clustering algorithm that optimizes the MSE under hard size constraints. Given an unordered set of cluster size constraints as prior knowledge, the proposed method minimizes the MSE while simultaneously ensuring that each cluster chooses the optimum size. The proposed algorithm runs in an iterative manner. There are two steps in each iteration, namely an assignment step and an update step. In the assignment step, the assignments between the observations and the clusters are established. The assignment task is formulated as an integer linear programming (ILP) problem [36]. There are two types of variables in the constraints of this ILP problem, which we define as the observation partition decision variables (OPDVs) and the cluster size decision variables (CSDVs). We prove that the integer constraints on the OPDVs can be directly removed. The ILP problem is then simplified to a mixed integer linear programming (MILP) problem [37]. In the update step, the observations in each cluster are averaged to derive the corresponding cluster center. We have conducted experiments to evaluate the proposed method. The data sets involved in our experiments are taken from the UCI machine learning repository [38]. Various external validity indices including the Entropy (ENT) [39], Accuracy (ACC) [40], Fowlkes and Mallows Index (FMI) [41] and Jaccard Index (JCI) [42] are explored. In addition, we evaluate the methods regarding objective function values and efficiency.
The results indicate that the proposed method could efficiently leverage the size constraints to improve the clustering performance.
The rest of this paper is organized as follows. The literature review is provided in Section II. Section III covers the details about our proposed size constrained clustering algorithm. In Section IV, the experimental settings and results are given. The discussion is presented in Section V. Finally, Section VI concludes the paper and presents possible directions for further investigation.

II. RELATED WORK
In this section, we briefly review some of the methods that have been proposed for clustering with size constraints. Typically, the size constraints can be roughly separated into two categories [43], namely soft size constraints and hard size constraints. Soft size constraints are usually added to the objective functions as regularization terms, so they may not be strictly satisfied. On the other hand, hard size constraints are conditions that must be met.

A. SOFT SIZE CONSTRAINED CLUSTERING
The soft size constrained clustering methods are more often used to improve the clustering results. For example, the ratio cut [22] and normalized cut [21] introduce direct and indirect equipartition constraints to the objective function of min-cut [44] to prevent the formation of tiny or even empty clusters. [31], [45] present frequency sensitive competitive learning (FSCL) with a multiplicative or additive bias to penalize large clusters so that large clusters are less likely to win observations. [46] proposes a scalable framework to keep the balance of clusters, which applies to a wide range of clustering algorithms. The method first performs clustering on the downsampled data set and then populates the clusters with the remaining data. It is reported that the method is very efficient in practice, with complexity $O(kn \log n)$. In [47], an extension of the k-means algorithm is presented. It introduces the size constraints by adding three penalty terms, namely the overall size divergence cost, the oversize cost, and the undersize cost. Although this approach achieves a certain improvement, it has many parameters that need to be set, and there is no guidance for setting them. [25] presents a size regularized cut (SRcut) that exploits the sizes of the clusters as prior knowledge to guide the clustering process. As a result, the method improves the clustering performance over traditional methods. [26] proposes to regularize the size constraints with submodular functions. An algorithm based on submodular optimization techniques is presented to solve the size constrained clustering problem. In [23], the authors exploit the exclusive lasso to exert balanced size constraints, and they apply the idea to the min-cut and k-means algorithms. Their experiments indicate improved results. Although there has been significant progress in the soft size constrained clustering algorithms, they may not cater to a large number of real applications since they only treat constraints as guidance rather than requirements that must be met. In this paper, we mainly focus on clustering with hard size constraints.

B. HARD SIZE CONSTRAINED CLUSTERING
Despite the strong demand from applications, few studies have been done in the field of hard size constrained clustering. [33] proposes an iterative method for clustering with hard balanced size constraints. The method transforms the k-means assignment step into a balanced assignment problem that can be solved by the Hungarian algorithm [48]. It works well when the number of observations n is divisible by the number of clusters k. However, when n is not divisible by k, it has trouble deciding the optimal size for each cluster. [49] imposes size constraints on an adapted neural gas algorithm. The method ensures that the constraints on the cluster sizes are satisfied. However, its greedy strategy cannot guarantee the optimality of the clustering result. A variant of the fuzzy c-means (FCM) clustering algorithm is proposed in [30]. The size constraints are integrated into the objective function via Lagrange multipliers. The experimental results show that it outperforms traditional clustering algorithms. The authors in [50] propose a deterministic clustering approach based on the deterministic annealing (DA) algorithm to address the capacitated resource allocation problem with several forms of size constraints. [27] proposes a method for clustering with hard size constraints. A clustering algorithm that ignores the size constraints is applied to get the initial partition, and the final result is derived by finding a constraint-satisfying partition that maximizes its agreement with the initial partition. The method is very efficient in practice. However, it fails to consider the similarity between the observations during the reassigning process. [29] presents a hard size constrained clustering method based on ILP. The method works if we assume that the correspondences between the initial centers and the cluster sizes are known. Nevertheless, this assumption is rarely practical in real applications.
There are also studies which put hard lower bounds [24], [51] and upper bounds [52] on the sizes of clusters. In Section V, we show that the proposed method could also be adapted to accommodate such inequality constraints.

III. SIZE CONSTRAINED CLUSTERING

A. NOTATIONS
The key notations involved in our method are shown in Table 1.
where $\|o_i - c_j\|^2$ denotes the squared Euclidean distance between the i-th observation and the j-th cluster center. Notice that the size of the j-th cluster $f_j$ does not have to be the j-th size constraint $s_j$, because at the beginning of the clustering process the correspondences between the cluster sizes and the size constraints are unknown: we cannot specify in advance which cluster should contain $s_j$ observations. The number of possible correspondences between the cluster sizes and the size constraints is $k!$, and we want the algorithm to automatically choose the optimum one from the $k!$ possible correspondences.
Let p be the partition matrix of size n × k, where each row of p represents an observation, and each column represents a cluster. $p_{i,j} = 1$ indicates that observation $o_i$ belongs to cluster j, while $p_{i,j} = 0$ means otherwise. Each row of p sums to 1 because each observation can only be assigned to one cluster, i.e., $\sum_{j=1}^{k} p_{i,j} = 1$, $i = 1, 2, \ldots, n$. On the other hand, each column of p sums to the size of the corresponding cluster $f_j$. Thus we reformulate the problem as:
Let q be an auxiliary matrix of size k × k, where $q_{j,l} = 1$ indicates that the j-th cluster chooses the l-th size constraint, and $q_{j,l} = 0$ means otherwise. Each row and each column of q sums to 1, as the correspondences between cluster sizes and size constraints are one-to-one. The problem can be further reformulated as:
To simplify the description in Section III-D, we define three kinds of variables for the above problem: the cluster centers c, the variables in the partition matrix p, which we call observation partition decision variables (OPDVs), and the variables in the auxiliary matrix q, which we call cluster size decision variables (CSDVs). The OPDVs indicate the relations between the observations and the clusters. The CSDVs have no explicit impact on the objective function E. They indicate the correspondences between the cluster sizes and the size constraints.
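The displayed formulation referred to as Equation (3) can be pieced together from the constraints quoted in the proof of Theorem 1 and from the row and column properties of p and q stated above. The following is a hedged reconstruction rather than the authors' exact typesetting; in particular, any constant normalization of E (for example a factor $1/n$) is omitted here because it does not change the minimizer.

```latex
\begin{aligned}
\min_{c,\,p,\,q}\quad & E = \sum_{i=1}^{n}\sum_{j=1}^{k} p_{i,j}\,\lVert o_i - c_j\rVert^{2} \\
\text{s.t.}\quad & \sum_{j=1}^{k} p_{i,j} = 1, && i = 1,2,\ldots,n, \\
& \sum_{i=1}^{n} p_{i,j} = \sum_{l=1}^{k} q_{j,l}\, s_l, && j = 1,2,\ldots,k, \\
& \sum_{l=1}^{k} q_{j,l} = 1, && j = 1,2,\ldots,k, \\
& \sum_{j=1}^{k} q_{j,l} = 1, && l = 1,2,\ldots,k, \\
& p_{i,j} \in \{0,1\},\quad q_{j,l} \in \{0,1\}.
\end{aligned}
```

The second constraint set is where the CSDVs enter: the right-hand side selects exactly one size constraint $s_l$ for cluster j, so the algorithm, rather than the user, decides which cluster receives which size.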
It should be noted that once the sizes of k−1 clusters are set, the sizes of all k clusters are determined, so one of the cluster size constraints in Equation (3) is redundant. However, we have found that the redundant size constraint has no significant impact on the results. For ease of description and comprehension, we keep the redundant constraint for the rest of this paper.

C. PROPOSED SOLUTION

1) ASSIGNMENT STEP
In the assignment step, we solve Equation (3) with respect to p while holding c fixed, i.e., we assign the observations to the cluster centers so as to optimize the MSE under the given set of size constraints. The problem here is an ILP problem, and the integer constraints on the decision variables make it difficult to solve. A typical solution to the ILP problem is to repeatedly solve the LP relaxations with the simplex algorithm in a branch-and-bound way [53]. In the worst case, the simplex algorithm needs to be executed an exponential number of times, which is computationally expensive.
If the constraint matrix of the ILP is totally unimodular, then the integer constraints can be removed [54]. In our case, the constraint matrix on all the decision variables is not totally unimodular. However, as indicated by Theorem 1 in Section III-D, the constraint matrix on the OPDVs is totally unimodular, so most of the integer constraints can be removed from our ILP problem. Specifically, the integer constraints on the n × k OPDVs can be removed, which leaves us only the integer constraints on the k × k CSDVs. The ILP problem would then become an MILP problem that can be solved in much less running time.
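To make the assignment step concrete, the sketch below solves a tiny instance by exhaustive search instead of an MILP solver. This is a deliberate substitution for illustration only: it enumerates every labeling whose multiset of cluster sizes matches the given size constraints and keeps the cheapest one, which is the same feasible set and objective as the assignment problem described above, but it scales exponentially and is not the authors' method. The function names and the toy data are hypothetical.

```python
from itertools import product


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_with_sizes(points, centers, sizes):
    """Brute-force reference for the assignment step (tiny inputs only).

    Returns (labels, cost) minimizing the total squared distance subject to
    the multiset of cluster sizes being equal to `sizes`. Which cluster
    receives which size is left free, as with the CSDVs in the paper.
    """
    k = len(centers)
    target = sorted(sizes)
    best_labels, best_cost = None, float("inf")
    for labels in product(range(k), repeat=len(points)):
        counts = sorted(labels.count(j) for j in range(k))
        if counts != target:
            continue  # violates the size constraints
        cost = sum(sq_dist(p, centers[j]) for p, j in zip(points, labels))
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


# Hypothetical toy data: three points near (0, 0), two near (5, 5).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
centers = [(5.0, 5.0), (0.0, 0.0)]

labels, cost = assign_with_sizes(points, centers, sizes=[3, 2])
print(labels, cost)  # (1, 1, 1, 0, 0) 3.0
```

Note that the size 3 is listed first but ends up attached to the second center: the search, like the MILP with its CSDVs, picks the correspondence between sizes and clusters that gives the lowest cost.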

2) UPDATE STEP
In the update step, we solve Equation (3) with respect to c while holding p fixed; that is, once the observations are assigned, the centroids are updated so that the MSE is minimized. Since p is fixed, the constraints of Equation (3) no longer involve any free variable, and the problem reduces to an unconstrained minimization of E over c.
The optimal E is achieved when each centroid is the mean of the observations assigned to its cluster:
$$c_j = \frac{\sum_{i=1}^{n} p_{i,j}\, o_i}{\sum_{i=1}^{n} p_{i,j}}, \quad j = 1, 2, \ldots, k. \tag{5}$$
With the description above, the proposed size constrained clustering can be detailed as Algorithm 1.
Algorithm 1
    …
    repeat
        Assignment step: solve the MILP problem stated in Section III-C.1 for $p^{(i)}$ and $q^{(i)}$;
        Update step: update the centroids $c^{(i)}$ according to Equation (5);
    until the centroids no longer change
    return $p^{(i)}$, $q^{(i)}$, $c^{(i)}$

The proposed algorithm is guaranteed to converge, but not always to the global optimum due to the non-convexity of the objective function. In each iteration, the value of the objective function does not increase. Let $p^{(t)}$ and $c^{(t)}$ be the OPDVs and centroids at the end of the t-th iteration, respectively. In the assignment step of the (t + 1)-th iteration, we optimize Equation (3) with respect to p while holding c fixed, thus we have $E(c^{(t)}, p^{(t+1)}) \le E(c^{(t)}, p^{(t)})$. In the update step, we optimize Equation (3) with respect to c while holding p fixed, thus $E(c^{(t+1)}, p^{(t+1)}) \le E(c^{(t)}, p^{(t+1)})$. Combining the two inequalities gives $E(c^{(t+1)}, p^{(t+1)}) \le E(c^{(t)}, p^{(t)})$; since E is bounded below by 0, the sequence of objective values converges.
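The second inequality of the convergence argument can be checked numerically on a small example. The sketch below is an illustration under assumptions, not the authors' MATLAB code: it recomputes the centroids as cluster means, as the update step prescribes, and compares the objective before and after. The helper names and the toy data are hypothetical, and every cluster is assumed to be non-empty, which holds whenever all size constraints are positive.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def objective(points, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    return sum(sq_dist(p, centers[j]) for p, j in zip(points, labels))


def update_centroids(points, labels, k):
    """Update step: each centroid becomes the mean of its assigned points."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, j in zip(points, labels):
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    # Assumes counts[j] > 0 for every cluster j.
    return [tuple(s / counts[j] for s in sums[j]) for j in range(k)]


# Hypothetical toy data with a fixed assignment (cluster sizes 2 and 3).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
labels = (1, 1, 1, 0, 0)
old_centers = [(5.0, 5.0), (0.0, 0.0)]

new_centers = update_centroids(points, labels, k=2)
e_old = objective(points, labels, old_centers)
e_new = objective(points, labels, new_centers)
print(new_centers[0], e_old, e_new <= e_old)  # (5.5, 5.0) 3.0 True
```

Because the mean minimizes the sum of squared distances within a cluster, the objective after the update can never exceed its value before the update for the same assignment.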

D. EFFICIENCY ANALYSIS
Typically, an ILP problem in standard form is specified by a linear objective, a constraint matrix M, a right-hand-side vector b, and integer decision variables. An interesting property of the ILP problem is that if the constraint matrix M is totally unimodular (a matrix is totally unimodular if and only if the determinant of every square submatrix is 0, 1, or −1) and the vector b is integral, the integer constraints can be removed, so that the ILP problem can be relaxed to an LP problem that still guarantees an integral solution [56]. Although the constraint matrix in our case is not totally unimodular, it is special in that its submatrix on the OPDVs is totally unimodular according to Theorem 1 (the proof can be found below). Therefore, we can remove the integer constraints on the OPDVs, which greatly reduces the complexity of branch-and-bound. Before the proof of Theorem 1, we first introduce Lemma 1 and Lemma 2. The two lemmas can also be found in [54].
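For reference, one common way to write an ILP in standard form is shown below. Only the constraint matrix M and the vector b are named in the text; the objective vector w, the variable vector x, the equality form of the constraints, and the dimensions m and d are assumptions made here for illustration.

```latex
\begin{aligned}
\min_{x}\quad & w^{\top} x \\
\text{s.t.}\quad & M x = b, \\
& x \ge 0, \\
& x \in \mathbb{Z}^{d},
\end{aligned}
\qquad M \in \mathbb{Z}^{m \times d},\; b \in \mathbb{Z}^{m}.
```

When M is totally unimodular and b is integral, every vertex of the polyhedron $\{x : Mx = b,\ x \ge 0\}$ is integral, which is why dropping $x \in \mathbb{Z}^{d}$ and solving the LP with a vertex-returning method such as simplex still yields an integral optimum.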
Lemma 1: The constraint matrix M remains totally unimodular after multiplying a column (row) by −1.
For example, if a matrix is totally unimodular and we multiply its first row by −1, then according to Lemma 1 the resulting matrix is still totally unimodular.
Lemma 2: The constraint matrix M is totally unimodular if it has at most two non-zero entries, each being ±1, in each column (row), and, for every column (row) with two non-zero entries, the sum of the column (row) is 0.
For instance, a matrix in which every column contains exactly one +1 and one −1 is totally unimodular according to Lemma 2, since each column contains only two non-zero entries, both ±1, and the sum of each column equals 0.
Theorem 1: The constraint matrix on the OPDVs in the ILP problem described in Section III-C.1 is totally unimodular.
Proof: Putting all the OPDVs in a vector y, we have $y = (p_{1,1}, p_{1,2}, \cdots, p_{n,k})$. The constraint matrix on the OPDVs can then be derived as
$$A = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix},$$
where $A_1$ consists of the coefficients of the first constraint set $\sum_{i=1}^{n} p_{i,j} = \sum_{l=1}^{k} q_{j,l} s_l$, $j = 1, 2, \ldots, k$ (whose right-hand side plays the role of b), and $A_2$ contains the coefficients of the second constraint set $\sum_{j=1}^{k} p_{i,j} = 1$, $i = 1, 2, \ldots, n$.
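A concrete form for $A_1$ and $A_2$ is as follows:
$$A_1 = \begin{bmatrix} I_{k\times k} & I_{k\times k} & \cdots & I_{k\times k} \end{bmatrix}, \qquad A_2 = \begin{bmatrix} \mathbf{1}_{1\times k} & & \\ & \ddots & \\ & & \mathbf{1}_{1\times k} \end{bmatrix},$$
where $I_{k\times k}$ is the identity matrix of size k × k, repeated n times in $A_1$, and $\mathbf{1}_{1\times k}$ is a row vector of k ones, repeated n times along the block diagonal of $A_2$. Multiplying every row in $A_2$ by −1 yields
$$\tilde{A} = \begin{bmatrix} A_1 \\ -A_2 \end{bmatrix}.$$
According to Lemma 1, if $\tilde{A}$ is totally unimodular, then so is A. According to Lemma 2, $\tilde{A}$ is totally unimodular, as every column has exactly two non-zero elements, one +1 from $A_1$ and one −1 from $-A_2$, and the sum of each column equals 0.
Therefore, the constraint matrix on the OPDVs is totally unimodular.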
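Theorem 1 can be sanity-checked on a small instance by applying the definition of total unimodularity directly. The sketch below builds the constraint matrix on the OPDVs for hypothetical values n = 3 and k = 2, with columns ordered as $p_{1,1}, p_{1,2}, \ldots, p_{n,k}$, and tests every square submatrix. It is a brute-force check that is only feasible for very small matrices and is no substitute for the proof; the function names are illustrative.

```python
from itertools import combinations


def build_opdv_matrix(n, k):
    """Constraint matrix on the OPDVs.

    Rows: k cluster-size constraints (A1), then n one-cluster-per-observation
    constraints (A2). Column i*k + j corresponds to p_{i,j} (0-based).
    """
    a1 = [[1 if col % k == j else 0 for col in range(n * k)] for j in range(k)]
    a2 = [[1 if col // k == i else 0 for col in range(n * k)] for i in range(n)]
    return a1 + a2


def det(m):
    """Exact integer determinant by Laplace expansion (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for c, entry in enumerate(m[0]):
        if entry:
            minor = [row[:c] + row[c + 1:] for row in m[1:]]
            total += (-1) ** c * entry * det(minor)
    return total


def is_totally_unimodular(m):
    """True if every square submatrix has determinant in {-1, 0, 1}."""
    rows, cols = len(m), len(m[0])
    for size in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), size):
            for c in combinations(range(cols), size):
                sub = [[m[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


A = build_opdv_matrix(n=3, k=2)
print(is_totally_unimodular(A))  # True

# A matrix that is not totally unimodular (its determinant is 2).
odd_cycle = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
print(is_totally_unimodular(odd_cycle))  # False
```

The matrix A is the incidence matrix of a bipartite graph between clusters and observations, which is a classical family of totally unimodular matrices; the second matrix corresponds to an odd cycle and fails the test.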
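Due to the total unimodularity of the constraint matrix on the OPDVs in the ILP problem described at the beginning of Section III-C.1, we only need to keep the integer constraints on the CSDVs, which leads to the MILP problem described in Section III-C.1. Indeed, once the CSDVs take integer values, the right-hand sides $\sum_{l=1}^{k} q_{j,l} s_l$ are integers, and the remaining problem in the OPDVs is an LP with a totally unimodular constraint matrix and an integral right-hand side. Therefore, the MILP problem is equivalent to the ILP problem: solving the MILP problem still results in integral solutions on both OPDVs and CSDVs.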
The ILP and MILP problems can be addressed by repeatedly solving the LP relaxations with the simplex algorithm in a branch-and-bound way. In the worst case, it needs to solve $O(2^{\alpha})$ LP problems, where α is the number of integer constraints. Accordingly, to address the ILP problem described at the beginning of Section III-C.1, we need to solve $O(2^{n \times k + k^2})$ LP problems, where n is the number of observations and k is the number of clusters. According to Theorem 1, we can remove the integer constraints on the n × k OPDVs, so that we only need to solve $O(2^{k^2})$ LP problems. When $n \gg k$, this strategy greatly decreases the total time complexity.
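A short calculation shows how large the gap between the two worst-case bounds is. The values n = 150 and k = 3 below are hypothetical and chosen only for illustration; they are not taken from the data sets in Table 2.

```python
def integer_variable_counts(n, k):
    """Number of integer-constrained variables in the ILP and in the MILP."""
    ilp = n * k + k * k   # OPDVs plus CSDVs
    milp = k * k          # CSDVs only
    return ilp, milp


ilp, milp = integer_variable_counts(n=150, k=3)
print(ilp, milp)          # 459 9
print(2 ** milp)          # 512 LP problems in the worst case for the MILP
print(ilp - milp)         # 450: the ILP bound is 2**450 times larger
```

The MILP bound depends only on k, so it stays fixed as the number of observations grows, whereas the exponent of the ILP bound grows linearly with n.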

A. EXPERIMENTAL SETTINGS
In this section, we present the experiments conducted on UCI machine learning data sets to evaluate the performance of the proposed algorithm. Table 2 shows the details of the data sets that are used in our experiments.
We integrate the size constraints into the k-means algorithm as described in Section III. We compare the proposed algorithm with the k-means algorithm and other size constrained k-means algorithms which include the algorithm presented in [27] and the algorithm in [33]. For simplicity, we refer to the k-means algorithm as KM, the algorithm in [27] as SCK1, the algorithm in [33] as SCK2, and the proposed algorithm as MILP-KM. To further study the performance of the proposed algorithm, all the k-means based algorithms mentioned above are adapted to the normalized cut based algorithms as the normalized cut clustering algorithm can be seen as applying the k-means algorithm on the data set with reduced dimensions. The normalized graph Laplacian matrix is implemented as a fully connected graph according to the research in [21]. For simplicity, we refer to the normalized cut algorithm as NC, the algorithm adapted from [27] as SCN1, the algorithm adapted from [33] as SCN2, and the proposed algorithm as MILP-NC.
For the evaluation criteria, we consider four external indexes, including the Clustering Accuracy (ACC) [40], Entropy (ENT) [39], Fowlkes and Mallows Index (FMI) [41] and Jaccard Index (JCI) [42]. Apart from these measures, we report the MSE for the algorithms based on k-means, the NCUT for the algorithms based on the normalized cut, and the running time (measured in seconds). All these measures are recorded and averaged over 10 runs. In addition, statistical tests are applied to further validate the results on the MSE, NCUT, and running time. For all the data sets except EMPGA2, we apply the Games-Howell test [57], as there are more than two methods to compare. For EMPGA2, neither SCK1 nor SCK2 can produce a result within an acceptable time, which leaves us with only two methods to compare, so we use Student's t-test [57].
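Two of the external indexes, FMI and JCI, are commonly defined by counting pairs of observations. The sketch below uses those standard pair-counting definitions, which are assumed here to match the ones in [41] and [42]; the function names and the toy labels are hypothetical. A pair counts as TP if it is grouped together in both the ground truth and the prediction, FP if only in the prediction, and FN if only in the ground truth.

```python
from itertools import combinations
from math import sqrt


def pair_counts(truth, pred):
    """Count pairs that are co-clustered in both, only in pred, only in truth."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_truth and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_truth:
            fn += 1
    return tp, fp, fn


def fowlkes_mallows(truth, pred):
    """FMI = TP / sqrt((TP + FP) * (TP + FN)); assumes a non-zero denominator."""
    tp, fp, fn = pair_counts(truth, pred)
    return tp / sqrt((tp + fp) * (tp + fn))


def jaccard(truth, pred):
    """JCI = TP / (TP + FP + FN); assumes a non-zero denominator."""
    tp, fp, fn = pair_counts(truth, pred)
    return tp / (tp + fp + fn)


# Hypothetical labels: both partitions have cluster sizes {3, 2}.
truth = [0, 0, 0, 1, 1]
pred = [1, 1, 0, 0, 0]

print(pair_counts(truth, pred))      # (2, 2, 2)
print(fowlkes_mallows(truth, pred))  # 0.5
print(jaccard(truth, pred))
```

Both indexes are invariant to how the clusters are numbered, which matters here because the cluster labels produced by a clustering algorithm carry no intrinsic order.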
All the algorithms involved in this paper are implemented in MATLAB and run on an Intel i7-7700HQ 2.8 GHz processor with 16 GB memory. We use the built-in MILP solver of MATLAB with default parameters except for "RootLPMaxIterations", "LPMaxIterations", and "MaxTime", which are set to 1000000, 100000, and 18000, respectively. In addition, all clustering algorithms obtain the initial centroids with the k-means++ algorithm [55]. They share the same stopping criterion $\|c^{(t+1)} - c^{(t)}\|_2 < 0.0001$, i.e., the centroids barely change between adjacent iterations. The code and data sets can be downloaded from https://github.com/IGGIUJS/SizeConstrainedClustering.
Notice that we do not conduct the experiments on EMPGA1 and EMPGA2 for the normalized cut based algorithms, because these methods need to construct a normalized graph Laplacian matrix that far exceeds the memory capacity of our experimental equipment (16 GB RAM).

B. CONVERGENCE
The proposed method is guaranteed to converge according to the analysis in Section III-C.2. The process of convergence can be found in Fig. 1. Since the MSE varies greatly across the data sets, we normalize it to the range [0, 1]. We can observe that our algorithm converges on all eight data sets involved. For EMPGA1 and EMPGA2, the proposed algorithm takes dozens of iterations, while for the other data sets it converges in fewer than ten iterations.

C. COMPARISON AMONG K-MEANS BASED ALGORITHMS
In this section, we report the performance comparison among KM, SCK1, SCK2 and MILP-KM in terms of the ENT, ACC, FMI, JCI, MSE and running time.
It can be observed from Table 3 that mostly KM achieves the best MSE and efficiency. It is natural because KM optimizes the MSE without any constraints. As far as the size constrained k-means algorithms are concerned, MILP-KM outperforms SCK1 and SCK2 in the resultant MSE. The statistical test results shown in Table 4 indicate that the differences on the MSE are mostly significant (p <= 0.05). In addition, MILP-KM is significantly faster than SCK2 on most of the data sets. Although MILP-KM takes more time than SCK1 on small data sets, it outputs results with far better MSE. Moreover, for large data sets, such as EMPGA1 and EMPGA2, MILP-KM is even faster than SCK1. Especially for EMPGA2, SCK1 is no longer able to produce a result within an acceptable time.
From the external indexes shown in Fig. 2, we can see that MILP-KM mostly outperforms KM, SCK1, and SCK2. This indicates that incorporating the size constraints as proposed can improve the performance of the k-means algorithm on the external indexes more effectively than the competing methods. Notice that the size constrained method SCK1 performs even worse than KM. This is because SCK1 is a heuristic method with strong randomness when adjusting the results given by an initial clustering to satisfy the size constraints.
In most cases, MILP-KM outperforms SCK2 on the four external indexes. However, there are cases where SCK2 performs better, such as on Wine Quality Red (Fig. 2). We attribute this to over-fitting: as Table 3 shows, the MSE of MILP-KM is lower than that of SCK2, yet SCK2 performs better than MILP-KM on the four external indexes.

D. COMPARISON AMONG NORMALIZED CUT BASED ALGORITHMS
In this section, we report the performance comparison among NC, SCN1, SCN2 and MILP-NC in terms of the ENT, ACC, FMI, JCI, NCUT, and running time.
It can be observed from Table 5 that among the size constrained normalized cut algorithms, MILP-NC mostly outputs results with the best NCUT. The Games-Howell test results shown in Table 6 indicate that the differences in the NCUT are significant (p ≤ 0.05). In addition, MILP-NC runs significantly faster than SCN2 on all the data sets. Although MILP-NC is less efficient than SCN1, it is significantly more accurate.
From the result shown in Fig. 3, we can see that MILP-NC outperforms SCN1, SCN2, and NC on the four external indexes. Thus, the proposed method could adapt the size constraints to better improve the clustering performance on the external indexes for the normalized cut algorithm.

V. DISCUSSION
This paper tackles the problem of incorporating equality constraints in the clustering task, i.e., the sizes of the clusters equal a set of constraints. The proposed method could be extended to a general framework that adapts to any kind of size constraint. The generalized framework requires the user-specified lower bounds $\underline{s} = \{\underline{s}_1, \underline{s}_2, \ldots, \underline{s}_k\}$ and upper bounds $\overline{s} = \{\overline{s}_1, \overline{s}_2, \ldots, \overline{s}_k\}$ on the sizes of the clusters. If $0 < \underline{s}_j \le n$ and $\overline{s}_j \ge n$, then there is only a lower bound constraint on the size of the j-th cluster (the upper bound constraint does not affect the results). If $\underline{s}_j \le 0$ and $0 \le \overline{s}_j < n$, then there is only an upper bound constraint. If $\underline{s}_j \le 0$ and $\overline{s}_j \ge n$, then there is no constraint on the size of the j-th cluster. If $0 < \underline{s}_j < \overline{s}_j < n$, then there are both lower and upper bounds. If $0 \le \underline{s}_j = \overline{s}_j \le n$, then there is an equality constraint.
The framework works in a similar way to the proposed method, i.e., iterating between the assignment step and the update step. The update step is the same as the one in the proposed method. To solve the problem in the assignment step, we first transform the lower bound constraints into upper bound constraints by multiplying both sides of the inequalities by −1. Then, we change the inequality constraints into equality constraints by adding slack variables. Finally, we have an ILP problem with a set of equality constraints. The constraint matrix on the OPDVs and slack variables is totally unimodular, so we can remove the integer constraints on these variables. The original ILP problem then becomes an MILP problem that can be efficiently solved. Our trials show that the solution of the framework converges as fast as the proposed method. We skip the proof of the total unimodularity here, as it is beyond the scope of this paper.
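The two transformations described above can be written out for a single cluster j. The bounds are written here as $\underline{s}_j$ (lower) and $\overline{s}_j$ (upper); this notation and the slack variable names $u_j$ and $v_j$ are introduced for illustration and are not taken from the paper.

```latex
\begin{aligned}
\text{lower bound:}\quad
& \sum_{i=1}^{n} p_{i,j} \ge \underline{s}_j
\;\Longleftrightarrow\;
-\sum_{i=1}^{n} p_{i,j} \le -\underline{s}_j
\;\Longleftrightarrow\;
-\sum_{i=1}^{n} p_{i,j} + u_j = -\underline{s}_j,\quad u_j \ge 0, \\
\text{upper bound:}\quad
& \sum_{i=1}^{n} p_{i,j} \le \overline{s}_j
\;\Longleftrightarrow\;
\sum_{i=1}^{n} p_{i,j} + v_j = \overline{s}_j,\quad v_j \ge 0.
\end{aligned}
```

Each slack variable appears in exactly one constraint with coefficient +1, so it adds a unit column to the constraint matrix, which is the structure the total unimodularity claim for the extended matrix relies on.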

VI. CONCLUSION AND FUTURE WORK
In this paper, we propose a novel iterative approach to address the issue of size constrained clustering, which consists of an assignment step and an update step. In the assignment step, the prior knowledge about the size constraints specified by users is modeled as an ILP problem. We show that the integer constraints on the OPDVs can be removed due to the total unimodularity. Thus, the ILP problem is equivalent to an MILP problem, which can be solved much more efficiently. In the update step, new centers are updated as the centroids of the clusters. We have conducted extensive experiments on common data sets to evaluate the performance of the proposed method in terms of recognized benchmarks. The experimental results show that leveraging the cluster size constraints as proposed can improve the clustering performance.
Several issues remain to be investigated in future work. For example, we can explore more real applications where the sizes of clusters need to be restricted, such as the capacitated resource allocation problem. In addition, it is challenging to introduce other types of constraints into clustering, such as instance-level constraints. Last but not least, it would be interesting to adapt the size constraints to other types of clustering algorithms.
WEI TANG is currently pursuing the master's degree with the Department of Computer Science, Jiangsu University. His research interests include cluster analysis and motion analysis.
YANG YANG received the Ph.D. degree in engineering from the University of Science and Technology of China. He is currently an Associate Professor with the Department of Computer Science, Jiangsu University. His research interests include cluster analysis and motion analysis.
LANLING ZENG received the Ph.D. degree in mathematics from Zhejiang University. She is currently an Associate Professor with the School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China. Her research interests mainly include computer graphics and plant modeling.
YONGZHAO ZHAN received the Ph.D. degree in computer science and technology from Nanjing University. He is currently a Professor with the School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, China. His research interests mainly include sparse representation and video analysis.