A Three-Way Clustering Mechanism to Handle Overlapping Regions

Conventional clustering methods assume a binary classification and establish a completely inclusive or exclusive relation between an object and a cluster. In contrast, a three-way paradigm handles situations where an object may or may not belong to a cluster, i.e., its membership is uncertain. The objects in the uncertainty region may be included or excluded after further processing and information. One of the use cases of the three-way paradigm is the overlapping region between different clusters. Effective computation of overlapping objects is crucial to the application's overall success. In this paper, we employ a three-way clustering approach inspired by image blurring and sharpening operations that considers the objects in the inside or outside regions of a cluster to be non-overlapping. The objects belonging to the partial region of more than one cluster are considered overlapping. Experiments conducted on the Birds, Scenes, and 20 newsgroups datasets indicate that the proposed approach improves the F1 measure and Hamming loss by up to 18.6% and 4.9%, respectively. Furthermore, the system's robustness for overlapping regions is observed using typical clustering measures. The experimental results suggest that the proposed approach may improve the computation of overlapping regions effectively.


I. INTRODUCTION
In many real-life applications, clusters tend to have fuzzy boundaries and therefore have an overlapping region. A key challenge in such situations is to effectively identify objects belonging to the non-overlapping and overlapping regions [1], [2], [3]. The objects in the non-overlapping regions are assigned to a single cluster, while the objects in the overlapping regions are assigned to multiple clusters, thereby depicting an overlapping area. Overlapping regions have many potential applications and may be considered an important research area. Xu proposed a method to detect overlapped strawberries with the help of a histogram of oriented gradients and a support vector machine [4]. Similarly, the retrieval of overlapping cloud top heights, the segmentation of overlapping nuclei, and the detection of overlapped gravitational wave signals and genes are some of the most prominent applications of overlapping-region detection [5], [6], [7], [8].
The associate editor coordinating the review of this manuscript and approving it for publication was Pasquale De Meo.
Many researchers have tried to find overlapping regions or objects using traditional machine learning algorithms or their variants, such as Overlapping K-Means (OKM) and Weighted OKM (WOKM), Multi-Cluster Overlapping K-means Extension (MCOKE), Overlapping Partitioning Cluster (OPC), and SVM-cone [9], [10], [11], [12]. Moreover, many existing approaches for overlapping clustering are based on methods derived from conventional hard clustering approaches and soft clustering strategies [13], [14], [15], [16], [17]. The first category of these approaches includes extensions of commonly used hard clustering algorithms such as k-means clustering, k-centroids clustering, and k-medoids clustering [9], [13], [18]. They incorporate rough set or fuzzy set theories to formulate overlapping clustering algorithms such as fuzzy k-means, rough k-means, and rough-fuzzy k-means [15], [16], [19]. The essential idea is to make a single cluster assignment for a non-overlapping object, or multiple cluster assignments for an overlapping object, by considering various association thresholds [9]. The key issue in these approaches is to find suitable thresholds [20]. Another category of overlapping clustering uses graph theory. These approaches include star, estar, suffix tree-based, connected graph-based iterative scan, and overlapping clustering based on relevance (OClustR) [21], [22], [23], [24]. These approaches generally produce excessive overlap, obscure useful information, and have a high computational cost, which makes them unable to cope with real-life problems [24], [25].
Three-way clustering has recently been used as an effective alternative approach for handling overlapping clusters [2], [26], [27]. The three regions of three-way clustering provide an easy interpretation for defining objects in the non-overlapping and overlapping regions. The objects in the inside regions are disjoint and mutually exclusive and are therefore used to define the non-overlapping region. On the other hand, the objects in the partial regions may belong to multiple clusters and are therefore used to define the overlapping region. Yu et al. presented a density-based three-way decision approach using decision-theoretic rough sets for detecting overlapping regions [28]. Yu et al. formulated a three-way approach using interval sets to detect and shape the overlapping communities in complex networks [2]. Yu et al. also worked on dynamic datasets and proposed a three-way approach that populates and increments a tree structure to detect overlapping regions [27]. In these approaches, the primary concern was formulating a three-way approach and exploring it in different application areas. Furthermore, fixed and restricted thresholds are used to construct the three regions, so the natural next step is to automate the process of finding thresholds. Afridi et al. introduced a variance-oriented three-way approach that incorporates game-theoretic rough sets (GTRS) and genetic algorithms to automate the determination of suitable thresholds [26]. These methods still need refinements and improvements.
This paper introduces a three-way approach motivated by the blurring and sharpening spatial filtering operations in image processing and explores its application in identifying and detecting overlapping regions. In contrast to the previously proposed approaches, it does not need thresholds for constructing the three regions of a cluster. This approach converts the hard clusters into their respective images. Next, each cluster in its respective formatted image is realized as a typical object. The blurring and sharpening spatial filtering operations then determine the core and support sets used to construct the three regions associated with a cluster. The data points in a single cluster's inside or partial region are considered non-overlapping. On the other hand, the data points shared by multiple partial regions are identified as overlapping objects, and these regions are called overlapping regions. The contributions of this study are as follows:
• Introducing a three-way approach to identify and detect overlapping regions.
• The proposed approach does not need thresholds for detecting the three regions and the overlapping regions.
The performance of the proposed approach on the Scenes and Birds datasets, in comparison to some of the previous approaches, including 3WC-OR$_{GTRS}$, 3WC-OR$_{OR}$, the (1, 0) model, and the (0.5, 0.5) model, shows improvement in the typical evaluation and multi-label measures. More precisely, the proposed approach improved the F1 score and Hamming loss by up to 21.6% and 7.2%, respectively.
The remainder of the article is structured in Sections II to VI. Section II discusses the related literature. Sections III and IV introduce the proposed working methodology. Section V elaborates on the performance of the proposed method. Section VI concludes the research findings.

II. BACKGROUND
This section discusses the related background of this study. It covers 3WC and the blurring and sharpening operations.

A. THREE-WAY CLUSTERING
Yao pioneered three-way clustering by extending the concept of three-way decisions [29], [30]. Let $U$ be a universal set containing finitely many objects $x_i$, for $i \in \{1, 2, 3, \ldots, n\}$. A clustering algorithm results in a set of crisp partitions or clusters $C = \{c_1, c_2, \ldots, c_n\}$. The main idea behind three-way clustering (3WC) is to represent a cluster $c_k$ using core and support sets, i.e., $c_k = \{\mathrm{Core}(c_k), \mathrm{Support}(c_k)\}$ such that $\mathrm{Core}(c_k) \subseteq \mathrm{Support}(c_k)$ and $\mathrm{Core}(c_k), \mathrm{Support}(c_k) \subset U$. The core set is the compact, condensed, and concise representation of a cluster, containing objects strongly related to the cluster. In contrast, the support set is a cluster's expanded, diffused, and discursive representation, containing the core objects and some additional objects weakly related to the cluster that relax the cluster representation. These two sets define the three regions associated with the cluster $c_k$: inside, partial, and outside. The inside region includes the instances with a strong relationship to the cluster; therefore, they definitely belong to that cluster. The outside region contains objects not belonging to the cluster. The partial region contains uncertain objects that require further investigation to decide their relationship with the cluster. In the literature [20], [31], some other notions are also used to represent a cluster, such as the core $\mathrm{Co}(c_k)$ and the fringe $\mathrm{Fr}(c_k)$. These sets are equivalent to the core set and support set in the sense that they represent the same cluster.
The three regions, i.e., Inside, Outside, and Partial, are defined using an evaluation function and a threshold pair. The evaluation function and the threshold pair determine the object's strength and placement, respectively. For a pair of thresholds $(\alpha, \beta)$, the three regions are defined in Equations (6)-(8) based on the value of the evaluation function $e(c_k, x_i)$. An object belongs to the inside region if its evaluation function value is above $\alpha$, indicating a strong relationship with the cluster $c_k$. An object belongs to the partial region if its evaluation function value lies between $\alpha$ and $\beta$, indicating an uncertain relationship with the cluster $c_k$. Finally, an object belongs to the outside region if its evaluation function value is less than $\beta$, indicating no relationship with the cluster $c_k$. The determination of thresholds is the key to determining the cluster boundaries for the inclusion and exclusion of an object $x_i$, and hence to clustering accuracy.
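The threshold-based construction described above can be sketched in a few lines of Python. This is only an illustration: the function name `three_way_regions`, the dictionary interface, and the treatment of boundary values (an evaluation equal to $\alpha$ counts as inside, one equal to $\beta$ as outside) are assumptions of this sketch and are not taken from the original equations.

```python
from typing import Dict, Hashable, Set, Tuple


def three_way_regions(
    scores: Dict[Hashable, float], alpha: float, beta: float
) -> Tuple[Set[Hashable], Set[Hashable], Set[Hashable]]:
    """Split objects into the inside, partial, and outside regions of one cluster.

    `scores` maps each object x_i to its evaluation e(c_k, x_i).
    Boundary handling (>= alpha, <= beta) is an assumption of this sketch.
    """
    if beta > alpha:
        raise ValueError("expected beta <= alpha")
    # Strongly related objects: evaluation at or above alpha.
    inside = {x for x, e in scores.items() if e >= alpha}
    # Unrelated objects: evaluation at or below beta (and not already inside).
    outside = {x for x, e in scores.items() if e <= beta and x not in inside}
    # Everything in between is uncertain.
    partial = set(scores) - inside - outside
    return inside, partial, outside


inside, partial, outside = three_way_regions(
    {"a": 0.9, "b": 0.5, "c": 0.1}, alpha=0.7, beta=0.3
)
```

With the thresholds $(0.7, 0.3)$, object `a` falls inside, `b` is partial, and `c` is outside. Setting $\alpha = \beta$ leaves the partial region empty, which corresponds to a conventional hard assignment.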
It is important to mention some notable works similar to the idea of 3WC with a different way to approach the solution using rough clustering, shadowed set clustering, fuzzy set clustering or interval set clustering [15], [32], [33], [34], [35].

2) 3WC AND ROUGH CLUSTERING
A rough set $R$ was introduced using equivalence relations and may be defined by a pair of lower and upper approximations $[\underline{R}, \overline{R}]$, provided that Pawlak's properties are satisfied [53], [54], [55]. Lingras and West introduced rough clustering based on the interval representation of rough sets [15]. In rough clustering, an object and a cluster satisfy the following basic conditions of rough set theory:
• $\exists!\, \mathrm{Core}_i \ni x_i$: there exists exactly one lower approximation, or equivalently one core, to which an object can belong.
• $x_i \in \mathrm{Core}_i \Rightarrow x_i \in \mathrm{Support}_i$: an object belonging to the lower approximation of a cluster also belongs to the upper approximation of the same cluster.
• If $x_i \notin \mathrm{Core}_j$ for every cluster $j$, then $x_i$ lies in the partial regions of $n$ clusters with $n \in \{2, 3, 4, \ldots\}$: an object that does not belong to the lower approximation of any cluster must be present in the upper approximation of at least two clusters.
Furthermore, rough clustering may allow the lower approximation to be empty if objects belong to more than one upper approximation. In contrast, 3WC does not allow a core region to be empty. Moreover, 3WC places no constraint requiring an object to belong to the boundary region, or equivalently the partial region, of more than one cluster. Some of the notable works in the direction of rough set clustering are put forth in [38], [39], [40], [41], [42], [43], [44], [45], and [46].

3) 3WC AND SHADOW-SET CLUSTERING
Pedrycz introduced the shadowed set as a three-way approximation of fuzzy sets [49], [56]. The shadowed set uses a threshold pair $(\alpha, \beta)$ for a three-way approximation of a fuzzy set. The objects are approximated in three regions, namely core, shadow, and uncertain or excluded, as given in Equations (9)-(11), which differentiate 3WC from shadowed-set clustering. Shadowed-set clustering allows the association of an object with more than one core region. Further, it allows an empty core and an empty shadowed region [49], [50]. In the shadowed set, it is possible that some of the objects remain unclustered and are declared outliers [49].

4) 3WC AND ORTHOPARTITIONS CLUSTERING
An orthopartition $O$ is a collection of orthopairs $O_i$ that satisfy a set of basic properties [33]. The orthopartition framework differs from 3WC in several respects. The orthopairs are disjoint in orthopartitions, while uncertain elements must belong to the partial regions of at least two different orthopairs. In addition, orthopartitions allow a core region to be empty. Furthermore, they may also allow an empty partial region and even a complete orthopair [33], [36].
VOLUME 12, 2024. Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

5) 3WC AND INTERVAL-SET CLUSTERING
Interval set clustering represents a cluster in the form of an interval set. It was essentially an improvement on rough k-means clustering, proposing a two-level lower and upper approximation to unify the respective lower and upper approximations of the rough cluster under consideration [15], [57]. Interval set clustering obeys three properties. First, the core region is not allowed to be empty, which implies that the support or partial region will also not be empty. Second, each of the objects will be clustered. Finally, the core regions are mutually exclusive [15], [34], [57].

B. BLURRING AND SHARPENING OPERATIONS
Sharpening is a procedure used in the spatial domain to enhance the visual quality of an image by manipulating its edges. The image is first blurred to isolate the edges, which are then amplified and added back to the original image [58]. Mathematically, sharpening is defined as
$$f_{\mathrm{Sharpen}}(x, y) = f(x, y) + k \, f_{\mathrm{mask}}(x, y),$$
where $f(x, y)$, $f_{\mathrm{Sharpen}}(x, y)$, and $f_{\mathrm{mask}}(x, y)$ represent the pixel values of the original image, the sharpened image, and the unsharp mask, respectively. Here, $k$ is the scaling factor; the values $k = 1$, $k > 1$, and $k < 1$ correspond to standard sharpening, high-boost filtering, and de-emphasized sharpening, respectively. The unsharp mask is obtained by subtracting a blurred version of the image from the original image, which is given as
$$f_{\mathrm{mask}}(x, y) = f(x, y) - f_{lp}(x, y),$$
where $f_{lp}$ is a low-pass filtered image obtained using the most basic average filter and defined as
$$f_{lp}(x, y) = \frac{1}{mn} \sum_{(s, t) \in S_{xy}} f(s, t),$$
where $S_{xy}$ is an $m \times n$ processing window centered at the pixel with coordinates $(x, y)$. The window $S_{xy}$ covers the neighborhood, i.e., the immediate neighbors, of pixel $(x, y)$. The blurring operation changes the intensity value of the pixel $(x, y)$ to the average value of its neighborhood $S_{xy}$.
To illustrate the working mechanism of sharpening, we visually represent an input image in Figure 1(b), and the resultant images after blurring and sharpening can be seen in Figures 1(a) and 1(c), respectively. Each cell or box approximates a pixel in these images. The boxes are colored white, gray, or shaded, representing their intensities. The white color represents an intensity value of 1, the gray color corresponds to 0, and the shaded boxes portray an intensity value between 0 and 1. Consider that the object of interest is represented by the white boxes positioned in the center of Figure 1(a).
A low-pass filter attenuates higher frequencies and passes low frequencies, while a high-pass filter does not affect the higher frequencies. The average filter is a low-pass filter used to smooth or blur the image. It computes the average of the processing pixel and its immediate eight neighbors, and the result replaces the value of the processing pixel. The blurring operation expands the boundaries by contracting the object inside. Sharpening is carried out by high-pass filtering. It emphasizes finer details in the image while not affecting higher frequencies, and it works in exactly the opposite way to low-pass filters. The idea is to blur the original image using an average filter, subtract the blurred image from the original image to get the sharpening mask, and add the mask to the original image to sharpen it. Sharpening contracts the boundaries by expanding the object within its boundaries or edges. Inspired by the idea of image blurring and image sharpening, a three-way clustering (3WC) approach is discussed in the coming section.
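The blur, mask, and sharpen sequence described above can be sketched with NumPy as follows. The function names, the 3 × 3 window, the edge-replicating border handling, and the clipping of results to $[0, 1]$ are assumptions made for this illustration; the text itself only fixes the average filter and the scaling factor $k$.

```python
import numpy as np


def average_blur(image: np.ndarray) -> np.ndarray:
    """3x3 average (low-pass) filter.

    Border pixels reuse the nearest edge value (an assumption of this sketch).
    """
    img = np.asarray(image, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    # Sum the pixel and its eight neighbors via nine shifted views.
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0


def unsharp_sharpen(image: np.ndarray, k: float = 1.0) -> np.ndarray:
    """Sharpen as original + k * (original - blurred), clipped to [0, 1]."""
    img = np.asarray(image, dtype=float)
    mask = img - average_blur(img)  # the unsharp mask
    return np.clip(img + k * mask, 0.0, 1.0)


# A 5x5 image with a bright 3x3 object in the center.
image = np.zeros((5, 5))
image[1:4, 1:4] = 1.0
blurred = average_blur(image)
sharpened = unsharp_sharpen(image, k=1.0)
```

In this example only the central pixel keeps a blurred value of 1, because it is the only pixel whose whole neighborhood is bright, whereas every pixel of the 3 × 3 object keeps a sharpened value of 1 after clipping.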

III. A THREE-WAY CLUSTERING APPROACH BASED ON IMAGE BLURRING AND SHARPENING (C3BS)
The C3BS has three consecutive steps:
• Converting hard clusters into their respective images
• Applying blurring and sharpening operations on each cluster
• Extracting the evolved three-way soft clusters
This section elucidates each step in reasonable detail.

A. CONVERTING HARD CLUSTERS INTO THEIR RESPECTIVE IMAGES
The first step in the process is to represent the dataset in the form of a grid. The C3BS works on normalized attributes and divides the unit square (in the case of two attributes) or unit hypercube (in the case of more than two attributes) into an equally spaced grid [59]. The grid is realized as a pixelated image where each grid cell is equivalent to a pixel in an image. Each grid cell or pixel has the same size and either contains objects or is empty, which determines its intensity. Further, we count the number of objects in each cell to determine its grayscale intensity.
This approach uses the Euclidean distance metric to measure the influence of each attribute. Therefore, we first normalize the $A$ attributes to the range $[0, 1]$ to balance the effect of each attribute during distance analysis [60]. This scales the whole problem space to a unit hypercube and requires $O(n \cdot A)$ operations. Further, we divide the unit square or unit hypercube into $p$ equidistant parts along each attribute, with $p \in \mathbb{N}$. Hence, the total number of grid cells becomes $p^A$, where $p$ represents the number of grid cells or pixels in one dimension and $A$ denotes the number of attributes in the data space. Each grid cell can be located through an $A$-tuple $(j_0, j_1, \ldots, j_{A-1})$ where $j_k \in \{0, 1, \ldots, p - 1\}$. Furthermore, we can map this multidimensional grid to a single-valued index, i.e., $I = \{0, 1, 2, \ldots, p^A - 1\}$. Algorithm 1 demonstrates the grid-based or image representation of the data space.
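The normalization and grid-mapping step can be illustrated with the following NumPy sketch. It is not Algorithm 1 itself: the function name `map_to_grid`, the min-max normalization, the convention that the value 1.0 falls into the last cell, and the row-major flattening are assumptions chosen for the example.

```python
import numpy as np


def map_to_grid(data: np.ndarray, p: int):
    """Normalize attributes to [0, 1] and assign every object to a grid cell.

    Returns the normalized data, the A-tuple (j_0, ..., j_{A-1}) of each
    object, and a single flat index in {0, ..., p**A - 1}.
    """
    data = np.asarray(data, dtype=float)
    mins = data.min(axis=0)
    spans = data.max(axis=0) - mins
    spans[spans == 0] = 1.0  # constant attributes are mapped to 0
    normalized = (data - mins) / spans
    # Cell coordinate per attribute; the value 1.0 is folded into cell p - 1.
    coords = np.minimum((normalized * p).astype(int), p - 1)
    # Row-major flattening of the A-tuple into one index (assumed convention).
    flat = np.ravel_multi_index(tuple(coords.T), dims=(p,) * data.shape[1])
    return normalized, coords, flat


points = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.25]])
normalized, coords, flat = map_to_grid(points, p=4)
```

For the three points above and $p = 4$, the cell coordinates are $(0, 0)$, $(3, 3)$, and $(2, 1)$. Because the number of cells grows as $p^A$, a dense flat index of this kind is only practical when the number of attributes is small.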
The number of objects in each cell is the key determinant of its intensity. In order to determine the intensity of each cell or pixel, the number of objects in each cell is counted, and the gridded data is clustered using a clustering algorithm, or labeled data is used, to obtain the initial partitions. The intensity level used is grayscale between 0 and 255. The number of objects $n(o)$ in each cell is divided by the maximum number $\max(o)$ of objects contained in a participating cell to scale the values between 0 and 1. The intensity $I(x, y)$ of each cell in a 2D grid is defined in terms of $P(o_i \mid C_i)$, the probability of $o_i$ given $C_i$, where $o_i$ belongs to the grid cell under processing.
In the case of $n$ dimensions, the intensity of a pixel is written $I(x_1, x_2, \ldots, x_n)$, where $x_1, x_2, \ldots, x_n$ represent the $n$ attributes and $I(x_1, x_2, \ldots, x_n)$ is the intensity of the cell corresponding to these dimensions.
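The count-based scaling described above, i.e., dividing $n(o)$ by $\max(o)$, can be sketched as follows. The function name `cell_intensity` and its arguments are hypothetical, and the sketch implements only the count normalization stated in the text, not the probabilistic form of the intensity equation.

```python
import numpy as np


def cell_intensity(flat_index, cluster_mask, n_cells: int) -> np.ndarray:
    """Intensity of every grid cell for one cluster, scaled to [0, 1].

    flat_index[i] is the cell of object i; cluster_mask[i] is True when
    object i belongs to the cluster whose image is being built.
    """
    flat_index = np.asarray(flat_index, dtype=np.intp)
    cluster_mask = np.asarray(cluster_mask, dtype=bool)
    # n(o): number of the cluster's objects falling into each cell.
    counts = np.bincount(flat_index[cluster_mask], minlength=n_cells).astype(float)
    peak = counts.max()  # max(o): the most populated cell
    return counts / peak if peak > 0 else counts


intensity = cell_intensity([0, 0, 1, 3], [True, True, True, True], n_cells=4)
```

Here cell 0 holds two objects and becomes the brightest pixel with intensity 1, cells 1 and 3 receive 0.5, and the empty cell 2 stays at 0.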

B. BLURRING AND SHARPENING OPERATIONS
[Algorithm 1 (I-Map): for each object $o_i \in U$, normalize the $A$ attributes between 0 and 1; the function returns $I$, $n[p]$, and Mapping.]
The pixelated image obtained in the previous step is suitable for blurring and sharpening operations. In order to blur the image, the intensity of each pixel is updated using the average filter. Let $\mathrm{Neig}_q(cell_i)$ be the set of $q$ immediate neighboring pixels of a particular pixel $cell_i$. The blurring operation $e_{blur}(cell_{c_k}, cell_i)$ is formally defined over this neighborhood: it reduces the intensity of the processing pixel $cell_{c_k}$ to an average value with respect to its immediate neighboring pixels $cell_i$. Thus, the blurring operation results in cluster contraction by expanding boundaries.
The blurring operation will only produce a value of 1 if all the neighboring pixels have a value of 1. This condition will only be satisfied by the pixels strongly associated with the cluster under consideration. Hence, the objects belonging to the processing pixel are strongly related to the cluster $c_k$. The sharpening operation updates the intensity of a pixel $cell_{c_k}$ with respect to its immediate neighboring pixels $cell_i$. A sharpened image is derived by adding to the original image the result of subtracting the blurred image from the original image. This operation enhances the intensity of the processing pixel. Thus, the sharpening operation results in an inside cluster expansion while contracting boundaries. The sharpening operation results in a value of 1 for a pixel in each of the following cases.
• Case 1: The blurring operation for the pixel evaluates to 1, i.e., $e_{blur}(cell_{c_k}, cell_i) = 1$.
• Case 2: The pixel has an intensity value of 1. The sharpening operation then evaluates to a value greater than 1, which is truncated to 1.
• Case 3: In some scenarios, the intensity of the pixel is greater than the computed mean intensity of its $q$ immediate neighboring pixels.
The operations of blurring and sharpening correspond to the core set and support set, respectively. The core set contains elements in strong relation to the cluster, while the support set includes elements whose relation ranges from strong to weak. The cells are finally mapped back to objects for a visual demonstration of the clustered data of the given dataset.

1) THREE-WAY CLUSTERS EXTRACTION
We noted in the above section that the blurring operation incorporates a strict condition indicating a strong bond between an object and its associated cluster. Since the core set has similar features, we can tie the two together and define the core set as the set of objects or pixels for which the blurring operation evaluates to one (Equation 20). According to Equation 17, this can only be accomplished when all the $q$ immediate neighboring cells have an intensity equal to one. This strict condition will be obeyed by a few members that are strongly associated with the cluster, resulting in a shrinkage of the hard cluster.
Next, we observe that the sharpening operation incorporates a relatively relaxed condition indicating a weak bond between an object and the associated cluster. Since the support set has similar characteristics, we can tie the two together and define the support set as the set of objects or cells for which the sharpening operation results in a value of one (Equation 21). The definition of sharpening in Equation 19 indicates that a cell or object will be included in the support region when at least one corresponding cell has an intensity of 1. This results in a relaxation of the given hard cluster.
There exists a relationship between $\mathrm{Core}(c_k)$ and $\mathrm{Support}(c_k)$, i.e., $\mathrm{Core}(c_k) \subseteq \mathrm{Support}(c_k)$, based on their definitions in Equations (20)-(21). This relationship indicates that a cell in the core set must also be present in the support set. Furthermore, a cell whose immediate neighboring cells all have an intensity value of one is included in the core set, while a cell with at least one immediate neighboring cell of intensity one resides in the support set. The three regions of a three-way cluster are obtained from the two sets $\mathrm{Core}(c_k)$ and $\mathrm{Support}(c_k)$ defined in Equations (20)-(21). The inside region has a compact and consolidated representation containing cells or objects strongly bonded to the cluster. The partial region has a relaxed representation of the same cluster containing cells or objects weakly related to the cluster. The outside region contains those objects or cells that have no relationship with the cluster.
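The extraction of the three regions from the blurred and sharpened images can be sketched as follows. This is an illustrative reading of the description above rather than the exact formulation: the 3 × 3 neighborhood, the zero intensity assumed beyond the image border, the sharpening rule original + (original − blurred) truncated to $[0, 1]$, and the standard 3WC mapping Inside = Core, Partial = Support − Core, Outside = complement of Support are all assumptions of this sketch.

```python
import numpy as np


def _mean3x3(img: np.ndarray) -> np.ndarray:
    """Mean of each cell and its 8 neighbors; cells beyond the border count as 0."""
    padded = np.pad(img, 1, mode="constant")
    h, w = img.shape
    total = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + h, dx:dx + w]
    return total / 9.0


def three_way_cluster(cluster_image):
    """Return boolean masks (inside, partial, outside) for one cluster image."""
    img = np.asarray(cluster_image, dtype=float)
    blur = _mean3x3(img)
    sharp = np.clip(img + (img - blur), 0.0, 1.0)  # truncated to [0, 1]
    core = np.isclose(blur, 1.0)      # blurring evaluates to one
    support = np.isclose(sharp, 1.0)  # sharpening evaluates to one
    inside = core
    partial = support & ~core
    outside = ~support
    return inside, partial, outside


# A hard cluster occupying a 3x3 block of a 5x5 grid.
cluster_image = np.zeros((5, 5))
cluster_image[1:4, 1:4] = 1.0
inside, partial, outside = three_way_cluster(cluster_image)
```

For this block-shaped cluster, only the central cell satisfies the strict blurring condition and forms the inside region, the remaining eight cells of the block form the partial region, and the sixteen empty cells are outside. Under these assumptions every core cell is also a support cell, so the inclusion $\mathrm{Core}(c_k) \subseteq \mathrm{Support}(c_k)$ holds by construction.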

IV. A C3BS BASED ALGORITHM FOR OVERLAPPING REGIONS
This section presents a use case for the C3BS, i.e., overlapping regions. Algorithm 2 is a modified version of C3BS that handles overlapping clusters. This algorithm has two inputs: a universal set $U$ and its initial partition. The algorithm's output consists of the sets OR and NOR, which contain the overlapping and non-overlapping objects, respectively. The algorithm begins by executing C3BS, which creates a three-way cluster corresponding to each initial cluster given in the set $C$. In line 2, the algorithm defines the set of overlapping objects OR based on the results of three-way clustering. In particular, all the objects that are not inside any cluster and do not belong to the partial region of a single cluster are considered to be overlapping objects. The second set defined in line 2 reflects only those objects that lie in a single cluster's partial region. Line 3 defines the non-overlapping objects given by the set NOR. In lines 4 to 6, the overlapping objects are assigned to multiple clusters. More specifically, an object in the overlapping region is assigned to all those clusters whose partial regions contain it. Finally, in lines 7 to 9, the non-overlapping objects are assigned to a single cluster. Each object, in this case, is assigned to the single cluster whose inside or partial region contains it.
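The assignment logic described above can be sketched as follows. This is not Algorithm 2 itself: the function name `overlapping_regions`, the dictionary-of-sets representation of the regions, and the decision to leave objects that fall in no inside or partial region unassigned (while counting them as non-overlapping) are assumptions of this sketch.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple


def overlapping_regions(
    universe: Iterable[Hashable],
    inside: Dict[str, Set[Hashable]],
    partial: Dict[str, Set[Hashable]],
) -> Tuple[Set[Hashable], Set[Hashable], Dict[Hashable, Set[str]]]:
    """Split objects into overlapping (OR) and non-overlapping (NOR) sets.

    `inside` and `partial` map a cluster name to its inside/partial region.
    """
    overlapping: Set[Hashable] = set()
    non_overlapping: Set[Hashable] = set()
    assignment: Dict[Hashable, Set[str]] = {}
    for obj in universe:
        in_inside = [c for c, region in inside.items() if obj in region]
        in_partial = [c for c, region in partial.items() if obj in region]
        if not in_inside and len(in_partial) > 1:
            # Shared by several partial regions: assign to all of them.
            overlapping.add(obj)
            assignment[obj] = set(in_partial)
        else:
            # Inside one cluster, or in a single partial region.
            non_overlapping.add(obj)
            assignment[obj] = set((in_inside or in_partial)[:1])
    return overlapping, non_overlapping, assignment


OR, NOR, labels = overlapping_regions(
    universe=[1, 2, 3, 4],
    inside={"a": {1}, "b": {4}},
    partial={"a": {2, 3}, "b": {3}},
)
```

In this toy example, object 3 lies in the partial regions of both clusters and is therefore the only overlapping object, while objects 1, 2, and 4 each receive a single cluster.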

V. PERFORMANCE OF C3BS FOR OVERLAPPING REGIONS
In this section, we analyze the performance of C3BS for overlapping clusters. Multi-labeled datasets may be used to depict overlapping clusters by realizing a one-to-one correspondence between clusters and classes. In particular, all the multi-labeled instances may be treated as overlapping objects. We use the multi-labeled Scenes and Birds datasets. The Scenes dataset consists of 2407 instances, 294 features, and six clusters. A total of 177 instances have multiple labels and are, therefore, in the overlapping region, while the remaining 2230 instances lie in the non-overlapping region. The Birds dataset consists of 645 instances, 277 features, and 19 clusters. There are 162 instances with multiple labels, which suggests these instances overlap, while the remaining 189 instances lie in the non-overlapping region. We considered two types of benchmarks for evaluating performance. The first type of benchmark provides a general insight into the quality of the clusters. The DB index, silhouette, and classification accuracy measures are used for this purpose. The second type of benchmark provides insight into the performance of correctly predicting the overlapping objects. The Hamming loss and F1 measures defined for multi-label problems are used for this purpose [62]. Let $TL_i$ and $PL_i$ represent the true labels and predicted labels for an object $o_i$, respectively. The F1 measure is computed from these label sets over a dataset with $N$ objects. Hamming loss is the fraction of wrongly assigned labels and is used in multi-label classification scenarios; in its definition, $K$ represents the number of classes or clusters in the dataset, while $PL_i^c$ and $TL_i^c$ are the complement sets of $PL_i$ and $TL_i$, respectively. Hamming loss is used to detect both prediction errors and classification errors.
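The two multi-label measures can be sketched as follows. Since the exact equations are not reproduced here, the sketch uses the common example-based definitions, which is an assumption: F1 averages $2|TL_i \cap PL_i| / (|TL_i| + |PL_i|)$ over the $N$ objects, and Hamming loss counts the labels in $PL_i \cap TL_i^c$ and $PL_i^c \cap TL_i$ (the symmetric difference of the two label sets) and divides by $N \times K$. The function names are hypothetical.

```python
from typing import List, Set


def multilabel_f1(true_labels: List[Set[int]], pred_labels: List[Set[int]]) -> float:
    """Example-based F1 averaged over all objects (assumed definition)."""
    total = 0.0
    for tl, pl in zip(true_labels, pred_labels):
        denom = len(tl) + len(pl)
        # Two empty label sets are treated as a perfect match (assumption).
        total += 2 * len(tl & pl) / denom if denom else 1.0
    return total / len(true_labels)


def hamming_loss(
    true_labels: List[Set[int]], pred_labels: List[Set[int]], num_classes: int
) -> float:
    """Fraction of wrongly assigned labels over N objects and K classes."""
    # tl ^ pl holds labels predicted but not true, or true but not predicted.
    wrong = sum(len(tl ^ pl) for tl, pl in zip(true_labels, pred_labels))
    return wrong / (len(true_labels) * num_classes)


TL = [{0}, {1, 2}]
PL = [{0}, {1}]
f1 = multilabel_f1(TL, PL)
hl = hamming_loss(TL, PL, num_classes=3)
```

In this example the second object misses one of its two true labels, so its F1 contribution is $2/3$ and one of the $2 \times 3$ label slots is wrong.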
Table 1 reports the experimental results for the considered datasets. It may be noted that overlapping clustering studies generally consider datasets that are artificially created and are not publicly available. Moreover, each study uses different datasets to verify its approach. For these reasons, we provide comparisons primarily with our previous work on overlapping clustering [26]. The comparative algorithms are based on threshold settings that are fixed or automatically determined using game-theoretic rough sets. The (0.5, 0.5) and (1, 0) models correspond to three-way clusters obtained with thresholds of $(\alpha, \beta) = (0.5, 0.5)$ and $(1, 0)$, respectively, using Equations (6)-(8). The 3WC-OR$_{GA}$ results are bounded by strict conditions, whereas the (0.5, 0.5) model generally shows a poor Hamming loss and Silhouette score for the Scenes dataset. The 3WC-OR$_{GA}$ has a compromised DB index for the Scenes dataset and a compromised Silhouette score for the Birds dataset. This reflects its strength in interpreting the consistency of the objects within a cluster. On the other hand, 3WC-OR$_{GTRS}$ uses game-theoretic rough sets to automatically determine thresholds. The multi-label measures are considerably compromised for the 3WC-OR$_{GTRS}$ model. The (1, 0) model has the same F1 score as 3WC-OR$_{GTRS}$; however, it has a comparatively high Hamming loss. Moreover, the SVM-cone and MCOKE methods achieve good accuracies. The typical measurement metrics for OKM and OPC are very close. The WOKM performs poorly due to its weighted nature. The proposed model significantly improves the results for both datasets. More specifically, it tackles overlapping clustering in a well-organized way and reduces miss-hits. The typical evaluation and multi-label measures are improved for both the Scenes and Birds datasets. The F1 measure indicates the overall performance, which is improved by 18.6% for the Scenes dataset and 13% for the Birds dataset. Similarly, miss-hits are determined by Hamming loss, which is improved by 2.1% for the Scenes dataset and by 2.4% for the Birds dataset.

A. ANALYZING OVERLAPPING REGIONS
In this section, we take a closer look at the performance of the C3BS algorithm in the area of overlapping regions. More precisely, we observe the robustness of C3BS in computing the overlapping regions. For this analysis, we use three metrics, i.e., $\mathrm{Precision}_{OL}$, $\mathrm{Recall}_{OL}$, and $\mathrm{Accuracy}_{OL}$ [26]. Mathematically, these metrics are defined in Equations (27)-(29).

$$\mathrm{Precision}_{OL} = \frac{\text{Correctly identified objects from overlapping region}}{\text{Total Objects}} \tag{27}$$
$$\mathrm{Recall}_{OL} = \frac{\text{Overlapping objects correctly assigned to overlapping region}}{\text{Total Objects}} \tag{28}$$
$$\mathrm{Accuracy}_{OL} = \frac{\text{Correctly identified overlapping and non-overlapping objects}}{\text{Total Objects}} \tag{29}$$
The experiments are organized in two phases: the results for the first phase are tabulated as Tables 3 to 6, and the results for the second phase are tabulated as Table 7 and Table 8. The first phase evaluates the performance of the proposed approach as the number of overlapping classes increases. Table 3 reports the results when only 25% of the classes, i.e., five, participated in the experiments.
Table 4 shows the results when 50% of the classes, i.e., ten, participated. The performance of the participating methods improved with the increase in the number of classes. The precision of C3BS drops by 5.3% and its recall increases by 5%, resulting in a drop of 3% in the F1 score. Overall, the F1 advantage of C3BS over the other methods drops from 7.5% to 4.9% but remains high in comparison.
Further, the classes were increased from 50% to 75%, i.e., 15 classes participated. The performance improved for all the methods. Overall, the precision of C3BS drops by 0.8% while its recall increases by the same percentage, resulting in a maintained F1 score of 88.3%. In comparison, C3BS is still dominant, improving the Hamming loss from 0.102 to 0.098. The final addition of 5 classes made the whole dataset completely included. These experiments are tabulated as Table 6. The C3BS shows a drop in precision of 0.8%, an increase in recall of 0.9%, and a change in Hamming loss from 0.098 to 0.095. These experiments show that the performance of most of the overlapping methods improves with an increase in overlapping classes. The C3BS performed well in this set of experiments by improving Hamming loss, i.e., decreasing miss-hits. The next phase evaluates the performance of the proposed approach as the overlapping objects or overlapping regions increase. The experiments are performed initially on the 20 newsgroups dataset, and the results are tabulated as the first iteration of the process in Table 7. In the next iteration, we include words that can increase the overlapping objects or regions. For instance, words related to hockey are included in the electronics news, making it more favorable for both the hockey and electronics classes. Similarly, a list of words has been added to increase the chances of overlapping and the overlapping regions. The results are reported in Table 8. The recall of C3BS increased by 3.3%, while the precision, F1 score, and Hamming loss dropped by 10.3%, 3.8%, and 0.8%, respectively. Similarly, the other participating methods show a drop in F1 score and Hamming loss. In comparison with rival methods, the Hamming loss improved from 1% to 2.4%. These experiments show that the performance drop may be due to an increased overlapping region. The C3BS performed well in this set of experiments by maintaining its advantage over the state-of-the-art methods.
Normalization involves iterating through each object in the universal set $U$ and normalizing its attributes. Let there be $N$ objects in $U$, each with $A$ attributes; then the time complexity of normalization is $O(N \times A)$. Grid generation is performed by dividing each attribute into $p$ parts. If there are $A$ attributes and each is divided into $p$ parts, the total number of grid cells is $p^{A}$, so the time complexity of grid generation is $O(p^{A})$. Next, the algorithm counts the number of objects in each grid cell. Since there are $p^{A}$ cells and $N$ objects, the time complexity of counting is $O(N \times p^{A})$. Finally, the algorithm maps each object's coordinates to the corresponding grid cell in the pixelated image. This involves determining the cell for each of the $A$ attributes, and therefore the time complexity of mapping is $O(N \times A)$. The overall time complexity $T_{I\text{-}Map}$ is the sum of these steps:

$$T_{I\text{-}Map} = O(N \times A) + O(p^{A}) + O(N \times p^{A}) + O(N \times A)$$

Since we are concerned with asymptotic behavior, it can be simplified to:

$$T_{I\text{-}Map} = O(N \times A + p^{A} + N \times p^{A})$$

We consider the dominant term $O(N \times p^{A})$ to express this in big O notation. Therefore, the time complexity of the I-Map algorithm is $O(N \times p^{A})$. Regarding omega notation, since $N \times p^{A}$ is also a lower bound, the time complexity is also $\Omega(N \times p^{A})$.

Algorithm 2 can also be summarized in four parts. The first part is the iteration over clusters (Lines 1-9), which covers the cluster operations (Lines 2-7) and the creation of the Core, Support, Inside, Outside, and Partial regions (Lines 4-8). The algorithm iterates over each cluster of the initial partition; for each cluster $c_k$, it computes the sets Core($c_k$), Support($c_k$), Inside($c_k$), Outside($c_k$), and Partial($c_k$). The computation involves iterating over the objects in $U$ and checking conditions based on the blur and sharp operations. If there are $N$ objects in $U$ and the image size is $M$, the time complexity of computing these sets is $O(N \times M)$ for each cluster, resulting in $O(K \times N \times M)$ overall for $K$ clusters.
The next part is the construction of $C'$ (Line 10). The algorithm constructs a new set $C'$ by combining each cluster's Inside, Partial, and Outside sets. For $K$ clusters, the time complexity of this step is $O(K \times N \times M)$. In the next step, OR and NOR are constructed (Lines 11-12). The algorithm constructs the overlapping (OR) and non-overlapping (NOR) sets based on the constructed $C'$. The time complexity of this step is $O(K \times N)$. Finally, the iteration over OR and NOR takes place (Lines 13-18). In this step, the algorithm iterates over the objects in OR and NOR, making decisions based on cluster membership. Let there be $L$ objects in OR and $M$ objects in NOR; the time complexity of this step is $O(L + M)$.
The overall time complexity is the sum of these steps:

$$T_{C3BS} = O(K \times N \times M) + O(K \times N \times M) + O(K \times N) + O(L + M)$$

Since we are concerned with asymptotic behavior, we can simplify this to:

$$T_{C3BS} = O(K \times N \times M + K \times N + L + M)$$

We consider the dominant term $O(K \times N \times M)$ to express this in big O notation. Therefore, the time complexity of the C3BS algorithm is $O(K \times N \times M)$. Regarding omega notation, since $K \times N \times M$ is also a lower bound, the time complexity is also $\Omega(K \times N \times M)$.

Now, to determine the total time complexity of the process involving Algorithm 1 (I-Map) and Algorithm 2 (C3BS), we consider the time complexity of each algorithm separately and then combine them. The total time complexity $T_{Total}$ of the whole process is given by:

$$T_{Total} = T_{I\text{-}Map} + T_{C3BS} = O(N \times p^{A}) + O(K \times N \times M)$$

Since we are considering the worst-case scenario, we take the dominant term $O(K \times N \times M)$. Therefore, the total time complexity of the combined process is $O(K \times N \times M)$. Regarding omega notation, the time complexity is also $\Omega(K \times N \times M)$.
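As a rough numerical check of which term dominates, the following worked example assumes that the image size $M$ equals the number of grid cells $p^{A}$. This relation is an assumption made here for illustration and is not stated in the analysis above; the values $N = 1000$, $A = 2$, $p = 10$, and $K = 5$ are likewise illustrative.

```latex
\begin{align*}
M &= p^{A} = 10^{2} = 100,\\
N \times p^{A} &= 1000 \times 100 = 10^{5},\\
K \times N \times M &= 5 \times 1000 \times 100 = 5 \times 10^{5}.
\end{align*}
```

Under that assumption the C3BS term exceeds the I-Map term by the factor $K$, which is consistent with taking $O(K \times N \times M)$ as the dominant term. If $M$ were much smaller than $p^{A}$, the I-Map term could instead dominate.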

VII. CONCLUSION
In many real-life applications, such as wireless sensor networks and topic modeling, an instance can belong to more than one cluster. Three-way clustering organizes a cluster into three regions, which can be used to identify the objects belonging to multiple clusters. We explored the role of blurring and sharpening in identifying the objects among the clusters and proposed a blur-sharp-based algorithm. The algorithm first obtains the three regions associated with each cluster. It then finds the objects that belong to the partial regions of more than one cluster. Experiments on two widely used multi-label datasets, Scenes and Birds, compared the proposed approach with previous state-of-the-art methods and showed that it effectively addresses overlapping clustering. Furthermore, the performance of the proposed approach is evaluated on a textual dataset, namely, 20 newsgroups. More precisely, C3BS improves the results on the Scenes, Birds, and 20 newsgroups datasets by up to 18.6%, 13%, and 4.9%, respectively, for F1 score and by 2.1%, 2.4%, and 0.4%, respectively, for Hamming loss. In our future work, we will refine the 3WC paradigm to handle more complex scenarios. Furthermore, methods will be investigated to optimize the computational efficiency of 3WC, especially for larger datasets.

FIGURE 1. The blurring and sharpening operations.

Algorithm 1 I-Map Algorithm
Input: A universal set $U$, $p > 1$
Output: A pixelated image $I$, number of objects in each cell $n[i]$, mapping of data into image $I$
1: function I-Map($U$)
2:
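The four I-Map steps named in the complexity analysis (normalization, grid generation, counting, and mapping) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes min-max normalization, and it stores only occupied cells in a hash map rather than materializing all $p^{A}$ cells. The function name `i_map` and its return values are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Cell = Tuple[int, ...]


def i_map(universe: Sequence[Sequence[float]], p: int) -> Tuple[List[Cell], Counter]:
    """Map objects onto a grid with p parts per attribute (assumed reading of I-Map).

    Returns the grid cell of every object and the object count per occupied cell.
    """
    if p <= 1:
        raise ValueError("p must be greater than 1")
    n_attrs = len(universe[0])
    # Normalization bounds, O(N x A): assumed min-max scaling to [0, 1].
    lows = [min(obj[a] for obj in universe) for a in range(n_attrs)]
    highs = [max(obj[a] for obj in universe) for a in range(n_attrs)]

    def cell(obj: Sequence[float]) -> Cell:
        idx = []
        for a in range(n_attrs):
            span = highs[a] - lows[a]
            unit = (obj[a] - lows[a]) / span if span else 0.0
            # A value equal to the maximum falls into the last part.
            idx.append(min(int(unit * p), p - 1))
        return tuple(idx)

    # Mapping, O(N x A): one cell index per attribute for every object.
    cell_of = [cell(obj) for obj in universe]
    # Counting: only occupied cells are stored, so the p**A grid stays implicit.
    counts = Counter(cell_of)
    return cell_of, counts


cells, counts = i_map([(0.0, 0.0), (1.0, 1.0), (0.9, 0.95), (0.4, 0.6)], p=2)
print(cells)   # grid cell of each object
print(counts)  # number of objects per occupied cell
```

Because this sketch counts with a hash map, its counting step costs $O(N)$ rather than the $O(N \times p^{A})$ stated in the analysis, which appears to assume a pass over every cell of the dense grid.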

Algorithm 2 C3BS for Overlapping Clusters
Input: A universal set $U = \{o_1, o_2, o_3, \ldots, o_n\}$, an initial partition $C = \{c_1, c_2, \ldots, c_K\}$.
Output: The sets OR and NOR depicting overlapping and non-overlapping regions.
1: for each $c_k \in C$ do
2:     Obtain Image$_{c_k}$ corresponding to cluster $c_k$
3:     Apply cluster blur and cluster sharp operations on Image$_{c_k}$
4:

13: for each $o_i \in$ OR do
14:     Decide $o_i$ in all multiple clusters for which $o_i \in$ Inside($c_k$)
15: end for
16: for each $o_i \in$ NOR do
17:     Decide $o_i$ in a cluster for which $o_i \in$ Inside($c_k$) $\lor$ $o_i \in$ Partial($c_k$)
18: end for
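The final decision stage can be sketched as below, assuming the Inside and Partial regions of every cluster have already been computed by the blur and sharp operations, which are not reproduced here. Two points are assumptions rather than the paper's definitions: an object is placed in OR when it lies in the Partial region of more than one cluster, and an OR object is assigned to every cluster whose Inside or Partial region contains it. The names `split_overlap` and `assign` are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

Regions = Dict[str, Dict[str, Set[Hashable]]]


def split_overlap(
    universe: Sequence[Hashable], regions: Regions
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (OR, NOR): overlapping and non-overlapping objects."""
    overlapping: Set[Hashable] = set()
    non_overlapping: Set[Hashable] = set()
    for obj in universe:
        # Assumed rule: overlapping means partial membership in several clusters.
        hits = sum(obj in r["partial"] for r in regions.values())
        (overlapping if hits > 1 else non_overlapping).add(obj)
    return overlapping, non_overlapping


def assign(universe: Sequence[Hashable], regions: Regions) -> Dict[Hashable, List[str]]:
    """Assign OR objects to several clusters and NOR objects to at most one."""
    overlapping, _ = split_overlap(universe, regions)
    result: Dict[Hashable, List[str]] = {}
    for obj in universe:
        member_of = [
            name
            for name, r in regions.items()
            if obj in r["inside"] or obj in r["partial"]
        ]
        # NOR objects keep only the first matching cluster.
        result[obj] = member_of if obj in overlapping else member_of[:1]
    return result


regions = {
    "hockey": {"inside": {"a"}, "partial": {"b"}},
    "electronics": {"inside": {"c"}, "partial": {"b", "d"}},
}
print(assign(["a", "b", "c", "d"], regions))
```

Here only object `b` lies in the Partial region of both clusters, so it is the only object assigned to both. Both functions loop over every object and every cluster, which matches the $O(K \times N)$ cost given for constructing OR and NOR.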

TABLE 1. Typical and multi-label evaluation measures.

TABLE 2. Quality attributes of overlapping regions.

TABLE 3. Overlapping measurement for 25% classes with overlapping objects.

Table 2 reports the quality-based performance of C3BS in correctly identifying the overlapping regions by considering typical classification measures. Precision shows how many of the predicted overlapping objects are truly overlapping objects. Recall, on the other hand, shows what proportion of the actual overlapping objects are correctly classified as overlapping objects. Finally, accuracy gives the percentage of correct assignments of objects to the overlapping and non-overlapping regions. The next set of experiments is conducted on text data from 20 newsgroups. This dataset contains news from 20 groups, broadly categorized as political, religious, sports, sales, electronics, and graphics-related news. The experiments are performed in two phases. The first phase of experiments is reported in

TABLE 4. Overlapping measurement for 50% classes with overlapping objects.

TABLE 5. Overlapping measurement for 75% classes with overlapping objects.

TABLE 6. Overlapping measurement for 100% classes with overlapping objects.